FEAT: Better Scenario Tracking#1758

Open
rlundeen2 wants to merge 4 commits into
microsoft:mainfrom
rlundeen2:users/rlundeen/2026_05_18_scenario_resume

@rlundeen2
Contributor

Previously, if a Scenario was interrupted mid-AtomicAttack, completed AttackResults persisted to the DB became orphaned because the scenario-to-attack-result link only lived in a JSON manifest (attack_results_json) written after the whole AtomicAttack returned. On resume, those objectives re-executed wastefully.

This change makes scenario linkage a first-class column on AttackResultEntry. That lets resume reuse every completed result that was already persisted, and it makes per-objective progress easier to track.

  • New columns: scenario_result_id (indexed FK, ON DELETE SET NULL) and scenario_data (JSON with fixed schema {atomic_attack_name, objective_index}).
  • New ExecutionAttribution dataclass in pyrit/executor/attack/core/ (so the executor never imports from the scenario layer) is set on AttackContext by AttackExecutor per-task before scheduling, and read by the default attack event handler when persisting.
  • Hydration in get_scenario_results uses the FK with a merge-mode fallback to the legacy manifest for partially-migrated DBs.
  • Resume uses objective_index (deterministic, parallel-safe; derived from seed_groups input_indices) rather than objective text, so duplicate objective text doesn't collapse two seed groups.
  • Drops the unreleased error_attack_result_ids_json column outright; error AttackResults are now linkable via get_attack_results(scenario_result_id=..., outcome=ERROR).
  • attack_results_json stays write-through this release for downgrade safety; future releases will stop populating and then drop.
  • update_scenario_run_state becomes a targeted UPDATE rather than a full row rebuild (so it doesn't clobber the manifest during the deprecation window)
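
As a rough illustration of the column layout described above, here is a minimal sketch using the standard-library `sqlite3` module. The table names, index name, and outcome strings are hypothetical stand-ins, not the actual PyRIT schema; only the two new column names and the `ON DELETE SET NULL` behavior come from the description.

```python
import json
import sqlite3

# Hypothetical, simplified tables; the real entries carry many more columns.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript(
    """
    CREATE TABLE scenario_results (id TEXT PRIMARY KEY);
    CREATE TABLE attack_results (
        id TEXT PRIMARY KEY,
        outcome TEXT NOT NULL,
        scenario_result_id TEXT REFERENCES scenario_results(id) ON DELETE SET NULL,
        scenario_data TEXT  -- JSON: {"atomic_attack_name": ..., "objective_index": ...}
    );
    CREATE INDEX ix_attack_results_scenario ON attack_results (scenario_result_id);
    """
)

conn.execute("INSERT INTO scenario_results (id) VALUES ('scenario-1')")
conn.executemany(
    "INSERT INTO attack_results VALUES (?, ?, ?, ?)",
    [
        ("ar-0", "SUCCESS", "scenario-1",
         json.dumps({"atomic_attack_name": "baseline", "objective_index": 0})),
        ("ar-1", "ERROR", "scenario-1",
         json.dumps({"atomic_attack_name": "baseline", "objective_index": 1})),
    ],
)

# Error results are found through the link column instead of a separate id list.
error_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM attack_results WHERE scenario_result_id = ? AND outcome = ?",
        ("scenario-1", "ERROR"),
    )
]

# Deleting the scenario keeps the attack results but clears their link.
conn.execute("DELETE FROM scenario_results WHERE id = 'scenario-1'")
unlinked = conn.execute(
    "SELECT COUNT(*) FROM attack_results WHERE scenario_result_id IS NULL"
).fetchone()[0]
```

Because each result row carries its own link, a result written before an interruption stays discoverable even if the scenario never reaches the point where the manifest would have been written.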

Previously, if a Scenario was interrupted mid-AtomicAttack (Ctrl-C, OOM, crash), completed AttackResults persisted to the DB became orphaned because the scenario-to-attack-result link only lived in a JSON manifest (attack_results_json) written after the whole AtomicAttack returned. On resume, those objectives re-executed wastefully.

This change makes scenario linkage a first-class column on AttackResultEntry:

- New columns: scenario_result_id (indexed FK, ON DELETE SET NULL) and scenario_data (JSON with fixed schema {atomic_attack_name, objective_index}).

- New ExecutionAttribution dataclass in pyrit/executor/attack/core/ (so the executor never imports from the scenario layer) is set on AttackContext by AttackExecutor per-task before scheduling, and read by the default attack event handler when persisting.

- Hydration in get_scenario_results uses the FK with a merge-mode fallback to the legacy manifest for partially-migrated DBs.

- Resume uses objective_index (deterministic, parallel-safe; derived from seed_groups input_indices) rather than objective text, so duplicate objective text doesn't collapse two seed groups.

- Drops the unreleased error_attack_result_ids_json column outright; error AttackResults are now linkable via get_attack_results(scenario_result_id=..., outcome=ERROR).

- attack_results_json stays write-through this release for downgrade safety; future releases will stop populating and then drop.

- update_scenario_run_state becomes a targeted UPDATE rather than a full row rebuild (so it doesn't clobber the manifest during the deprecation window).

Includes Alembic migration with idempotent backfill, scenario_data round-trip on AttackResultEntry, and tests for: event-handler attribution stamping, executor attribution propagation at max_concurrency>1, FK + manifest + mixed hydration paths, migration backfill correctness/idempotency/downgrade, interruption-recovery regression, duplicate-objective-text resume safety, and duplicate atomic_attack_name validation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
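
To make "idempotent backfill" concrete, here is a small sketch with `sqlite3`. It is not the Alembic migration from this PR. The manifest shape (`{atomic_attack_name: [attack_result_id, ...]}`) and the use of list position as `objective_index` are assumptions made only for illustration, and the table names are hypothetical.

```python
import json
import sqlite3


def backfill(conn: sqlite3.Connection) -> int:
    """Link attack results listed in each legacy manifest; return rows updated."""
    updated = 0
    scenarios = conn.execute(
        "SELECT id, attack_results_json FROM scenario_results "
        "WHERE attack_results_json IS NOT NULL"
    ).fetchall()
    for scenario_id, manifest_json in scenarios:
        manifest = json.loads(manifest_json)  # assumed shape: {name: [result_id, ...]}
        for atomic_attack_name, result_ids in manifest.items():
            for position, result_id in enumerate(result_ids):
                data = json.dumps(
                    {"atomic_attack_name": atomic_attack_name, "objective_index": position}
                )
                # Only touch rows that are not linked yet, so a re-run is a no-op.
                cursor = conn.execute(
                    "UPDATE attack_results SET scenario_result_id = ?, scenario_data = ? "
                    "WHERE id = ? AND scenario_result_id IS NULL",
                    (scenario_id, data, result_id),
                )
                updated += cursor.rowcount
    return updated


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scenario_results (id TEXT PRIMARY KEY, attack_results_json TEXT);
    CREATE TABLE attack_results (
        id TEXT PRIMARY KEY, scenario_result_id TEXT, scenario_data TEXT
    );
    """
)
conn.execute(
    "INSERT INTO scenario_results VALUES (?, ?)",
    ("scenario-1", json.dumps({"baseline": ["ar-0", "ar-1"]})),
)
conn.executemany(
    "INSERT INTO attack_results (id) VALUES (?)",
    [("ar-0",), ("ar-1",), ("ar-adhoc",)],
)

first_pass = backfill(conn)   # links the two manifest entries
second_pass = backfill(conn)  # finds nothing left to link
```

The `scenario_result_id IS NULL` guard is what makes the second pass a no-op; results that were never part of a scenario (`ar-adhoc` here) are left alone.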
Comment thread pyrit/memory/memory_interface.py Outdated
raise
bool: Always True (no-op success).
"""
print_deprecation_message(
Contributor Author

we can probably drop this, it was introduced since last release

…lify hydration

- Delete dishonest no-op add_attack_results_to_scenario shim.
- Standardize on print_deprecation_message (drop ad-hoc warnings.warn).
  Style guide gains a concise Deprecations section with the
  `removed_in = current minor + 2` rule.
- Remove stale per-scenario state left over from the manifest era:
  AttackContext._error_attack_result_id, _StrategyRuntimeError.error_attack_result_id,
  Scenario._result_lock, Scenario._original_objectives_map, and the stray
  `import asyncio`. Replace defensive `getattr(context, '_attribution', None)`
  with direct attribute access — the contract is mandatory.
- Rename ExecutionAttribution -> ScenarioExecutionAttribution (and the
  module file) to match its scenario-specific schema.
- Refactor MemoryInterface.get_scenario_results: split into
  _build_scenario_result_query_conditions, _query_scenario_result_entries,
  _hydrate_scenario_attack_results. The hydrator now issues a single batched
  IN-query on AttackResultEntry.scenario_result_id (fixes the previous N+1)
  and drops the legacy attack_results_json manifest fallback entirely — the
  FK is the sole source of truth.
- Narrow _stamp_attribution(result=) to AttackResult to satisfy ty.
- Update affected tests; rewrite the four hydration tests that incidentally
  relied on the manifest fallback to use the production FK write path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
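
For readers unfamiliar with the N+1 fix mentioned above, the idea can be sketched with plain `sqlite3`. The function name `hydrate` and the two-column table are hypothetical; the real hydrator works on AttackResultEntry through the memory interface.

```python
import sqlite3
from collections import defaultdict


def hydrate(conn: sqlite3.Connection, scenario_ids: list[str]) -> dict[str, list[str]]:
    """Return {scenario_id: [attack_result_id, ...]} using one query for all ids."""
    if not scenario_ids:
        return {}
    placeholders = ", ".join("?" for _ in scenario_ids)
    query = (
        "SELECT scenario_result_id, id FROM attack_results "
        f"WHERE scenario_result_id IN ({placeholders}) ORDER BY id"
    )
    grouped: dict[str, list[str]] = defaultdict(list)
    for scenario_id, result_id in conn.execute(query, scenario_ids):
        grouped[scenario_id].append(result_id)
    # Scenarios with no linked results still get an (empty) entry.
    return {sid: grouped.get(sid, []) for sid in scenario_ids}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attack_results (id TEXT PRIMARY KEY, scenario_result_id TEXT)")
conn.executemany(
    "INSERT INTO attack_results VALUES (?, ?)",
    [("ar-0", "s1"), ("ar-1", "s1"), ("ar-2", "s2"), ("ar-3", None)],
)

hydrated = hydrate(conn, ["s1", "s2", "s3"])
```

One query per page of scenarios replaces one query per scenario. For very large id lists a real implementation would likely need to chunk the `IN` clause to stay under the database's bound-parameter limit.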
@rlundeen2 rlundeen2 marked this pull request as ready for review May 19, 2026 20:15
Bumps pyrit/scenario/core/atomic_attack.py coverage from 37% to 94% by
exercising the three resume-critical surfaces the PR introduced or
changed but that had no dedicated tests:

- TestAtomicAttackFilterSeedGroupsByIndices: stable-identity filter that
  drops completed seeds while preserving each survivor's original index
  across successive filter calls.
- TestAtomicAttackFilterSeedGroupsByObjectives: keeps the deprecated
  legacy path under test and asserts the DeprecationWarning fires until
  removed_in=0.16.0.
- TestAtomicAttackAttributionFactory: the closure built in run_async
  when _scenario_result_id is set — no factory outside a Scenario,
  factory maps input_index -> original objective_index after filtering,
  and the snapshot is taken by value so post-call mutations cannot
  poison in-flight attributions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
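
The behaviors those three test classes describe can be sketched in isolation. The class and method names below (`AtomicAttackSketch`, `filter_by_indices`, `build_factory`, `Attribution`) are hypothetical and much simpler than the code in `atomic_attack.py`; the sketch only shows index-preserving filtering and a by-value snapshot inside the factory closure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Attribution:
    scenario_result_id: str
    atomic_attack_name: str
    objective_index: int


class AtomicAttackSketch:
    def __init__(self, name: str, seed_groups: list) -> None:
        self.name = name
        # Pair each seed group with its original position once, up front.
        self._indexed = list(enumerate(seed_groups))

    def filter_by_indices(self, completed: set) -> None:
        """Drop completed seeds; survivors keep their original index."""
        self._indexed = [(i, sg) for i, sg in self._indexed if i not in completed]

    def build_factory(
        self, scenario_result_id: Optional[str]
    ) -> Optional[Callable[[int], Attribution]]:
        if scenario_result_id is None:
            return None  # direct execution outside a scenario: nothing to attribute
        # Snapshot by value: later filtering cannot change in-flight attributions.
        original_indices = tuple(i for i, _ in self._indexed)
        name = self.name

        def factory(input_index: int) -> Attribution:
            return Attribution(scenario_result_id, name, original_indices[input_index])

        return factory


attack = AtomicAttackSketch("baseline", ["a", "b", "c", "d"])
attack.filter_by_indices({0})
attack.filter_by_indices({2})          # successive filters; survivors are 1 and 3
factory = attack.build_factory("scenario-1")
attack.filter_by_indices({1})          # mutation after the snapshot was taken
```

After the two filter calls the executor sees inputs at positions 0 and 1, and the factory maps them back to original indices 1 and 3, regardless of the later filter call.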
# the resulting AttackResult so it can be located later for hydration and
# resume. Set by AttackExecutor per-task before scheduling. Stays None for
# ad-hoc/direct attack execution outside any scenario.
_attribution: Optional[ScenarioExecutionAttribution] = None
Contributor

@hannahwestra25 hannahwestra25 May 19, 2026

it seems weird to me that we're introducing scenarios into attacks here. it's like a circular dependency--scenarios have attacks and know about which attacks they run but then also attacks know about scenarios. And that we have this attribute that is only used in scenarios but is present for all attack strategies.

Contributor Author

I agree with you; this is my ideal too which is why I built it that way at first. But we need some way to reference. E.g. with many models "comment.post_id" doesn't mean that the "comments know about posts". And I think because attacks are writing to the db, they need to have some way to link back.

But I especially don't like how "scenario" appears in the name. I think even if it works the same way, we can rename it to something different, potentially AttackResultAttribution. I'll try that and see if it looks better, even if it works in a similar way.

Contributor

yeah i like AttackResultAttribution a lot more--I think it makes it more extensible (although would it be possible to have an attack result attributed to more than one thing? right now for scenarios no but if we have more orchestrators possibly right? but i think that's future proofing that we might not need to consider) I was trying to think of alternatives and none seemed to make more sense while still being robust enough

Comment thread pyrit/scenario/core/scenario.py Outdated
Decouple the attack persistence path from scenario vocabulary. The attack layer now ships an opaque attribution dataclass (parent_id, parent_collection, position) — the scenario layer interprets those fields to mean (scenario_result_id, atomic_attack_name, objective_index).

- ScenarioExecutionAttribution -> AttackResultAttribution (renamed module and class)

- AttackResult.scenario_result_id / scenario_data -> attribution_parent_id / attribution_data

- AttackResultEntry columns, index, and foreign key constraint renamed; migration 9c8b7a6d5e4f rewritten in place (still unreleased on this branch)

- Replaced FK abbreviation with foreign key / ForeignKey in comments and docstrings

The DB foreign key still targets ScenarioResultEntries.id; that is a relational fact, not a layering violation. The attack layer has no scenario-specific identifiers in its type signatures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
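
The layering described in this commit can be sketched as follows. Only the three field names and the two result attribute names come from the commit message. The field types, the helper names, and the shape of `attribution_data` are assumptions for illustration.

```python
from dataclasses import dataclass


# Attack layer: opaque fields, no scenario vocabulary. Field types are assumed.
@dataclass(frozen=True)
class AttackResultAttribution:
    parent_id: str
    parent_collection: str
    position: int


def to_columns(attribution: AttackResultAttribution) -> tuple:
    """Split into (attribution_parent_id, attribution_data); the dict shape is assumed."""
    data = {
        "parent_collection": attribution.parent_collection,
        "position": attribution.position,
    }
    return attribution.parent_id, data


# Scenario layer: the only place that gives the opaque fields a scenario meaning.
def attribution_for_scenario(
    *, scenario_result_id: str, atomic_attack_name: str, objective_index: int
) -> AttackResultAttribution:
    return AttackResultAttribution(
        parent_id=scenario_result_id,
        parent_collection=atomic_attack_name,
        position=objective_index,
    )


attribution = attribution_for_scenario(
    scenario_result_id="scenario-1", atomic_attack_name="baseline", objective_index=3
)
parent_id, attribution_data = to_columns(attribution)
```

In this arrangement the import direction only runs from the scenario layer to the attack layer, which is the property the rename is meant to preserve.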
attribution_factory: Optional callable mapping each input index to
a AttackResultAttribution. When provided, the per-task context is
stamped with the attribution so the persistence path can record
scenario linkage.
Contributor

nit: remove scenario wording here

Comment on lines +9 to +13
The attack layer treats this as opaque infrastructure — three string-typed
fields, no scenario semantics. The orchestrator (e.g. ``Scenario``) interprets
them however it likes. Keeping the type in ``executor`` rather than
``scenario`` means the persistence path has no dependency on the
``pyrit.scenario`` package.
Contributor

nit: remove this comment since it relates more to the PR discussion

what they mean and how to query them back later. For example,
``Scenario`` uses ``parent_id`` for the scenario result UUID,
``parent_collection`` for the atomic attack name, and ``position`` for
the original 0-based seed-group index.
Contributor

@hannahwestra25 hannahwestra25 May 20, 2026

i think the field definitions are more well defined than this comment makes it seem with the "chooses" wording but I think these fields should be standardized and clear so parent_id is the orchestrator results uuid, and parent_collection is the grouping, etc

but this also makes me question how generic this is. position and parent id make sense but parent_collection seems specific to atomic attacks. can this be folded into the parent_id or excluded here and just included in attribution_data


"""
Add attribution_parent_id (foreign key) + attribution_data (JSON) to
AttackResultEntries; drop ScenarioResultEntries.error_attack_result_ids_json;
Contributor

this file should be renamed now

)

- def _get_completed_objectives_for_attack(self, *, atomic_attack_name: str) -> set[str]:
+ def _get_completed_objective_indices_for_attack(self, *, atomic_attack_name: str) -> set[int]:
Contributor

I have concerns over relying on indices for resume, which seems brittle. The position is only stable if the seed group list is identical between runs, but I don't think there's a guarantee of that. For example, _apply_max_dataset_size uses random.sample with no fixed seed, so a resumed run gets a different sample and the indices silently point to wrong objectives. Even without sampling, if memory.get_seed_groups() doesn't have a deterministic ORDER BY, the list could come back in a different order. There's a comment that calls out the sampling issue ("the very ADO 9012 bug this PR fixes") but only fixes it for that one path.

I know we moved away from objective-text matching because of duplicate objectives colliding, but could we use more of the objective/seed group data to generate a composite hash as the resume key instead? That way reordering or resampling wouldn't silently corrupt the mapping.
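
One possible shape for such a composite key, as a sketch only: hash a canonical serialization of each seed group and append an occurrence counter so that identical groups stay distinct. The dict representation of a seed group and the function name `resume_keys` are hypothetical; real seed groups would need their own stable serialization.

```python
import hashlib
import json
from collections import Counter


def resume_keys(seed_groups: list) -> list:
    """Return one order-independent key per seed group.

    Identical groups get the suffixes :0, :1, ... so duplicates do not collapse.
    """
    seen = Counter()
    keys = []
    for group in seed_groups:
        # Canonical JSON so that dict key order does not change the digest.
        canonical = json.dumps(group, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        keys.append(f"{digest}:{seen[digest]}")
        seen[digest] += 1
    return keys


groups = [{"objective": "a"}, {"objective": "b"}, {"objective": "a"}]
keys = resume_keys(groups)
reordered_keys = resume_keys([groups[1], groups[2], groups[0]])
```

With keys like these, a reordered list yields the same set of keys, and a resampled list yields keys that simply fail to match completed ones instead of silently pointing at a different objective. The occurrence suffix relies on identical groups being interchangeable.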

self._memory.add_attack_results_to_memory(attack_results=[event_data.result])

@staticmethod
def _stamp_attribution(
Contributor

nit: stamp sounds kinda weird imo maybe _apply_attribution ?

2 participants