FEAT: Add FigStep SafeBench multimodal dataset loader by romanlutz · Pull Request #1787 · microsoft/PyRIT

romanlutz · 2026-05-23T02:10:24Z

Summary

Adds a PyRIT seed dataset loader for FigStep (Gong et al., AAAI 2025 Oral) — a typographic-image jailbreak benchmark for vision-language models. The benchmark, SafeBench, contains 500 questions across 10 harmful topics with a 50-question SafeBench-Tiny subset used for the paper's headline experiments.

Paper: https://arxiv.org/abs/2311.05608
Upstream repo: https://github.com/ThuCCSLab/FigStep (MIT licensed, public, no auth required)
Pinned to commit 861b17b3d67887c06ee3534ec65b3012f9becb7

What's included

pyrit/datasets/seed_datasets/remote/figstep_dataset.py — _FigStepDataset with module-level enums FigStepCategory (10 members matching the CSV's category_name casing) and FigStepVariant.
Two attack variants, both behind one loader:
- FigStep (default): single typographic image of the numbered-list rewrite + the original benign carrier text prompt ("The image shows a list numbered 1, 2, and 3, but the items are empty...").
- FigStep-Pro: the GPT-4V/OCR-evasion upgrade. Each question is rendered as 3–7 sub-images plus a longer per-row templated carrier prompt with {benign_sentence} substituted from benign_sentences_without_harmful_phase.csv. Only the tiny subset has pre-cut sub-images upstream, so this variant requires use_tiny=True (loader raises ValueError otherwise).
Group shape per row: SeedObjective(question) + N SeedPrompt(image_path) + SeedPrompt(text), all sharing one prompt_group_id and sequence=0. The original harmful question is preserved as the group objective so scorers can evaluate against it (the visible carrier text alone is benign).
Carrier prompts copied verbatim from upstream src/generate_prompts.py and README.md.
FigStep-Pro sub-images are distributed upstream only as sub-figures.zip, so the loader downloads the archive once and extracts it to dbdata/seed-prompt-entries/figstep_pro_subfigures_<sha>/. The benign-sentences CSV is fetched as raw text because the upstream file has unquoted commas (like ,000) that break strict CSV parsing.
Registered in pyrit/datasets/seed_datasets/remote/__init__.py (auto-discovered via SeedDatasetProvider.__init_subclass__).
BibTeX entry gong2025figstep added to doc/references.bib; citation added to the alphabetical prose paragraph in doc/code/datasets/1_loading_datasets.{py,ipynb} and the hidden-citations list in doc/bibliography.md.
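The group shape described above can be illustrated with a minimal sketch. It uses a plain dataclass as a stand-in and does not call PyRIT: `SketchSeed` and `build_group` are hypothetical names introduced only for illustration, and the real `SeedObjective` / `SeedPrompt` classes carry more fields than shown here.

```python
import uuid
from dataclasses import dataclass
from typing import List


@dataclass
class SketchSeed:
    """Hypothetical stand-in for a seed; NOT PyRIT's SeedObjective/SeedPrompt."""

    role: str  # "objective", "image", or "text"
    value: str  # question, image path, or carrier prompt
    prompt_group_id: uuid.UUID
    sequence: int = 0  # every seed in a group shares sequence 0


def build_group(question: str, image_paths: List[str], carrier_text: str) -> List[SketchSeed]:
    """Build one row's group: objective + N images + one carrier text."""
    group_id = uuid.uuid4()  # one id shared by the whole group
    seeds = [SketchSeed("objective", question, group_id)]
    seeds += [SketchSeed("image", path, group_id) for path in image_paths]
    seeds.append(SketchSeed("text", carrier_text, group_id))
    return seeds


# FigStep: one typographic image per row, so 3 seeds.
figstep = build_group("<question>", ["row_0.png"], "<carrier prompt>")

# FigStep-Pro: several sub-images per row (3 here), so 5 seeds.
figstep_pro = build_group("<question>", ["a.png", "b.png", "c.png"], "<carrier prompt>")

print(len(figstep), len(figstep_pro))
```

Keeping the harmful question as a separate objective seed, rather than folding it into the text prompt, is what lets a scorer judge the response against the real intent while the model only sees the benign carrier text and the images.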

Parameters

| Parameter | Default | Notes |
| --- | --- | --- |
| `use_tiny` | `True` | Paper's headline experiments evaluate on SafeBench-Tiny. `False` loads the full 500-question SafeBench. |
| `variant` | `FigStepVariant.FIGSTEP` | `FIGSTEP_PRO` requires `use_tiny=True`. |
| `categories` | `None` | Filter by `FigStepCategory` members (e.g. `[FigStepCategory.FRAUD]`). |
| `source` / `source_type` | `None` / `"public_url"` | Standard overrides. |
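The constraints among these parameters can be sketched as a standalone validation function. This is not the loader's code: `SketchVariant`, `SketchCategory`, and `validate_options` are hypothetical stand-ins, the enum values are placeholders, and raising `ValueError` for non-enum inputs is an assumption (the PR only names `ValueError` explicitly for the `FIGSTEP_PRO` + `use_tiny=False` case).

```python
from enum import Enum
from typing import Optional, Sequence


class SketchVariant(Enum):
    """Hypothetical stand-in for FigStepVariant; values are placeholders."""

    FIGSTEP = "figstep"
    FIGSTEP_PRO = "figstep_pro"


class SketchCategory(Enum):
    """Hypothetical stand-in for FigStepCategory; only one of ten members shown."""

    FRAUD = "Fraud"


def validate_options(
    *,
    use_tiny: bool = True,
    variant: SketchVariant = SketchVariant.FIGSTEP,
    categories: Optional[Sequence[SketchCategory]] = None,
) -> None:
    """Reject option combinations the PR description says are unsupported."""
    # Raw strings such as "figstep_pro" are rejected; only enum members pass.
    if not isinstance(variant, SketchVariant):
        raise ValueError(f"variant must be a SketchVariant, got {variant!r}")
    # Pre-cut sub-images exist upstream only for the tiny subset.
    if variant is SketchVariant.FIGSTEP_PRO and not use_tiny:
        raise ValueError("FIGSTEP_PRO requires use_tiny=True")
    for category in categories or []:
        if not isinstance(category, SketchCategory):
            raise ValueError(f"category must be a SketchCategory, got {category!r}")


validate_options()  # defaults are valid
validate_options(categories=[SketchCategory.FRAUD])  # category filter is valid

try:
    validate_options(use_tiny=False, variant=SketchVariant.FIGSTEP_PRO)
except ValueError as exc:
    print(f"rejected: {exc}")
```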

Tests

tests/unit/datasets/test_figstep_dataset.py — 31 unit tests, all passing. Mocks _fetch_from_url and image helpers; covers default init, full/tiny URL routing, FIGSTEP_PRO + use_tiny=False ValueError, invalid enum and raw-string rejection, group structure for both variants, per-row metadata, category filter, empty-after-filter ValueError, missing-keys ValueError, failed-image skip, and sub-image discovery (sort / missing dir / wrong prefix).

Live fetch verification

Ran each configuration end-to-end against real upstream with cache=False (mirroring the asserts in tests/end_to_end/test_all_datasets.py):

| Configuration | Seeds | Result |
| --- | --- | --- |
| `figstep` tiny default | 150 | PASS (50 × objective + image + text) |
| `figstep` full SafeBench | 1500 | PASS |
| `figstep` tiny + Fraud filter | 15 | PASS |
| `figstep_pro` tiny | 281 | PASS (50 objectives + 181 image pieces + 50 texts; avg ~3.6 sub-images/row) |
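The seed counts in the table follow directly from the group shape (one objective, N images, one text per row). A short arithmetic check, using a hypothetical helper `figstep_seed_count`; the 5-row figure for the Fraud filter is inferred from 15 seeds / 3 seeds per row and is not stated in the PR.

```python
def figstep_seed_count(rows: int, images_per_row: int = 1) -> int:
    """Seeds for `rows` groups: 1 objective + images_per_row images + 1 text each."""
    return rows * (1 + images_per_row + 1)


tiny_default = figstep_seed_count(50)  # SafeBench-Tiny, FigStep variant
full_safebench = figstep_seed_count(500)  # full SafeBench, FigStep variant
tiny_fraud = figstep_seed_count(5)  # inferred: 5 Fraud rows in the tiny subset

# FigStep-Pro has a variable number of sub-images per row, so sum the parts.
pro_objectives, pro_images, pro_texts = 50, 181, 50
pro_total = pro_objectives + pro_images + pro_texts
avg_sub_images = pro_images / pro_objectives

print(tiny_default, full_safebench, tiny_fraud, pro_total, round(avg_sub_images, 2))
```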

Known limitation

The existing tests/end_to_end/test_all_datasets.py only parametrizes over each registered provider's default constructor — figstep_pro and use_tiny=False aren't covered automatically. This is the same limitation every multi-config loader in the repo has today; an audit / structural fix is being scoped in a separate session.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

romanlutz and others added 3 commits May 22, 2026 14:56

FEAT: Add FigStep SafeBench multimodal dataset loader

e597f4a

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Merge remote-tracking branch 'origin/main' into romanlutz/figstep-dataset-loader

bc45b52

Conflicts: doc/bibliography.md

TEST: Cover FigStep-Pro asset download and benign sentence helpers

647b78d

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

romanlutz wants to merge 3 commits into microsoft:main from romanlutz:romanlutz/figstep-dataset-loader
