FEAT: Backfill class-level metadata for all remote seed datasets #1780
Open
romanlutz wants to merge 4 commits into
Adds class-level `tags`, `size`, and `modalities` to all remote seed dataset loaders so they participate in `SeedDatasetFilter` discovery. Pins a recommended tag vocabulary and the 5-condition rule for the special `default` tag in `seed_metadata.py` as a soft contract, and enforces it via a new parametrized coverage test in `test_seed_dataset_provider.py`. Also renames `_SGXSTestDataset`'s non-canonical `multilingual_culture` tag to `multilingual` and drops `default` (the dataset is gated), and gates `_ORBenchBaseDataset` from auto-registration since it is not a usable loader on its own. No runtime behavior or public API changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
behnam-o
approved these changes
May 22, 2026
Contributor
couple of minor comments, but also looks good as is.
…uckets Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…taset-metadata
# Conflicts:
#   pyrit/datasets/seed_datasets/remote/comic_jailbreak_dataset.py
#   pyrit/datasets/seed_datasets/remote/harmbench_multimodal_dataset.py
#   pyrit/datasets/seed_datasets/remote/visual_leak_bench_dataset.py
#   pyrit/datasets/seed_datasets/remote/vlsu_multimodal_dataset.py
What
Adds class-level `tags`, `size`, and `modalities` to every remote seed dataset loader so they participate in `SeedDatasetFilter` discovery (e.g. `SeedDatasetFilter(tags={"default"})`). Before this change, only 5 of ~33 remote loaders declared metadata, so the others were silently skipped by metadata-driven filtering.

This is the follow-up to the review discussion on #1757 (cc @jsong468), where the reviewer asked why some loaders declared these fields and others didn't. Answer: most predate the metadata schema and just hadn't been backfilled.
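To illustrate why loaders without metadata were silently skipped, here is a minimal, self-contained sketch. It does not use the PyRIT API: `TagFilter`, `TaggedLoader`, `UntaggedLoader`, and `discover` are hypothetical stand-ins, and the "any overlapping tag matches" rule is an assumption made for the example, not a statement about how `SeedDatasetFilter` is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagFilter:
    """Hypothetical stand-in for SeedDatasetFilter; models tag matching only."""

    tags: frozenset

    def matches(self, loader_cls: type) -> bool:
        # A loader that declares no class-level tags can never match,
        # so it drops out of the results without raising an error.
        declared = getattr(loader_cls, "tags", None)
        if not declared:
            return False
        # Assumption: a loader matches when it shares at least one tag.
        return bool(self.tags & set(declared))


class TaggedLoader:
    """Hypothetical loader that declares class-level metadata."""

    tags = {"default", "safety"}


class UntaggedLoader:
    """Hypothetical loader written before the metadata schema existed."""


def discover(loaders, tag_filter):
    """Return the names of the loaders selected by the filter."""
    return [cls.__name__ for cls in loaders if tag_filter.matches(cls)]


selected = discover(
    [TaggedLoader, UntaggedLoader],
    TagFilter(tags=frozenset({"default"})),
)
print(selected)
```

Under these assumptions only `TaggedLoader` is selected. `UntaggedLoader` is excluded with no warning, which is the gap this backfill closes.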
How
- Pinned a recommended tag vocabulary as `RECOMMENDED_TAGS` in `pyrit/datasets/seed_datasets/seed_metadata.py`. Users can still set custom tags: the metadata parser does not enforce the vocabulary, but a new parametrized coverage test does.
- Pinned the 5-condition rule for the `default` tag inline in `seed_metadata.py`:
  - `await loader.fetch_dataset_async()` works with no manual setup.
  - `privacy`, `bias`, `multimodal`, `multilingual`, `refusal`, and `jailbreak` DO count.
- Backfilled `size` / `tags` / `modalities` based on the loader's docstring, tests, and upstream dataset card. Added inline `# N prompts` comments next to each `size` so reviewers can verify the bucket choice locally.
- Renamed the non-canonical `multilingual_culture` tag on `_SGXSTestDataset` to `multilingual` and dropped its `default` tag (SGXSTest is gated on HF).
- Marked `_ORBenchBaseDataset` as `should_register = False` (it has no usable `dataset_name`) and explicitly opted the three OR-Bench leaf classes back in.
- Added `TestRemoteLoaderMetadataCoverage`, a parametrized test that walks every concrete `_RemoteDatasetLoader` subclass via auto-registration and asserts: metadata is present, `tags` / `size` / `modalities` are non-empty, `size` is in `SeedDatasetSizeCategory`, and `tags` is a subset of `RECOMMENDED_TAGS` (catches future typos like `multilingual_culture`).

Class-level `harm_categories` is intentionally deferred: per-row `SeedPrompt.harm_categories` already labels individual prompts, and picking a "broadest" class-level summary is a judgment call better made by domain owners in a focused follow-up.

Backfill table
- `_AegisContentSafetyDataset`
- `_AyaRedteamingDataset`
- `_BabelscapeAlertDataset`
- `_BeaverTailsDataset`
- `_CBTBenchDataset`
- `_CCPSensitivePromptsDataset`
- `_DarkBenchDataset`
- `_EquityMedQADataset`
- `_ForbiddenQuestionsDataset`
- `_HarmBenchMultimodalDataset`
- `_HarmfulQADataset`
- `_JBBBehaviorsDataset`
- `_LibrAIDoNotAnswerDataset`
- `_LLMLatentAdversarialTrainingDataset`
- `_MedSafetyBenchDataset`
- `_MLCommonsAILuminateDataset`
- `_MultilingualVulnerabilityDataset`
- `_ORBench80KDataset`
- `_ORBenchHardDataset`
- `_ORBenchToxicDataset`
- `_PKUSafeRLHFDataset`
- `_PromptIntelDataset`
- `_RedTeamSocialBiasDataset`
- `_SaladBenchDataset`
- `_SimpleSafetyTestsDataset`
- `_SorryBenchDataset`
- `_SOSBenchDataset`
- `_TDC23RedteamingDataset`
- `_ToxicChatDataset`
- `_TransphobiaAwarenessDataset`
- `_VLGuardDataset`
- `_VLSUMultimodalDataset`
- `_XSTestDataset`
- `_SGXSTestDataset` (fixed; was `{default, safety, multilingual_culture}`)

Already-tagged (unchanged): `_HarmBenchDataset`, `_ComicJailbreakDataset`, `_VisualLeakBenchDataset`.

No-op
`SeedPrompt.harm_categories` `frozenset`/`tuple` style (cosmetic, out of scope)

Discussion link
#1757
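
For readers who want a feel for the registration gating and the coverage check described under How, here is a rough, self-contained sketch. It is not the PR's code. `RemoteLoader`, `REGISTRY`, `metadata_violations`, the example loader classes, and the size bucket names are hypothetical, and the tag set is only an illustrative subset. The use of `__init_subclass__` for auto-registration is also an assumption. The real test is parametrized, whereas this sketch uses a plain function so it runs without a test framework.

```python
# Illustrative values only; the real vocabulary lives in seed_metadata.py.
RECOMMENDED_TAGS = frozenset({"default", "safety", "multilingual", "refusal"})
SIZE_CATEGORIES = frozenset({"small", "medium", "large"})  # hypothetical buckets

REGISTRY: dict[str, type] = {}


class RemoteLoader:
    """Hypothetical stand-in for _RemoteDatasetLoader."""

    should_register = True
    tags: frozenset = frozenset()
    size = None
    modalities: frozenset = frozenset()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Assumed mechanism: subclasses register themselves unless gated.
        if cls.should_register:
            REGISTRY[cls.__name__] = cls


class ORBenchBase(RemoteLoader):
    should_register = False  # shared base, not a usable loader on its own


class ORBenchHard(ORBenchBase):
    # The flag is inherited, so a leaf class has to opt back in explicitly.
    should_register = True
    tags = frozenset({"refusal"})
    size = "medium"
    modalities = frozenset({"text"})


class TypoLoader(RemoteLoader):
    tags = frozenset({"multilingual_culture"})  # not in the vocabulary
    size = "small"
    modalities = frozenset({"text"})


def metadata_violations(cls: type) -> list[str]:
    """Collect the problems a coverage check would flag for one loader."""
    problems = []
    if not cls.tags:
        problems.append("tags is empty")
    if not cls.modalities:
        problems.append("modalities is empty")
    if cls.size not in SIZE_CATEGORIES:
        problems.append(f"size {cls.size!r} is not a known category")
    unknown = set(cls.tags) - RECOMMENDED_TAGS
    if unknown:
        problems.append(f"unknown tags: {sorted(unknown)}")
    return problems


report = {name: metadata_violations(cls) for name, cls in REGISTRY.items()}
print(report)
```

In this sketch `ORBenchBase` never reaches the registry, `ORBenchHard` passes cleanly, and `TypoLoader` is flagged for its out-of-vocabulary tag, mirroring the `multilingual_culture` case the coverage test is meant to catch.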