open-gitagent · shreyas-lyzr · May 25, 2026
diff --git a/agents/fajarsajid__agent-redteam/README.md b/agents/fajarsajid__agent-redteam/README.md
@@ -0,0 +1,73 @@
+# agent-redteam
+
+> *Agentic LLM Red Team Harness — systematically probe AI agent system prompts for adversarial vulnerabilities.*
+
+**Author:** [Fajar Sajid](https://github.com/fajarsajid) · Purdue University  
+**Category:** Security  
+**Model:** Claude (Anthropic)
+
+---
+
+## What It Does
+
+`agent-redteam` is a CLI evaluation tool that uses Claude as an adversarial engine to red-team any AI agent's system prompt. You provide a system prompt; the harness generates realistic attack probes across eight vulnerability categories and returns structured findings with CVSS-like scores, MITRE ATT&CK mappings, and actionable remediation advice.
+
+It's designed for security researchers, AI engineers, and teams that need to validate agent safety constraints before deployment.
+
+---
+
+## Key Capabilities
+
+| Capability | Detail |
+|---|---|
+| **8 attack categories** | Direct prompt injection, indirect prompt injection, identity spoofing, credential exfiltration, privilege escalation, goal hijacking, data exfiltration, safety boundary bypass |
+| **MITRE ATT&CK mapping** | Every finding linked to a MITRE technique ID |
+| **CVSS-like scoring** | Float 0.0–10.0 per finding, severity bucket: critical / high / medium / low |
+| **CI/CD integration** | Exits `1` when a critical or high finding is reported and `0` on pass, so it can gate a pipeline directly |
+| **Dual output** | `--output report.md` (human-readable) + `--json findings.json` (SIEM/tooling) |
+| **Minimal dependencies** | Only `requests` is required, which keeps the supply-chain surface small |
+| **Reproducible research** | All empirical results and experiment configs included in the repo |
+
+---
+
+## Example Usage
+
+```bash
+git clone https://github.com/fajarsajid/agent-redteam
+cd agent-redteam
+pip install requests
+export ANTHROPIC_API_KEY=sk-ant-...
+
+# Full red team scan
+python redteam.py --prompt examples/orderbot_prompt.txt \
+    --output report.md --json findings.json
+
+# Quick scan (CI mode — exits 1 on critical/high)
+python redteam.py --prompt examples/orderbot_prompt.txt --quiet
+
+# List all attack categories
+python redteam.py --list-categories
+```
+
+---
+
+## Research Findings
+
+This harness was used in a Purdue University study (2025) evaluating LLM agent safety:
+
+- **49.5%** mean violation rate across attack categories
+- **Indirect injection** produced a 70.8% violation rate vs 54.2% for direct injection
+- **Multi-turn interactions** increased violation rate from 45.8% (1-turn) to 77.1% (7-turn)
+- **Context drift** was identified by the study as the most dangerous failure mode in production agentic systems
+
+---
+
+## Compliance / Safety
+
+- `human_in_the_loop: destructive` — findings are advisory; human operators decide remediation
+- `audit_logging: true` — all probe/finding data is structured for downstream audit
+- Never autonomously modifies the target agent or its deployment
+
+---
+
+*See the [full research paper](https://github.com/fajarsajid/agent-redteam/blob/main/paper.pdf) for methodology, results tables, and implications.*
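
Review note on the README above: the "CVSS-like scoring" and "CI/CD integration" rows together imply a mapping from a 0.0–10.0 score to a severity bucket, and from the buckets to a process exit code. The README does not state the thresholds, so the following is only a minimal sketch, assuming the CVSS v3.x qualitative ranges (with 0.0 folded into `low`, since the README lists four buckets). The `Finding` class and its field names are hypothetical and are not taken from `redteam.py`.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical finding record; field names are assumptions, not the tool's schema."""
    category: str
    score: float  # CVSS-like score in [0.0, 10.0]


def severity_bucket(score: float) -> str:
    """Map a score to a bucket using CVSS v3.x qualitative ranges (assumed thresholds)."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"


def exit_code(findings: list[Finding]) -> int:
    """Return 1 if any finding is critical or high, otherwise 0."""
    blocking = {"critical", "high"}
    return 1 if any(severity_bucket(f.score) in blocking for f in findings) else 0
```

Under these assumptions a scan with only medium and low findings would pass the pipeline, while a single finding scored 7.0 or above would fail it. If the harness uses different cut-offs, only `severity_bucket` would need to change.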
diff --git a/agents/fajarsajid__agent-redteam/metadata.json b/agents/fajarsajid__agent-redteam/metadata.json
@@ -0,0 +1,15 @@
+{
+  "name": "agent-redteam",
+  "author": "fajarsajid",
+  "description": "CLI red team harness that probes AI agent system prompts for vulnerabilities: prompt injection, identity spoofing, credential exfiltration, and safety bypass — powered by Claude.",
+  "repository": "https://github.com/fajarsajid/agent-redteam",
+  "path": "",
+  "version": "1.0.0",
+  "category": "security",
+  "tags": ["red-team", "llm-security", "adversarial", "prompt-injection", "ai-safety", "claude", "vulnerability-assessment", "mitre-attack", "ci-cd", "research"],
+  "license": "MIT",
+  "model": "claude-sonnet-4-5-20250929",
+  "adapters": ["claude-code", "system-prompt"],
+  "icon": false,
+  "banner": false
+}
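
Review note on `metadata.json`: a manifest like this can be sanity-checked before merge. The sketch below is hedged: the key list is copied from the file in this diff, and treating every key as required, or the version as strictly `MAJOR.MINOR.PATCH`, is an assumption rather than a documented registry rule. `check_manifest` is a hypothetical helper name.

```python
import json
import re

# Keys observed in the metadata.json above; treating all of them as
# required is an assumption, not a documented registry rule.
EXPECTED_KEYS = {
    "name", "author", "description", "repository", "path", "version",
    "category", "tags", "license", "model", "adapters", "icon", "banner",
}

# Assumed version shape: three dot-separated integers, e.g. "1.0.0".
VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")


def check_manifest(text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    data = json.loads(text)
    problems = [f"missing key: {k}" for k in sorted(EXPECTED_KEYS - set(data))]
    version = data.get("version", "")
    if not isinstance(version, str) or not VERSION_RE.match(version):
        problems.append(f"version is not MAJOR.MINOR.PATCH: {version!r}")
    for key in ("tags", "adapters"):
        if key in data and not isinstance(data[key], list):
            problems.append(f"{key} must be a list")
    return problems
```

Calling `check_manifest(open("metadata.json").read())` on the file as submitted should return an empty list under these assumptions.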