
AgentGymLeader 👋

Hey, glad you wandered in. The handle's half a joke and half the actual thesis.

In the Pokémon games, a Gym Leader is the boss at the end of the gym. The twist is that they're on your side: the whole job is to push a team until it's strong and resourceful enough to get past them, to recognize a team when its setup genuinely holds up, and to raise the level of the whole region, not just the few who walk in.

My type is AI agent governance. Same shape of work: take a team of agents (Claude, Codex, Gemini), push them until they're trustworthy enough to actually delegate to, and treat "it holds up" as something you build toward — for everyone, not just yourself.

So most of the work happens in the open, lifting the tools people already build on:

All merged contributions, always current »


🔭 The through-line

It all points at one project: PENSO, an operating system for AI organizations built on governance by structure — permission models, separation of execution and audit, and decision records implemented as constraints the system can't bypass, not policy PDFs bolted on afterward. Same idea as the gym: a team earns real autonomy because the structure around it is sound. More at governances.ai.


🤝 More open source & standards

Beyond the headline merges, I work on the semantic-convention layer that defines how agent runtimes get described, kept implementation-neutral, with producer-owned context left out of scope:

  • litellm (LLM gateway, ~50k★) — merged PRs »
  • Agent execution-record proposals — review input on keeping the normative contract in the spec itself, rather than in any single reference implementation, so independent implementations interoperate on equal footing
  • otel-agent-evidence-sample — a small reference for the opaque correlation_id evidence-linking pattern (MIT)
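
For a sense of what the opaque correlation_id pattern in the last item looks like, here is a minimal sketch using only the Python standard library. It is not code from otel-agent-evidence-sample: the class names, the `link` helper, and the use of a plain attribute dictionary in place of a real OpenTelemetry span are all assumptions made for illustration. The idea shown is that the runtime signal carries only a random identifier, and the evidence itself stays in a separate store keyed by that identifier.

```python
import secrets
from dataclasses import dataclass, field


def new_correlation_id() -> str:
    """Return a random identifier that encodes nothing about the evidence."""
    return secrets.token_hex(16)


@dataclass
class RuntimeSignal:
    """Hypothetical stand-in for a span or log record from an agent runtime."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class EvidenceRecord:
    """Hypothetical stand-in for evidence held outside the telemetry pipeline."""
    correlation_id: str
    payload: dict


def link(signals, evidence):
    """Pair each signal with the evidence record that shares its correlation_id.

    Signals without an id, or with an id that matches no evidence, are skipped.
    """
    by_id = {record.correlation_id: record for record in evidence}
    pairs = []
    for signal in signals:
        cid = signal.attributes.get("correlation_id")
        if cid in by_id:
            pairs.append((signal, by_id[cid]))
    return pairs


if __name__ == "__main__":
    cid = new_correlation_id()
    signal = RuntimeSignal("agent.step", {"correlation_id": cid})
    record = EvidenceRecord(cid, {"kind": "approval"})
    for s, e in link([signal], [record]):
        print(s.name, "->", e.payload["kind"])
```

Because the identifier is random, the telemetry side learns nothing about the evidence beyond the fact that a link exists; whoever holds the evidence store decides who can resolve it.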

🔒 Security hardening (selected)

Memory-safety, injection, and trust-boundary fixes proposed into ML / data-infra OSS — AI-assisted discovery, human-verified before submission. Status shown honestly; most are under maintainer review.

  • MongoDB BSON driver — bounds check for an out-of-range embedded-document length in the BSON C-extension raw-batch path · under review
  • Faiss — guard against integer overflow in index-deserialization size checks · under review
  • AWS SageMaker Python SDK — use the caller's extract_path (not CWD) as the tar-extraction containment base · under review
  • LlamaIndex — fix SQL injection in the MariaDB / DB2 vector stores (sibling of CVE-2025-1793) · under review
  • Feast — warn operators when the registry server starts with authentication disabled · under review

Method: a cross-model find → verify → fix loop (Claude + Codex), every finding adversarially checked before a PR is opened — no exploit code, regression test where the project's suite allows.
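
As an illustration of the kind of trust-boundary fix listed above, the tar-extraction item comes down to choosing the right containment base. The sketch below is not the SageMaker SDK's code; `is_within` and `check_members` are hypothetical helpers, and the only point carried over from the list is that member paths are validated against the caller's `extract_path` rather than the current working directory.

```python
import os


def is_within(extract_path: str, member_name: str) -> bool:
    """Return True if member_name resolves to a location inside extract_path."""
    base = os.path.realpath(extract_path)
    # Joining against extract_path (not os.getcwd()) is the point of the check.
    target = os.path.realpath(os.path.join(base, member_name))
    return os.path.commonpath([base, target]) == base


def check_members(member_names, extract_path: str) -> None:
    """Raise ValueError for the first member that would escape extract_path."""
    for name in member_names:
        if not is_within(extract_path, name):
            raise ValueError(f"unsafe archive member: {name!r}")


if __name__ == "__main__":
    check_members(["model/weights.bin", "config.json"], "/tmp/out")
    try:
        check_members(["../outside.txt"], "/tmp/out")
    except ValueError as exc:
        print(exc)
```

This covers path traversal through member names only. Symlink and hard-link members need their own handling, which Python's built-in tarfile extraction filters are designed to address.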


🧠 Background

  • 🎓 Tokyo Institute of Technology — Robotics (graduated top of class)
  • 🦅 Human-Powered Aircraft Competition — 1st place, as aircraft architect
  • Robotics background, not CS — running a fully AI-native org. The interesting problem isn't can-build vs. can-deploy; it's can-build vs. can-govern.

🛠️ Stack

Python, Claude, GitHub Actions, FastAPI, Google Cloud, PostgreSQL

Daily drivers: Claude, Codex, Gemini, Python, GitHub Actions, Cloud Run


💬 Collaboration

Interested in agent governance, human-AI decision systems, or AI org design? Open an issue in this repository to start a conversation.

I don't publish a public email address on GitHub.
