Conversation Modes - Admin Created System Prompts - Opt In#411

Open
DerrickF wants to merge 1246 commits into main from feature/custom-system-prompts

@DerrickF (Contributor)

Summary
Adds Conversation Modes: admins manage a catalogue of custom system prompts (Guided Learning, Concise, Plan-First, Devil's Advocate, Citation-Strict, Caveman, etc.), and users opt in per conversation from the model-settings panel. The active mode is appended to the base system prompt at invocation time, and the selection persists across sessions, refreshes, and devices.

Why
Assistants cover the "scoped to a body of knowledge" use case. They don't cover behavioural overlays — how the assistant works on whatever topic you're already discussing. Conversation modes fill that gap:

- Topic-agnostic; one mode works across any conversation.
- Mid-conversation toggle without losing context.
- Admin-curated so the catalogue stays consistent across the institution.
- Users see name + description only; prompt text is server-side, never exposed.

What's in the box
For admins — /admin/system-prompts CRUD UI: create, edit, enable/disable, delete prompts. Status defaults to enabled; disabled prompts are hidden from users but kept for audit.

For users — A new "Conversation Mode" radio group in the model-settings panel and a dismissable chip in the chat input. Selection persists per-conversation and survives refresh.

For the inference path — A new system_prompt_resolver module that gates on resume / continuation / preview / assistant-attached turns and appends the prompt with a consistent ## Active Mode: header.
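
As a rough illustration of the append step, the sketch below composes a base prompt with an active mode. Only the "## Active Mode:" header comes from the description above; the function name, its parameters, and the blank-line separator are assumptions made for the example and may not match the real system_prompt_resolver module.

```python
from typing import Optional

ACTIVE_MODE_HEADER = "## Active Mode:"  # header text taken from the PR description


def compose_system_prompt(
    base_prompt: str,
    mode_name: Optional[str],
    mode_text: Optional[str],
) -> str:
    """Append the active mode to the base prompt (hypothetical helper)."""
    # No active mode: the base prompt goes through unchanged.
    if not mode_name or not mode_text:
        return base_prompt
    # Assumption: the header line carries the mode name, followed by the prompt body.
    return (
        f"{base_prompt.rstrip()}\n\n"
        f"{ACTIVE_MODE_HEADER} {mode_name}\n"
        f"{mode_text.strip()}"
    )
```

Keeping the header format in one place means every mode is appended the same way, which is what the "consistent header" wording above suggests.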

How it works

admin → /admin/system-prompts (CRUD)
  └─ DynamoDB: PROMPT# / METADATA

user → /system-prompts (read; enabled-only, name+description only)
  → settings panel selection
    └─ frontend signal _activePromptId (bound to session id)
      └─ on submit: forwarded as selected_prompt_id on InvocationRequest

inference path
  → resolver: request id > session metadata > none
  → mirrors id back onto session preferences for resume / refresh
  → appends prompt to base system prompt

Selection precedence
1. Request body (current turn's choice): handles the first turn of a brand-new session before any metadata row exists.
2. Session preferences (persisted): handles resume, refresh, and new-device flows.
Skipped on resume (snapshot owns the prompt), continuation (would invalidate prompt caching), preview (live form edits drive it), and assistant-attached turns (assistants are KB-grounded with their own instructions).
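
The precedence and gating rules above can be sketched roughly as follows. This is a minimal illustration, not the PR's resolver: TurnContext, its field names, and the is_enabled callback are all hypothetical stand-ins for whatever the inference path actually passes around.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TurnContext:
    """Hypothetical view of one invocation; field names are illustrative."""
    is_resume: bool = False
    is_continuation: bool = False
    is_preview: bool = False
    assistant_id: Optional[str] = None
    request_prompt_id: Optional[str] = None  # selected_prompt_id on the request
    session_prompt_id: Optional[str] = None  # persisted session preference


def resolve_prompt_id(
    ctx: TurnContext,
    is_enabled: Callable[[str], bool],
) -> Optional[str]:
    """Return the prompt id to apply on this turn, or None to apply nothing."""
    # Gating: these turn types never get a mode appended.
    if ctx.is_resume or ctx.is_continuation or ctx.is_preview or ctx.assistant_id:
        return None
    # Precedence: request id > session metadata > none.
    if ctx.request_prompt_id is not None:
        prompt_id = ctx.request_prompt_id
    else:
        prompt_id = ctx.session_prompt_id
    # A missing or disabled prompt falls back silently instead of raising.
    if prompt_id is None or not is_enabled(prompt_id):
        return None
    return prompt_id
```

Checking the gates before looking at either id keeps the skipped turn types cheap and makes the gating matrix easy to test in isolation.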

Wire format
Snake_case end-to-end, matching the user_menu_links convention. selected_prompt_id uses null-as-clear semantics on the BFF metadata PUT, leveraging Pydantic's model_fields_set to distinguish "explicitly cleared" from "field omitted — leave unchanged".
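
To make the null-as-clear semantics concrete, here is a small sketch using Pydantic v2's model_fields_set. The SessionMetadataUpdate model and the apply_update helper are hypothetical; only the selected_prompt_id field name and the use of model_fields_set come from the description above.

```python
from typing import Optional

from pydantic import BaseModel


class SessionMetadataUpdate(BaseModel):
    """Hypothetical PUT body; only selected_prompt_id is named in the PR."""
    selected_prompt_id: Optional[str] = None


def apply_update(current: Optional[str], body: SessionMetadataUpdate) -> Optional[str]:
    """Return the prompt id that should be stored after this update."""
    # model_fields_set contains only the fields the client actually sent.
    if "selected_prompt_id" not in body.model_fields_set:
        return current  # field omitted: leave unchanged
    return body.selected_prompt_id  # field sent: a value sets it, null clears it
```

With this shape, an empty body keeps the stored mode, while a body of {"selected_prompt_id": null} clears it, even though both parse to the same attribute value of None.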

Infrastructure changes
- New DynamoDB table provisioned by InfrastructureStack: -system-prompts, PAY_PER_REQUEST, point-in-time recovery, AWS-managed encryption.
- SSM parameters //admin/system-prompts-table-{name,arn}.
- App API IAM: full CRUD on the table.
- Inference API IAM: dynamodb:GetItem only.

Tests
- 32 system_prompts tests: model, repository, service, admin + user routes; covers snake_case wire format, Literal status validation, TOCTOU-safe update.
- 5 sessions update tests: null-clear, omit-leaves-unchanged, disabled-prompt rejection.
- 15 resolver tests: gating matrix, composition format, request-id precedence, persistence side-effect, exception swallowing.

666 backend tests pass. Frontend typecheck and Angular build clean.

Manual verification
- Create / edit / disable / delete prompts as admin
- Pick a mode on the home page; submit; verify prompt is applied to first turn (no metadata round-trip needed)
- Pick a mode on /s/; refresh; verify chip rehydrates from persisted preferences
- Switch sessions; verify chip resets and doesn't leak across conversations
- Dismiss chip; verify next turn doesn't apply the prompt
- Disable a prompt admin-side; verify a user with that prompt selected silently falls back (no error)

Known limitations / follow-ups
- Concurrent-write window on set_selected_prompt_id: the same read-modify-write on the preferences map that update_session_activity already has. Documented in code. The proper fix is a nested-attribute SET, which requires pre-creating the preferences map everywhere first; tracked separately.
- Sub-200ms clear race: a user who clears a prompt and submits within the BFF persist round-trip can have one stale "mode applied" turn. Bounded blast radius, self-healing, deliberately not fixed by widening the wire protocol.
- Pre-existing enabled_tools falsy bug (unrelated): the truthiness check "if request.enabled_tools" treats [] as "don't update", so a user's choice to disable all tools is never persisted. Worth a separate ticket.
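
The enabled_tools bug in the last item comes down to truthiness versus an explicit None check. The sketch below shows the distinction with a hypothetical helper; the function and parameter names are illustrative and do not come from the codebase.

```python
from typing import List, Optional


def tools_to_persist(requested: Optional[List[str]], stored: List[str]) -> List[str]:
    """Return the tool list that should be stored after this request."""
    # A bare `if requested:` is False for [], so "disable everything"
    # would be treated like "field not sent" and the old list would survive.
    if requested is not None:
        return requested
    return stored
```

An empty list is a real user choice here, so only None (field absent) should mean "leave the stored value alone".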

Operational follow-up after deploy
The temp DynamoDB table created manually before CDK provisioned the real one needs to be deleted post-deploy. Steps:

1. Confirm -system-prompts exists post-deploy and SSM params resolve.
2. Smoke-test admin CRUD round-trip end-to-end.
3. Migrate any seeded prompts from the temp table (small enough to re-create through the UI).
4. aws dynamodb delete-table --table-name .

Out of scope
- Mode stacking (Concise + Code Reviewer, etc.): single active mode per session for v1.
- Role-scoped modes (visible_to_roles): would let modes be scoped to faculty / students / staff. Worth adding once role data is everywhere it needs to be.
- Department-level catalogues: central catalogue only for v1; could grow into per-college / per-department admin scoping later.

colinmxs and others added 30 commits April 8, 2026 11:23
…test skipping

- Enable compaction by default in CompactionConfig
- Increase protected_turns default from 2 to 3
- Add pytest marker to skip integration tests when AGENTCORE_MEMORY_ID is not set
- Fix import path for get_metadata_storage in cache savings tests from metadata_storage.get_metadata_storage to storage.get_metadata_storage
- Ensures integration tests only run in appropriate environments with required AWS credentials
… and cleanup

- Mock AgentCoreMemorySessionManager.initialize() to simulate SDK behavior
- Add _mock_sdk_initialize shim that loads messages and validates agent uniqueness
- Track active patches in fixture scope for proper cleanup on teardown
- Update fixture docstring to document initialize() mocking and message control
- Convert fixture to generator with yield to enable patch cleanup
- Allow tests to control loaded messages via mgr.read_agent and mgr.list_messages
…nsolidation, Trivy supply chain fix (#137)

⚠️ BREAKING CHANGE: Authentication replaced with AWS Cognito.
The legacy generic OIDC implementation has been removed with no
backward compatibility layer. Existing deployments must re-bootstrap.

Cognito First-Boot Authentication:
- Cognito User Pool, App Client, and Domain provisioned in Infrastructure stack
- CognitoJWTValidator replaces GenericOIDCJWTValidator
- New system/ module for first-boot setup, Cognito user/group management
- New cognito_idp_service for federated identity provider CRUD via Cognito IdP APIs
- First-boot page with admin account creation (race-condition-safe DynamoDB writes)
- Frontend auth flow rewritten for Cognito OAuth 2.0 + PKCE
- Runtime-provisioner and runtime-updater Lambda functions removed (2,800+ lines)
- Backend OIDC service, token exchange, and discovery endpoints removed (1,318 lines)
- 2,057 lines of new Cognito test coverage (IdP service, JWT validator, first-boot, system)

RBAC Consolidation:
- Single require_app_roles dependency replaces 6 role-checking functions/decorators
- User roles enriched from stored DynamoDB profile during token processing
- Profile cache invalidation on sync for immediate role updates
- JSON array parsing for custom:roles claim (Entra ID compatibility)
- jwt_role_mappings updates allowed on system_admin role

CORS Unification:
- buildCorsOrigins() shared helper across all 6 CDK stacks
- S3 CORS made conditional, ExposedHeaders→ExposeHeaders fix
- Python APIs read CORS_ORIGINS env var (replaces allow_origins=['*'])

Security:
- Trivy action upgraded v0.28.0→v0.35.0 — old SHA was compromised in
  March 2026 supply chain attack (GHSA-69fq-xp46-6x23)

CI/CD:
- CDK_DOMAIN_NAME and CDK_CORS_ORIGINS added to all workflow jobs
- App API synth-cdk actually skipped on PRs (guard was missing despite beta.20 docs)
- SSM StringParameter creation guarded against empty values

Bootstrap:
- seed_bootstrap_data.py sole owner of RBAC role seeding (removed from app startup)
- system_admin role seeded with jwt_role_mappings=['system_admin']
- Additive JWT mapping seeding for existing deployments

Documentation:
- 54,665 lines of outdated specs and AI artifacts purged (121 files)

Dependencies:
- Python: fastapi 0.135.3, uvicorn 0.44.0, boto3 1.42.83, strands-agents 1.34.1,
  bedrock-agentcore 1.6.0, google-genai 1.70.0, ruff 0.15.9, mypy 1.20.0
- Frontend: Angular 21.2.7, katex 0.16.45, mermaid 11.14.0, Analog.js alpha.26
- Infrastructure: aws-cdk-lib 2.248.0, aws-cdk 2.1117.0, ts-jest 29.4.9
….py (#139)

Create agents/main_agent/config/constants.py with EnvVars, Defaults, and
Prefixes classes. Update all 13 modules to import from the centralized
constants instead of using inline os.getenv() with hardcoded strings.

This eliminates scattered magic strings and provides a single reference
for all configuration. Zero behavior change — all values are identical.

543/543 tests passing.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: centralize env vars and magic strings into config/constants.py

Create agents/main_agent/config/constants.py with EnvVars, Defaults, and
Prefixes classes. Update all 13 modules to import from the centralized
constants instead of using inline os.getenv() with hardcoded strings.

This eliminates scattered magic strings and provides a single reference
for all configuration. Zero behavior change — all values are identical.

543/543 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: extract BaseAgent ABC and ChatAgent from MainAgent

Split MainAgent into a three-tier hierarchy:
- BaseAgent (ABC): shared init for model config, tools, session, streaming
- ChatAgent(BaseAgent): Strands Agent creation and text streaming
- MainAgent(ChatAgent): backward-compatible alias (pass-through)

All existing callers continue to import and use MainAgent unchanged.
The _build_filtered_tools() helper is extracted from _create_agent() for
reuse by future agent types (SkillAgent, VoiceAgent).

543/543 tests passing — zero behavior change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduce agent_types.py with a pluggable registry pattern:
- create_agent(agent_type, **kwargs) → BaseAgent subclass
- register_agent_type(name, cls) for dynamic registration
- ChatAgent registered as "chat" by default

Future agent types (skill, voice) will register themselves here.
Existing code is unchanged — MainAgent still works as before.
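
For readers unfamiliar with the registry pattern this commit describes, a minimal sketch follows. The function names create_agent and register_agent_type and the "chat" key come from the message above; the class bodies, the _AGENT_TYPES dict, and the error handling are assumptions for illustration and are not taken from agent_types.py.

```python
from typing import Any, Dict, Type


class BaseAgent:
    """Stand-in for the real BaseAgent; it just keeps its keyword arguments."""

    def __init__(self, **kwargs: Any) -> None:
        self.options = kwargs


class ChatAgent(BaseAgent):
    pass


# Hypothetical module-level registry mapping a type name to an agent class.
_AGENT_TYPES: Dict[str, Type[BaseAgent]] = {}


def register_agent_type(name: str, cls: Type[BaseAgent]) -> None:
    """Register (or replace) the class used for a given agent type name."""
    _AGENT_TYPES[name] = cls


def create_agent(agent_type: str, **kwargs: Any) -> BaseAgent:
    """Instantiate the class registered under agent_type."""
    try:
        cls = _AGENT_TYPES[agent_type]
    except KeyError:
        raise ValueError(f"Unknown agent type: {agent_type!r}") from None
    return cls(**kwargs)


register_agent_type("chat", ChatAgent)
```

New agent types can then register themselves at import time without create_agent needing to know about them.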

552/552 tests passing (9 new factory tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement three-level skill architecture adapted from sample-strands-agent:
- Level 1: Lightweight skill catalog injected into system prompt
- Level 2: SKILL.md instructions loaded on-demand via skill_dispatcher
- Level 3: Tool execution via skill_executor

New modules:
- skills/skill_registry.py: Discovers SKILL.md files, binds tools, serves catalog
- skills/skill_tools.py: skill_dispatcher + skill_executor Strands @tool functions
- skills/decorators.py: @Skill() decorator and register_skill() for tool tagging
- skill_agent.py: SkillAgent(ChatAgent) with progressive disclosure override
- skills/definitions/web-search/SKILL.md: Example skill definition

SkillAgent registered as "skill" in agent_types factory.
Existing behavior completely unchanged — SkillAgent is additive only.

590/590 tests passing (38 new skill tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…143)

Implement VoiceAgent(BaseAgent) for bidirectional voice using Nova Sonic 2:
- BidiNovaSonicModel with configurable voice, sample rate, and model
- Voice-text continuity via _load_text_history() from text session
- Separate agent_id ("voice") to prevent session state conflicts
- Voice-optimized system prompt with conversational guidelines
- PyAudio mock for server-side (browser uses Web Audio API)
- Conditional registration — only available with strands-agents[bidi]

Add voice-related constants to config/constants.py (EnvVars + Defaults).
Register "voice" type in agent_types factory.

606/606 tests passing (16 new voice tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement three approval hook categories following the sample-strands-agent
pattern, all using Strands BeforeToolCallEvent:

- EmailApprovalHook: Gates send_email, delete_emails, forward_email, etc.
- ExternalWriteApprovalHook: Gates create_pull_request, deploy, push_code, etc.
- DangerousToolApprovalHook: Gates delete_file, drop_table, execute_sql, etc.

Hooks set _approval_required/_approval_message on the tool_use dict for
the streaming layer to surface to the client for user confirmation.

All hooks registered in BaseAgent._create_hooks() — inherited by all
agent types (ChatAgent, SkillAgent, VoiceAgent).

618/618 tests passing (12 new approval hook tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
)

* feat: add bidi dependency, WebSocket voice route, and test client

Wire up the VoiceAgent for end-to-end testing:

- Add strands-agents[bidi] optional dependency group to pyproject.toml
- Fix BidiAgent/BidiNovaSonicModel import paths (strands.experimental.bidi)
- Create voice_routes.py with WebSocket endpoint at /voice/stream
  - JWT auth from query params (trusted decode, same as invocations)
  - Bidirectional protocol: audio/text input, agent event streaming
  - Debug endpoints: GET /voice/sessions, DELETE /voice/sessions/{id}
- Register voice router in inference API main.py
- Add test_voice_client.py script for manual WebSocket testing

632/632 tests passing (14 new voice route tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: handle CancelledError in VoiceAgent.stop() during teardown

The BidiAgent's Nova Sonic stream teardown can raise CancelledError
when pending AWS SDK futures are cancelled during shutdown. This is
expected behavior, not an error.

- VoiceAgent.stop(): catch CancelledError and Exception from BidiAgent
- voice_routes.py finally block: catch BaseException (CancelledError
  is a BaseException in Python 3.12, escaping except Exception)
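
As an illustration of why a plain "except Exception" is not enough here, the sketch below awaits a teardown coroutine and reports how it ended. The helper name stop_quietly and its return values are hypothetical; the only fact relied on is that asyncio.CancelledError derives from BaseException rather than Exception.

```python
import asyncio
from typing import Awaitable


async def stop_quietly(stop_coro: Awaitable[None]) -> str:
    """Await a teardown coroutine and report how it ended instead of raising."""
    try:
        await stop_coro
        return "stopped"
    except asyncio.CancelledError:
        # Not matched by `except Exception`, so it needs its own clause.
        return "cancelled"
    except Exception:
        return "failed"
```

Swallowing cancellation is only reasonable in a final teardown path like the one described, where there is nothing left to cancel.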

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: pass session_id and agent_id to list_messages in voice history

AgentCoreMemorySessionManager.list_messages() requires session_id and
agent_id positional args. Pass session_id=self.session_id and
agent_id="default" to read the text chat agent's history for
voice-text continuity. Use the SDK's limit param instead of
post-slicing.

Update tests to verify the correct call signature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use BidiAgent.receive() for voice event streaming

BidiAgent uses receive() as its event source, not stream_async().
Audio/text input is sent via send_audio()/send_text() separately,
and receive() yields typed events (BidiAudioStreamEvent,
BidiTranscriptStreamEvent, etc.) asynchronously.

- VoiceAgent.stream_async(): iterate BidiAgent.receive(), yield
  event.as_dict() for JSON-serializable dicts
- voice_routes._send_to_client(): simplified to handle dicts directly
  since stream_async now yields dicts, not strings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Angular voice components for Nova Sonic bidirectional audio

Frontend voice support with three-layer architecture:

New services (frontend/ai.client/src/app/session/services/voice/):
- pcm-utils.ts: Pure PCM encoding/decoding (Float32↔Int16↔base64)
- AudioRecorderService: Mic capture via Web Audio API → 16kHz PCM chunks
- AudioPlayerService: Gapless base64 PCM playback with interruption support
- VoiceChatService: WebSocket orchestration + state machine
  (idle → connecting → listening → speaking)

Modified components:
- chat-input: Voice toggle button with animated state indicators
  (pulsing red = listening, bouncing green = speaking, spinner = connecting)
- chat-input template: Live transcript overlay during voice mode
- session.page.ts: Wire voice response completions to message list
- MessageMapService: addVoiceMessage() for finalized voice transcripts

TypeScript compiles cleanly (tsc --noEmit). Angular build requires
Node 20.19+ (current machine has 20.18.1).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: convert SessionMessage to dict for BidiAgent and fix TS2774

Backend: _load_text_history() now calls .to_dict() on SessionMessage
objects before passing to BidiAgent. Nova Sonic expects plain dicts
with {"role": "...", "content": [...]}, not SessionMessage objects.

Frontend: Fix TS2774 in AudioRecorderService — use typeof check
instead of truthiness check for getUserMedia function detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use to_message() instead of to_dict() for BidiAgent history

SessionMessage.to_dict() wraps the message in metadata:
  {"message": {"role": ..., "content": [...]}, "message_id": 0, ...}

SessionMessage.to_message() returns the plain message dict:
  {"role": "user", "content": [...]}

Nova Sonic's _get_message_history_events expects the plain format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use BidiAgent.send() and receive() APIs correctly

BidiAgent has send(dict) and receive() — not send_audio()/send_text()
or stream_async(). Align VoiceAgent methods with the actual SDK:

- send_audio(): calls self._bidi_agent.send({"type": "bidi_audio_input", ...})
- send_text(): calls self._bidi_agent.send({"type": "bidi_text_input", ...})
- receive_events(): wraps self._bidi_agent.receive() with as_dict() conversion
- stream_async(): now a no-op stub (voice uses receive_events() instead)

Update voice_routes._send_to_client to call receive_events() not stream_async().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Implement feature X to enhance user experience and optimize performance

* feat: add voice overlay component for voice interactions

- Implemented VoiceOverlayComponent with HTML, CSS, and TypeScript files.
- Added styles for visualizer orb and status badges using Tailwind CSS.
- Integrated voice status management and session handling in the component.
- Enhanced voice chat service to support transcript entries and reveal logic.
- Updated session page to handle voice overlay closure and persist transcripts as messages.
- Introduced configuration constants for voice processing parameters.

* feat: enhance voice agent with real-time cost calculation and metadata handling

* fix: refine token usage handling and improve message processing in voice components

* fix: sanitize user-provided values in log statements to prevent log injection

Addresses CodeQL alert #567 (py/log-injection). All user-provided values
(session_id, user_id, msg_type, enabled_tools) are now passed through
_sanitize_log() which strips newline and carriage return characters before
being interpolated into log messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: update WebSocket voice streaming endpoint for AgentCore compatibility

* fix: ensure config message is required for WebSocket voice stream authentication
…ig (#156)

* feat: update WebSocket voice streaming endpoint for AgentCore compatibility

* fix: ensure config message is required for WebSocket voice stream authentication

* feat: add protocol configuration for HTTP support in InferenceApiStack
…attern (#159)

* fix: align voice WebSocket with reference architecture accept-first pattern

Rewrites voice_stream to match the sample-strands-agent-with-agentcore
reference architecture:

- Accept WebSocket immediately (AgentCore validates auth at proxy layer)
- Extract params via helper functions: custom header → query param → config message
- Config message always read to supplement missing params in cloud mode
- /voice/stream as main route, /ws as alias for AgentCore Runtime
- Frontend uses /voice/stream for local dev, /ws for AgentCore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add missing try block in voice_stream causing IndentationError

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* basic e2e testing (not hooked up to nightly)

* get rid of warnings

* add all home page tests.

* settings and assistants page tests

* Rebase e2e testing branch (#164)

* test(session): update compaction config defaults and fix integration test skipping

- Enable compaction by default in CompactionConfig
- Increase protected_turns default from 2 to 3
- Add pytest marker to skip integration tests when AGENTCORE_MEMORY_ID is not set
- Fix import path for get_metadata_storage in cache savings tests from metadata_storage.get_metadata_storage to storage.get_metadata_storage
- Ensures integration tests only run in appropriate environments with required AWS credentials

* test(session): enhance session manager fixture with initialize() mock and cleanup

- Mock AgentCoreMemorySessionManager.initialize() to simulate SDK behavior
- Add _mock_sdk_initialize shim that loads messages and validates agent uniqueness
- Track active patches in fixture scope for proper cleanup on teardown
- Update fixture docstring to document initialize() mocking and message control
- Convert fixture to generator with yield to enable patch cleanup
- Allow tests to control loaded messages via mgr.read_agent and mgr.list_messages

* Release 1.0.0-beta.22: Cognito-native auth, CORS unification, RBAC consolidation, Trivy supply chain fix (#137)

⚠️ BREAKING CHANGE: Authentication replaced with AWS Cognito.
The legacy generic OIDC implementation has been removed with no
backward compatibility layer. Existing deployments must re-bootstrap.

Cognito First-Boot Authentication:
- Cognito User Pool, App Client, and Domain provisioned in Infrastructure stack
- CognitoJWTValidator replaces GenericOIDCJWTValidator
- New system/ module for first-boot setup, Cognito user/group management
- New cognito_idp_service for federated identity provider CRUD via Cognito IdP APIs
- First-boot page with admin account creation (race-condition-safe DynamoDB writes)
- Frontend auth flow rewritten for Cognito OAuth 2.0 + PKCE
- Runtime-provisioner and runtime-updater Lambda functions removed (2,800+ lines)
- Backend OIDC service, token exchange, and discovery endpoints removed (1,318 lines)
- 2,057 lines of new Cognito test coverage (IdP service, JWT validator, first-boot, system)

RBAC Consolidation:
- Single require_app_roles dependency replaces 6 role-checking functions/decorators
- User roles enriched from stored DynamoDB profile during token processing
- Profile cache invalidation on sync for immediate role updates
- JSON array parsing for custom:roles claim (Entra ID compatibility)
- jwt_role_mappings updates allowed on system_admin role

CORS Unification:
- buildCorsOrigins() shared helper across all 6 CDK stacks
- S3 CORS made conditional, ExposedHeaders→ExposeHeaders fix
- Python APIs read CORS_ORIGINS env var (replaces allow_origins=['*'])

Security:
- Trivy action upgraded v0.28.0→v0.35.0 — old SHA was compromised in
  March 2026 supply chain attack (GHSA-69fq-xp46-6x23)

CI/CD:
- CDK_DOMAIN_NAME and CDK_CORS_ORIGINS added to all workflow jobs
- App API synth-cdk actually skipped on PRs (guard was missing despite beta.20 docs)
- SSM StringParameter creation guarded against empty values

Bootstrap:
- seed_bootstrap_data.py sole owner of RBAC role seeding (removed from app startup)
- system_admin role seeded with jwt_role_mappings=['system_admin']
- Additive JWT mapping seeding for existing deployments

Documentation:
- 54,665 lines of outdated specs and AI artifacts purged (121 files)

Dependencies:
- Python: fastapi 0.135.3, uvicorn 0.44.0, boto3 1.42.83, strands-agents 1.34.1,
  bedrock-agentcore 1.6.0, google-genai 1.70.0, ruff 0.15.9, mypy 1.20.0
- Frontend: Angular 21.2.7, katex 0.16.45, mermaid 11.14.0, Analog.js alpha.26
- Infrastructure: aws-cdk-lib 2.248.0, aws-cdk 2.1117.0, ts-jest 29.4.9

* refactor: centralize env vars and magic strings into config/constants.py (#139)

Create agents/main_agent/config/constants.py with EnvVars, Defaults, and
Prefixes classes. Update all 13 modules to import from the centralized
constants instead of using inline os.getenv() with hardcoded strings.

This eliminates scattered magic strings and provides a single reference
for all configuration. Zero behavior change — all values are identical.

543/543 tests passing.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: extract BaseAgent ABC and ChatAgent from MainAgent (#140)

* refactor: centralize env vars and magic strings into config/constants.py

Create agents/main_agent/config/constants.py with EnvVars, Defaults, and
Prefixes classes. Update all 13 modules to import from the centralized
constants instead of using inline os.getenv() with hardcoded strings.

This eliminates scattered magic strings and provides a single reference
for all configuration. Zero behavior change — all values are identical.

543/543 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: extract BaseAgent ABC and ChatAgent from MainAgent

Split MainAgent into a three-tier hierarchy:
- BaseAgent (ABC): shared init for model config, tools, session, streaming
- ChatAgent(BaseAgent): Strands Agent creation and text streaming
- MainAgent(ChatAgent): backward-compatible alias (pass-through)

All existing callers continue to import and use MainAgent unchanged.
The _build_filtered_tools() helper is extracted from _create_agent() for
reuse by future agent types (SkillAgent, VoiceAgent).

543/543 tests passing — zero behavior change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add agent type registry and create_agent() factory (#141)

Introduce agent_types.py with a pluggable registry pattern:
- create_agent(agent_type, **kwargs) → BaseAgent subclass
- register_agent_type(name, cls) for dynamic registration
- ChatAgent registered as "chat" by default

Future agent types (skill, voice) will register themselves here.
Existing code is unchanged — MainAgent still works as before.

552/552 tests passing (9 new factory tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add progressive skill disclosure system with SkillAgent (#142)

Implement three-level skill architecture adapted from sample-strands-agent:
- Level 1: Lightweight skill catalog injected into system prompt
- Level 2: SKILL.md instructions loaded on-demand via skill_dispatcher
- Level 3: Tool execution via skill_executor

New modules:
- skills/skill_registry.py: Discovers SKILL.md files, binds tools, serves catalog
- skills/skill_tools.py: skill_dispatcher + skill_executor Strands @tool functions
- skills/decorators.py: @Skill() decorator and register_skill() for tool tagging
- skill_agent.py: SkillAgent(ChatAgent) with progressive disclosure override
- skills/definitions/web-search/SKILL.md: Example skill definition

SkillAgent registered as "skill" in agent_types factory.
Existing behavior completely unchanged — SkillAgent is additive only.

590/590 tests passing (38 new skill tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add VoiceAgent with BidiAgent for speech-to-speech interaction (#143)

Implement VoiceAgent(BaseAgent) for bidirectional voice using Nova Sonic 2:
- BidiNovaSonicModel with configurable voice, sample rate, and model
- Voice-text continuity via _load_text_history() from text session
- Separate agent_id ("voice") to prevent session state conflicts
- Voice-optimized system prompt with conversational guidelines
- PyAudio mock for server-side (browser uses Web Audio API)
- Conditional registration — only available with strands-agents[bidi]

Add voice-related constants to config/constants.py (EnvVars + Defaults).
Register "voice" type in agent_types factory.

606/606 tests passing (16 new voice tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add approval hooks for gating dangerous tool operations (#144)

Implement three approval hook categories following the sample-strands-agent
pattern, all using Strands BeforeToolCallEvent:

- EmailApprovalHook: Gates send_email, delete_emails, forward_email, etc.
- ExternalWriteApprovalHook: Gates create_pull_request, deploy, push_code, etc.
- DangerousToolApprovalHook: Gates delete_file, drop_table, execute_sql, etc.

Hooks set _approval_required/_approval_message on the tool_use dict for
the streaming layer to surface to the client for user confirmation.

All hooks registered in BaseAgent._create_hooks() — inherited by all
agent types (ChatAgent, SkillAgent, VoiceAgent).

618/618 tests passing (12 new approval hook tests).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add WebSocket voice route and bidi dependency for VoiceAgent (#145)

* feat: add bidi dependency, WebSocket voice route, and test client

Wire up the VoiceAgent for end-to-end testing:

- Add strands-agents[bidi] optional dependency group to pyproject.toml
- Fix BidiAgent/BidiNovaSonicModel import paths (strands.experimental.bidi)
- Create voice_routes.py with WebSocket endpoint at /voice/stream
  - JWT auth from query params (trusted decode, same as invocations)
  - Bidirectional protocol: audio/text input, agent event streaming
  - Debug endpoints: GET /voice/sessions, DELETE /voice/sessions/{id}
- Register voice router in inference API main.py
- Add test_voice_client.py script for manual WebSocket testing

632/632 tests passing (14 new voice route tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: handle CancelledError in VoiceAgent.stop() during teardown

The BidiAgent's Nova Sonic stream teardown can raise CancelledError
when pending AWS SDK futures are cancelled during shutdown. This is
expected behavior, not an error.

- VoiceAgent.stop(): catch CancelledError and Exception from BidiAgent
- voice_routes.py finally block: catch BaseException (CancelledError
  is a BaseException in Python 3.12, escaping except Exception)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: pass session_id and agent_id to list_messages in voice history

AgentCoreMemorySessionManager.list_messages() requires session_id and
agent_id positional args. Pass session_id=self.session_id and
agent_id="default" to read the text chat agent's history for
voice-text continuity. Use the SDK's limit param instead of
post-slicing.

Update tests to verify the correct call signature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use BidiAgent.receive() for voice event streaming

BidiAgent uses receive() as its event source, not stream_async().
Audio/text input is sent via send_audio()/send_text() separately,
and receive() yields typed events (BidiAudioStreamEvent,
BidiTranscriptStreamEvent, etc.) asynchronously.

- VoiceAgent.stream_async(): iterate BidiAgent.receive(), yield
  event.as_dict() for JSON-serializable dicts
- voice_routes._send_to_client(): simplified to handle dicts directly
  since stream_async now yields dicts, not strings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Angular voice components for Nova Sonic bidirectional audio

Frontend voice support with three-layer architecture:

New services (frontend/ai.client/src/app/session/services/voice/):
- pcm-utils.ts: Pure PCM encoding/decoding (Float32↔Int16↔base64)
- AudioRecorderService: Mic capture via Web Audio API → 16kHz PCM chunks
- AudioPlayerService: Gapless base64 PCM playback with interruption support
- VoiceChatService: WebSocket orchestration + state machine
  (idle → connecting → listening → speaking)

Modified components:
- chat-input: Voice toggle button with animated state indicators
  (pulsing red = listening, bouncing green = speaking, spinner = connecting)
- chat-input template: Live transcript overlay during voice mode
- session.page.ts: Wire voice response completions to message list
- MessageMapService: addVoiceMessage() for finalized voice transcripts

TypeScript compiles cleanly (tsc --noEmit). Angular build requires
Node 20.19+ (current machine has 20.18.1).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: convert SessionMessage to dict for BidiAgent and fix TS2774

Backend: _load_text_history() now calls .to_dict() on SessionMessage
objects before passing to BidiAgent. Nova Sonic expects plain dicts
with {"role": "...", "content": [...]}, not SessionMessage objects.

Frontend: Fix TS2774 in AudioRecorderService — use typeof check
instead of truthiness check for getUserMedia function detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use to_message() instead of to_dict() for BidiAgent history

SessionMessage.to_dict() wraps the message in metadata:
  {"message": {"role": ..., "content": [...]}, "message_id": 0, ...}

SessionMessage.to_message() returns the plain message dict:
  {"role": "user", "content": [...]}

Nova Sonic's _get_message_history_events expects the plain format.
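
To make the difference concrete, here is a sketch with a hypothetical stand-in class. It only mirrors the two shapes quoted above and is not the SDK's SessionMessage:

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class StandInSessionMessage:
    """Hypothetical stand-in that reproduces the two shapes described above."""

    message: Dict[str, Any]
    message_id: int = 0

    def to_dict(self) -> Dict[str, Any]:
        # Metadata wrapper: the plain message is nested under "message".
        return {"message": self.message, "message_id": self.message_id}

    def to_message(self) -> Dict[str, Any]:
        # Plain message dict: "role" and "content" sit at the top level.
        return self.message


stored = StandInSessionMessage({"role": "user", "content": [{"text": "hi"}]})
history = [stored.to_message()]
```

A consumer that reads `entry["role"]` works on `history` but would raise KeyError on the `to_dict()` form.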

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use BidiAgent.send() and receive() APIs correctly

BidiAgent has send(dict) and receive() — not send_audio()/send_text()
or stream_async(). Align VoiceAgent methods with the actual SDK:

- send_audio(): calls self._bidi_agent.send({"type": "bidi_audio_input", ...})
- send_text(): calls self._bidi_agent.send({"type": "bidi_text_input", ...})
- receive_events(): wraps self._bidi_agent.receive() with as_dict() conversion
- stream_async(): now a no-op stub (voice uses receive_events() instead)

Update voice_routes._send_to_client to call receive_events() not stream_async().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Implement feature X to enhance user experience and optimize performance

* feat: add voice overlay component for voice interactions

- Implemented VoiceOverlayComponent with HTML, CSS, and TypeScript files.
- Added styles for visualizer orb and status badges using Tailwind CSS.
- Integrated voice status management and session handling in the component.
- Enhanced voice chat service to support transcript entries and reveal logic.
- Updated session page to handle voice overlay closure and persist transcripts as messages.
- Introduced configuration constants for voice processing parameters.

* feat: enhance voice agent with real-time cost calculation and metadata handling

* fix: refine token usage handling and improve message processing in voice components

* fix: sanitize user-provided values in log statements to prevent log injection

Addresses CodeQL alert #567 (py/log-injection). All user-provided values
(session_id, user_id, msg_type, enabled_tools) are now passed through
_sanitize_log() which strips newline and carriage return characters before
being interpolated into log messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: WebSocket voice streaming with AgentCore auth support (#155)

* feat: update WebSocket voice streaming endpoint for AgentCore compatibility

* fix: ensure config message is required for WebSocket voice stream authentication

* feat: WebSocket voice streaming with AgentCore auth and protocol config (#156)

* feat: update WebSocket voice streaming endpoint for AgentCore compatibility

* fix: ensure config message is required for WebSocket voice stream authentication

* feat: add protocol configuration for HTTP support in InferenceApiStack

* fix: include bidi dependency in uv sync commands for Inference API Dockerfile (#157)

* fix: improve AgentCore connection detection in voice stream handling (#158)

* fix: align voice WebSocket with reference architecture accept-first pattern (#159)

* fix: align voice WebSocket with reference architecture accept-first pattern

Rewrites voice_stream to match the sample-strands-agent-with-agentcore
reference architecture:

- Accept WebSocket immediately (AgentCore validates auth at proxy layer)
- Extract params via helper functions: custom header → query param → config message
- Config message always read to supplement missing params in cloud mode
- /voice/stream as main route, /ws as alias for AgentCore Runtime
- Frontend uses /voice/stream for local dev, /ws for AgentCore
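
The lookup order above (custom header, then query param, then config message) could be sketched as follows. The `x-voice-` header prefix and the `resolve_param` name are assumptions for illustration; the actual helper functions and header names are not shown in this commit:

```python
from typing import Mapping, Optional


def resolve_param(
    name: str,
    headers: Mapping[str, str],
    query: Mapping[str, str],
    config: Mapping[str, str],
    header_prefix: str = "x-voice-",  # hypothetical prefix
) -> Optional[str]:
    """Return the first non-empty value: header, then query, then config."""
    candidates = (
        (headers, header_prefix + name),
        (query, name),
        (config, name),
    )
    for source, key in candidates:
        value = source.get(key)
        if value:
            return value
    return None
```

Reading the config message last means it only fills gaps, which matches the "supplement missing params" wording above.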

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add missing try block in voice_stream causing IndentationError

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add Voice Mode to Key Features in README (#160)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: colinmxs <colinmxs@users.noreply.github.com>
Co-authored-by: Colin Smith <7762103+colinmxs@users.noreply.github.com>
Co-authored-by: Phil Merrell <philmerrell@boisestate.edu>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* small testing fix

* add e2e to nightly process

* fix tests / warnings

---------

Co-authored-by: Oscar Filson <OSCARFILSON@boisestate.edu>
Co-authored-by: colinmxs <colinmxs@users.noreply.github.com>
Co-authored-by: Colin Smith <7762103+colinmxs@users.noreply.github.com>
Co-authored-by: Phil Merrell <philmerrell@boisestate.edu>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(connectors): add AgentCore Identity wrapper and Runtime context middleware

First phase of the Connectors refactor, which will eventually replace the
bespoke OAuth token store (OAuthTokenRepository, KMS-encrypted DynamoDB,
Secrets Manager client credentials, manual refresh) with AgentCore Identity's
managed token vault and credential providers.

- AgentCoreContextMiddleware copies the four Runtime headers
  (WorkloadAccessToken, OAuth2CallbackUrl, session ID, request ID) into
  BedrockAgentCoreContext on every invocation. Required because the Inference
  API is a plain FastAPI app rather than BedrockAgentCoreApp, so the SDK does
  not populate the context for us. No-op when headers are absent, so local
  development and unit tests continue to work without mocks.

- AgentCoreIdentityClient wraps IdentityClient.get_token() with a narrower,
  platform-friendly surface for USER_FEDERATION (3LO) flows. Surfaces the
  "user consent required" case as a structured TokenResult(authorization_url=...)
  rather than an exception, so it can flow through the existing SSE stream as
  a new event type in a later phase.
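
A sketch of the result-instead-of-exception idea. Only `TokenResult` and `authorization_url` are named above; the `access_token` field, `requires_consent`, and `describe` are assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenResult:
    """Either a usable token or a consent URL (field set is assumed)."""

    access_token: Optional[str] = None
    authorization_url: Optional[str] = None

    @property
    def requires_consent(self) -> bool:
        return self.access_token is None and self.authorization_url is not None


def describe(result: TokenResult) -> str:
    """Map a result to the action a caller would take."""
    if result.requires_consent:
        return "oauth_required"
    if result.access_token is not None:
        return "ready"
    return "unavailable"
```

Because consent is an ordinary return value, a caller can branch on it and emit a stream event without wrapping the call in try/except.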

Both modules are pure additions; no existing code path calls them yet.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(connectors): route external MCP OAuth through AgentCore Identity

Wires the Runtime context middleware into the Inference API and swaps the
external MCP client's token source from the bespoke OAuthService to
AgentCore Identity's USER_FEDERATION flow.

- main.py: installs AgentCoreContextMiddleware so WorkloadAccessToken and
  OAuth2CallbackUrl Runtime headers populate BedrockAgentCoreContext on every
  invocation.

- external_mcp_client.py: _get_oauth_token now returns a TokenResult from
  AgentCoreIdentityClient instead of a decrypted token string from
  OAuthService. Scopes are read from the platform's OAuth provider record so
  organizations can change them without code. When the SDK signals that user
  consent is required, the authorization URL is stashed per-user for the
  inference route to surface via an oauth_required SSE event (emitter to
  follow in a subsequent commit). load_external_tools skips client creation
  on consent-required rather than creating a client that would fail at the
  first request.

- Convention: the platform's provider_id is used verbatim as the AgentCore
  Identity credential-provider name. Admins register matching names via
  CreateOauth2CredentialProvider during provider setup.

The OAuthService, token vault, and encryption layer are still referenced by
unrelated code paths (admin routes, connections UI) and will be removed in
Phase 3 once the AgentCore-backed flow is validated end-to-end.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(frontend): rename connections to connectors

Rebrand the user-facing OAuth UI from "connections" to "connectors" for
consistent vernacular across the product. Folders, classes, types, and
route paths all follow the new name; the /settings/connections URL
redirects to /settings/connectors. The backend /oauth/connections
endpoint is preserved as a stable contract and translated at the
service layer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(connectors): add AgentCore credential-provider registrar service

Wraps bedrock-agentcore-control for admin-side OAuth2 credential provider
CRUD: create/update/delete/get with vendor mapping (Google/Microsoft/GitHub
to their native vendors; Canvas/Custom routed through CustomOauth2 via an
OIDC discovery URL or explicit authorization-server metadata). Domain
errors map 404/conflict/invalid-custom to typed exceptions so route
handlers can translate cleanly.

Update is intentionally non-partial: AgentCore's UpdateOauth2CredentialProvider
requires a full oauth2ProviderConfigInput and Get never returns the stored
client_secret, so credential rotation always re-submits both clientId and
clientSecret.

17 unit tests cover every vendor path, error mapping, and the Custom-only
discovery rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(connectors): grant IAM for credential-provider admin ops

Adds Create/Update/Delete/Get/List on bedrock-agentcore OAuth2 credential
providers to the app-api task role, scoped to the default token vault.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(connectors): retire in-house OAuth flow

Deletes the legacy 3LO dance that predates AgentCore Identity — the
per-user token vault, PKCE-based authorization service, encryption layer,
token cache, user-facing /oauth/* routes, and the tool-side OAuthToolService.
AgentCore Identity owns the token vault and consent flow now; the inference
path already routes through agentcore_identity.py via the recent external
MCP client refactor, so these modules had no live consumers.

Also slims shared/oauth/__init__.py to the surviving surface (provider model,
repository, registrar) and unwires the user-facing router from app_api/main.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(connectors): slim OAuth provider model to AgentCore shape

AgentCore Identity owns the clientId, clientSecret, endpoint config, and
callback URL. Our DynamoDB record keeps only the admin metadata (display
name, scopes, role gates, icon) plus cached pointers to AgentCore's record
(credential_provider_arn, callback_url) for convenience.

Drops authorization_endpoint, token_endpoint, authorization_params,
userinfo_endpoint, revocation_endpoint, pkce_required, OAuthUserToken, and
the user-side connection DTOs — all artifacts of the retired in-house flow.
Adds oauth_discovery_url and authorization_server_metadata for Custom/Canvas
providers, gated by a pydantic validator.

Repository surface tightens to put_provider + apply_metadata_update; the
Secrets Manager write/read path is gone. Admin routes (commit next) own
the AgentCore round-trip and hand a fully-formed record to the repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(connectors): route admin OAuth CRUD through AgentCore Identity

POST now calls the registrar first and, on success, upserts the metadata
record in DynamoDB. If the DB write fails after AgentCore has accepted
the credentials, we best-effort delete the AgentCore provider to avoid
orphans.
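
The ordering described above, sketched with in-memory fakes. `FakeRegistrar`, `create_provider`, and the `save_metadata` callback are hypothetical; the real route talks to AgentCore and DynamoDB:

```python
class FakeRegistrar:
    """In-memory stand-in for the credential-provider registrar."""

    def __init__(self):
        self.providers = set()

    def create(self, name):
        self.providers.add(name)

    def delete(self, name):
        self.providers.discard(name)


def create_provider(registrar, save_metadata, name, metadata):
    """Register credentials first, then persist metadata; undo on failure."""
    registrar.create(name)
    try:
        save_metadata(name, metadata)
    except Exception:
        try:
            # Best effort: avoid leaving a provider with no local record.
            registrar.delete(name)
        except Exception:
            pass  # a failed cleanup must not hide the original error
        raise
```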

PATCH distinguishes metadata-only edits (scopes, roles, display name,
icon, enabled) from credential rotation. Rotation requires clientId +
clientSecret together — partial updates are rejected by AgentCore's
UpdateOauth2CredentialProvider contract.

DELETE removes the AgentCore provider first (which revokes every user
token stored in its vault), then the local record. Pre-existing connection-
count checks are dropped since per-user tokens no longer live in our DB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(connectors): rewire frontend for AgentCore flow

Admin side:
- Rename admin/oauth-providers → admin/connectors (file + route); old
  route path redirects for URL stability
- Rewrite the admin model to the AgentCore-owned shape: drop endpoint
  fields, authorization_params, pkce_required, userinfo/revocation
  endpoints. Add credential_provider_arn, callback_url, and
  oauth_discovery_url / authorization_server_metadata for Custom vendors
- Rewrite the admin form: preset picker simplified to display metadata
  only, Custom requires an OIDC discovery URL, credential rotation
  requires clientId + clientSecret together (AgentCore's update API is
  not partial), success screen after create displays the AgentCore
  callback URL with a copy button so the admin can paste it into the
  vendor console, edit mode shows the callback URL + ARN read-only

User-facing retirement:
- Delete settings/connectors (user "my connected accounts" page),
  settings/oauth-callback (legacy 3LO return handler), and the sidebar
  + route entries for them. AgentCore Identity owns the consent flow
  at runtime via the existing /oauth-complete landing page

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: gitignore .claude/scheduled_tasks.lock

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(connectors): emit oauth_required events + runtime consent UI

When an external MCP tool needs OAuth consent, AgentCore Identity returns
an authorization URL instead of a token. This wires that signal all the
way to the user:

Backend:
- Inference route drains pending consent URLs from the external MCP
  integration after the agent stream finishes and emits one
  oauth_required SSE event per provider before done
- IAM grants bedrock-agentcore:GetResourceOauth2Token on the runtime role
  so the AgentCore Identity client can reach the token vault
- CLAUDE.MD + SSE_ERROR_MESSAGING.md document the new event

Frontend:
- Stream parser recognizes oauth_required and surfaces it as an
  OAuthRequiredEvent
- New /oauth-complete landing page handles the AgentCore callback
  redirect and postMessages consent completion to the opener tab
- OAuthConsentService orchestrates popup opening + postMessage receipt
- OAuthConsentBanner renders the Connect button inside the chat input
- chat-http and assistant preview pass OAuth2CallbackUrl header so
  AgentCore Runtime knows where to return after consent

Also updates the admin Tool form reference from /admin/oauth-providers
to /admin/connectors to match the renamed admin surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(connectors): user-facing settings page + AgentCore consent finalizer

Adds the Settings → Connectors page so users can browse and connect
OAuth-backed external tools end-to-end:

- New /connectors routers on app-api (list user-visible providers via
  RBAC) and inference-api (initiate-consent, complete-consent) — the
  inference-api side runs under the AgentCore Runtime proxy where the
  WorkloadAccessToken context is populated.
- AgentCoreIdentityClient gains a workload-token mint fallback for local
  dev (GetWorkloadAccessTokenForUserId) and appends provider_id to the
  callback URL so the landing page can dismiss the right banner.
- /oauth-complete page POSTs CompleteResourceTokenAuth back through the
  inference-api before notifying the opener, fixing the "consent
  finished but vault stayed empty" race. Uses BroadcastChannel to
  bridge popup → opener under Chrome's COOP isolation.
- New connectors settings page with a Connect / Reconnect affordance
  per provider, wired to the OAuthConsentService popup flow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(connectors): switch oauth gating from pre-flight to mid-turn interrupts

The agent used to pre-flight OAuth at tool-load time and abort the whole
turn if any provider needed consent — the user then had to retype the
prompt after authorizing. This switches to the Strands interrupt
protocol: the consent gate runs lazily before each tool call, pauses
the in-flight turn, and resumes it automatically once the user
finishes the popup.

Backend
- New OAuthConsentHook (BeforeToolCallEvent + AfterToolCallEvent).
  - BeforeToolCall: looks up the OAuth provider for the selected
    MCPAgentTool's MCPClient (no name coupling), checks the in-process
    token cache, and either lets the tool run or calls
    event.interrupt(...) with the consent URL when AgentCore Identity
    reports consent required.
  - AfterToolCall: detects 401-style failures from MCP tool results,
    marks the (user, provider) for force_authentication on the next
    fetch, and sets event.retry = True so the BeforeToolCall hook
    re-fires and triggers a fresh consent. Closes the gap where a
    provider-side revocation leaves a stale token in AgentCore's vault.
- New oauth_token_cache: per-(user, provider) tokens + force-reauth
  flags; lifecycle-managed by the hook.
- ExternalMCPIntegration always loads MCP clients with a lazy
  token_provider that reads from the cache; the pending_consent /
  drain_pending_consent dict and the route's pre-LLM short-circuit
  branch are gone.
- StreamCoordinator emits one oauth_required SSE event per pending
  interrupt before the final done event, carrying interruptId so the
  frontend can resume the same turn.
- ChatAgent.stream_async accepts interrupt_responses and forwards them
  to Strands as the resume prompt; route accepts the same on
  /invocations and skips quota + RAG augmentation on resume.
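
A sketch of a per-(user, provider) cache with force-reauth flags, as a guess at the shape of `oauth_token_cache`. The class and method names are assumptions, and locking and expiry are left out:

```python
from typing import Dict, Optional, Set, Tuple

Key = Tuple[str, str]  # (user_id, provider_id)


class OAuthTokenCache:
    """In-process token cache with a per-key force-reauth flag."""

    def __init__(self) -> None:
        self._tokens: Dict[Key, str] = {}
        self._force_reauth: Set[Key] = set()

    def put(self, user_id: str, provider_id: str, token: str) -> None:
        key = (user_id, provider_id)
        self._tokens[key] = token
        # A fresh token means consent just succeeded, so clear the flag.
        self._force_reauth.discard(key)

    def get(self, user_id: str, provider_id: str) -> Optional[str]:
        key = (user_id, provider_id)
        if key in self._force_reauth:
            return None
        return self._tokens.get(key)

    def mark_force_reauth(self, user_id: str, provider_id: str) -> None:
        key = (user_id, provider_id)
        # Drop the stale token and remember to force consent next time.
        self._tokens.pop(key, None)
        self._force_reauth.add(key)

    def needs_force_reauth(self, user_id: str, provider_id: str) -> bool:
        return (user_id, provider_id) in self._force_reauth
```

In this sketch an AfterToolCall-style handler would call `mark_force_reauth` on a 401, and the next BeforeToolCall would see `get` return None and start a new consent.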

Frontend
- OAuthRequiredEvent type + validator gain interruptId; settings-page
  consent path makes interruptId optional (no agent turn to resume).
- OAuthConsentService tracks the interruptId per request and invokes a
  registered resume handler on broadcast success.
- ChatRequestService snapshots the last turn's payload and replays it
  with interrupt_responses attached when a consent completes — the
  user never retypes the prompt.

Smoke-tested end-to-end: Google revoke → whoami → 401 → AfterToolCall
detects + retries → fresh consent banner → popup → auto-resume → tool
returns greeting in the same turn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(connectors): bind complete_consent to initiating user + tighten auth-failure regex

Hardens two gaps called out in review of the AgentCore OAuth flow.

- `/connectors/complete-consent` now verifies the submitted `session_uri`
  was issued to the authenticated user at `initiate_consent`, rejecting
  cross-user replay with 403 before ever calling AgentCore. Backed by a
  thread-safe TTL cache (10 min, single-use). Soft-fails with a warning
  when AgentCore's authorize URL doesn't carry a recognised session
  parameter, so an SDK shape change logs rather than blocks.
- `_AUTH_FAILURE_PATTERN` tightened with word boundaries on every clause
  and a non-path guard on `401` so tool errors containing `/v1/401/...`
  no longer trigger a spurious force-reauth.
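
An illustrative pattern showing the non-path guard on `401`. This is not the real `_AUTH_FAILURE_PATTERN`; the second clause and both names are assumptions:

```python
import re

_AUTH_FAILURE_SKETCH = re.compile(
    # A bare 401: not preceded or followed by "/" or a word character,
    # so URL path segments and longer numbers are ignored.
    r"(?<![/\w])401(?![/\w])"
    # Hypothetical second clause with word boundaries on both sides.
    r"|\btoken expired\b",
    re.IGNORECASE,
)


def looks_like_auth_failure(text: str) -> bool:
    return bool(_AUTH_FAILURE_SKETCH.search(text))
```

A plain `\b401\b` would not be enough here, because `/` is a non-word character and `/v1/401/` therefore still has word boundaries around the digits.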

Also moves `import boto3`/`os` out of the `complete_consent` handler
body and caches the control-plane client via `lru_cache`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(connectors): type-assert AgentCore responses + harden create rollback

Addresses the remaining two critical items from PR #174 review.

Registrar response parsing (`_info_from_response`): fails loudly on
contract violations rather than silently storing empty strings. Missing
`clientSecretArn` still tolerated (some vendors won't persist one) but
a wrong-shape `clientSecretArn` or absent `credentialProviderArn` now
raises TypeError so an AgentCore API change surfaces as a real error.

Admin create-provider rollback (`_rollback_orphaned_provider`): now
retries the AgentCore delete twice with backoff before giving up.
On exhaustion, emits a CloudWatch `Agentcore/OAuth::ProviderOrphaned`
custom metric so ops can alarm on stranded credential providers.
Secondary failures (CW down, registrar down after retries) never
shadow the admin's original 5xx — they only log. The subsequent
create attempt that hits `CredentialProviderConflictError` with no
DB record now returns an actionable 409 pointing at the AWS CLI
cleanup command instead of a bare "already exists".
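
The retry-then-alarm behaviour could look roughly like this. The sketch reads "retries twice" as one initial attempt plus two retries; the delays, the function name, and the `on_exhausted` callback (standing in for the CloudWatch metric) are assumptions:

```python
import time
from typing import Callable


def rollback_with_retry(
    delete: Callable[[], None],
    on_exhausted: Callable[[], None],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try delete() up to `attempts` times with exponential backoff."""
    for attempt in range(attempts):
        try:
            delete()
            return True
        except Exception:
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    try:
        # e.g. emit the orphaned-provider metric
        on_exhausted()
    except Exception:
        pass  # secondary failures are swallowed, never raised to the caller
    return False
```

Injecting `sleep` keeps the helper testable without real delays.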

App API task role grants `cloudwatch:PutMetricData` scoped to the
`Agentcore/OAuth` namespace via a condition key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(connectors): harden oauth consent flow per code review

- Reject non-https authorizationUrls at both intake and open time so a
  compromised backend can't smuggle javascript:/data: URIs into a user
  click.
- Replace window.location.href hijack on popup-block with a blocked
  signal; the banner renders an "Open in new tab" anchor instead of
  tearing down the chat tab.
- Reject resume requests whose interruptIds aren't present in the cached
  agent's _interrupt_state with 400, preventing silent acceptance after
  cache eviction, process restart, or forged payloads.
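
The scheme check in the first bullet, sketched in Python with the standard library. The frontend check itself is TypeScript and is not shown here; `is_safe_authorization_url` is a hypothetical name:

```python
from urllib.parse import urlsplit


def is_safe_authorization_url(url: str) -> bool:
    """Accept only absolute https URLs that name a host."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. bad IPv6 literals.
        return False
    return parts.scheme == "https" and bool(parts.netloc)
```

Allow-listing one scheme is simpler and safer than trying to enumerate every dangerous one.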

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(connectors): drop provider_id from MCP tool load log

CodeQL flagged the provider_id interpolation as clear-text logging of
sensitive data — its taint analysis traces provider_id back through the
OAuth credential path. The provider ID itself isn't secret, but the log
line doesn't need it: tool_id already identifies the tool, and
"(OAuth)" alone confirms auth was wired up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(connectors): remove obsolete pre-flight oauth + required-message tests

Both tests codify behavior that commit b55653d intentionally retired:

- TestInvocationsOAuthRequired exercised drain_pending_consent and the
  route-level oauth_required emission path. That path is gone — consent
  URLs now flow through Strands' _interrupt_state inside
  agent.stream_async (stream_coordinator.py:543), and the hook behavior
  is covered by tests/agents/main_agent/session/test_oauth_consent_hook.py.
- test_missing_message_returns_422 expected message to be required, but
  InvocationRequest.message is now default "" so resume requests can
  reuse the original prompt from interrupt context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(connectors): implement tool-config freshness cache and update related logic

* fix(connectors): resolve OAuth re-auth loop in local dev + tighten 401 detection

Fixes the constant Google re-auth bug: the consent hook was calling
AgentCore Identity with `callback_url=None` whenever the inference API
ran outside the Runtime proxy (every local-dev session). AgentCore then
issued an authorize URL whose redirect went somewhere other than
`/oauth-complete`, so consent never finalized and every request looped
back through the consent flow.

Adds a `CallbackUrlUnavailableError` and an `AGENTCORE_LOCAL_OAUTH_CALLBACK_URL`
env-var fallback in `_resolve_callback_url`, so the failure mode is now
loud instead of silent. Both the chat-triggered consent hook and the
settings-page `initiate-consent` route catch it and return 503 with
actionable guidance.
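
A sketch of the fallback order. The exception name and env var come from this commit; the function body, its signature, and the error text are assumptions about how `_resolve_callback_url` might behave:

```python
import os
from typing import Mapping, Optional


class CallbackUrlUnavailableError(RuntimeError):
    """Raised when no OAuth callback URL can be determined."""


def resolve_callback_url(
    runtime_url: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Prefer the Runtime-provided URL, then the local-dev env var."""
    if runtime_url:
        return runtime_url
    fallback = env.get("AGENTCORE_LOCAL_OAUTH_CALLBACK_URL", "").strip()
    if fallback:
        return fallback
    raise CallbackUrlUnavailableError(
        "No OAuth callback URL available; set "
        "AGENTCORE_LOCAL_OAUTH_CALLBACK_URL for local development."
    )
```

A route handler can catch the exception and translate it to the 503 described above instead of passing `callback_url=None` downstream.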

Also tightens the OAuth 401 detection regex to reduce false-positive
re-auth prompts: `\bunauthorized\b` now requires proximity to an
HTTP/status/code keyword (previously matched prose like "unauthorized
to view this calendar"), and adds high-confidence signals for OAuth
`invalid_grant` (refresh-token revocation) and Google's `UNAUTHENTICATED`
status / `invalid authentication credentials` message.

Drops the in-process `session_cache` defence-in-depth on
`complete-consent`: AgentCore's own `userIdentifier` ↔ `sessionUri`
binding already rejects mismatched completions, and the local cache
cost real operational pain (multi-worker / restart / `--reload` would
break legitimate consent flows with a confusing 403). Trust the
JWT-derived `current_user` plus AgentCore's binding instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(connectors): user disconnect, status endpoint, and OAuth UX polish

Several user-facing connector improvements that share a foundation
(per-user `force_reauth` lifecycle in the in-process token cache):

- New `GET /connectors/{id}/status`: side-effect-free read that the
  settings page uses to render a "Connected" badge without committing
  the user to a consent flow (initiate-consent always triggers a
  server-side pending session). Honors the `force_reauth` flag — a
  just-disconnected user is reported as not connected even if the vault
  still holds an unexpired token.

- New `DELETE /connectors/{id}/connection`: best-effort disconnect that
  flips the local `force_reauth` flag (AgentCore exposes no per-user
  vault-delete API). The next status check returns `connected: false`,
  the next initiate-consent passes `force_authentication=True`, and the
  user re-authorizes from scratch. complete-consent clears the flag on
  success so the UI flips back to connected without waiting on the agent
  loop to warm the cache.

- Frontend Disconnect button on connected rows. Confirmation dialog uses
  the existing `ConfirmationDialogComponent` (CDK Dialog, destructive
  styling) — also swapped the admin connector-list delete from native
  `confirm()` to the same component for visual consistency.

- Closed-popup recovery in `OAuthConsentService`: poll `popup.closed`
  after open and drop the provider from `inFlight` if the user dismisses
  without completing consent. The pending request stays so the chat
  banner re-offers Connect; the settings page resets `awaiting` →
  `idle` via the new `inFlightProviders` signal.

- Settings page: loading skeleton in the row's action area while the
  status probe resolves, dropped the misleading "Reconnect" button
  (clicking it just hit `initiate-consent` and toasted "already
  connected"), and removed the scope-list display under each connector.

- Forward Google's `access_type=offline` (per AgentCore Identity docs)
  via a new vendor-baseline helper, plumbed through both the
  chat-triggered consent hook and the settings/initiate-consent /
  status routes via two new optional lookups on `OAuthConsentHook`
  (`provider_type_lookup`, `custom_parameters_lookup`). Without this
  Google issues a 1-hour access token with no refresh path and the
  vault entry becomes unrefreshable.

- Admin-configurable `custom_parameters` field on the OAuth provider
  record (DynamoDB `customParameters` map, Pydantic Create/Update/
  Response, admin form `key=value` textarea with parse/serialize
  helpers). Merged with the vendor baseline at request time — baseline
  wins on conflict so admins cannot accidentally turn off documented
  vendor requirements.
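
A sketch of the `key=value` parsing and the baseline-wins merge from the last two bullets. The function names and the `VENDOR_BASELINE` table are hypothetical; only Google's `access_type=offline` is taken from the text above:

```python
from typing import Dict, Mapping

# Hypothetical baseline table; only the Google entry is named above.
VENDOR_BASELINE: Dict[str, Dict[str, str]] = {
    "google": {"access_type": "offline"},
}


def parse_custom_parameters(text: str) -> Dict[str, str]:
    """Parse one key=value pair per line; skip blank or malformed lines."""
    params: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip():
            params[key.strip()] = value.strip()
    return params


def merge_custom_parameters(
    provider_type: str, admin_params: Mapping[str, str]
) -> Dict[str, str]:
    """Merge admin parameters with the baseline; baseline wins on conflict."""
    merged = dict(admin_params)
    merged.update(VENDOR_BASELINE.get(provider_type, {}))
    return merged
```

Applying the baseline last is what prevents an admin value such as `access_type=online` from overriding the vendor requirement.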

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(connectors): add Slack/Salesforce/Zoom presets + dynamic form placeholders

Per the AgentCore Identity supported-providers docs, Slack, Salesforce,
and Zoom are first-class vendors with pre-configured endpoints — admins
only need to supply credentials. Verified the exact `credentialProviderVendor`
strings and `oauth2ProviderConfigInput` keys against the SDK shape
(`Oauth2ProviderConfigInput.members`):

  - Slack       → SlackOauth2      / slackOauth2ProviderConfig
  - Salesforce  → SalesforceOauth2 / salesforceOauth2ProviderConfig
  - Zoom        → ZoomOauth2       / includedOauth2ProviderConfig
                                     (shared key for simpler vendors)

Backend additions: `SLACK`, `SALESFORCE`, `ZOOM` on `OAuthProviderType`;
vendor + config-key entries on the registrar. The existing discovery-URL
guard correctly rejects discovery URLs for these new types.

Frontend additions: matching `ConnectorType` literals; preset entries
with sensible default scopes and vendor-relevant placeholder hints (e.g.
Salesforce `api, refresh_token, offline_access, id, openid`); icon
class branches for the new tiles (Slack fuchsia + chat bubble,
Salesforce sky + cloud, Zoom blue + video camera).

Form polish:

- `scopesPlaceholder` / `customParametersPlaceholder` on each preset.
  Form binds them via computed signals so the hints update as the admin
  switches between providers.
- Selecting a preset seeds `customParameters` only when the preset
  declares `defaultCustomParameters` — avoids clobbering user-typed
  content for presets that have only a hint.
- Dropped the Google `defaultScopes`. The OIDC-only
  `openid email profile` set doesn't actually let an agent do anything
  useful with Google APIs (Calendar/Gmail/Drive each need different
  scopes), so the form lands empty and the placeholder shows the URL
  format as a hint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(connectors): add support for optional base64 icon uploads and validation

* feat(connectors): inline OAuth consent prompt + persistence-backed restore

Replaces the floating OAuth banner with an inline prompt anchored to the
assistant turn that triggered consent, and persists pending interrupts to
session metadata so a browser refresh rediscovers them instead of leaving
the tool call orphaned in `pending` forever.

Backend
- New `PendingInterrupt` model on `apis.shared.sessions.models`; included
  on `MessagesListResponse` and `SessionMetadata`.
- `metadata.add_pending_interrupt` / `remove_pending_interrupts` /
  `get_pending_interrupts` helpers using GSI lookup + targeted UpdateExpression.
- `StreamCoordinator._extract_oauth_required_events` is now async and
  persists each interrupt before yielding the SSE event; failures log but
  never break the live stream.
- `get_messages_from_cloud` fetches pending interrupts in parallel.
- `/invocations` resume path clears resolved interrupts from metadata
  after `agent.stream_async` completes.
- New `DELETE /sessions/{sid}/pending-interrupts/{iid:path}` endpoint
  for explicit dismiss; colon-bearing Strands ids preserved via `:path`.

Frontend
- New `OAuthConsentPromptComponent` with a refined inline card design,
  connector icon (admin base64 wins over heroicon, falls back to
  providerType default), eyebrow/lock motif, primary gradient action
  button, hover-revealed dismiss, fade+slide entrance.
- `MessageMapService.loadMessagesForSession` hydrates pending interrupts
  on session load; anchors to triggering message id when present, else
  the most recent assistant message.
- `OAuthConsentService.openConsentPopup` is async; lazy-fetches a fresh
  authorization URL via `initiate-consent` when the stored one is absent
  or expired (handles "already consented in another tab" by auto-resuming).
- `OAuthConsentService.dismiss` syncs to backend by default; completion
  flow opts out so the resume path's own cleanup isn't double-fired.
- `MessageListComponent` renders unanchored interrupts at end-of-list as
  a fallback for the "partial assistant message wasn't persisted" case.
- `awaiting_auth` derived tool status renders as a primary-blue ring on
  the tool-rail dot instead of an indefinite amber spinner.
- `ChatRequestService.resumeFromOAuthConsent` accepts a fallback session
  id (post-refresh case where `lastRequestObject` is null) and surfaces
  400 `Unknown or expired interrupt ids` as a conversational error.
- Old floating `OAuthConsentBannerComponent` removed.

Known follow-up
- First-turn-of-a-new-session OAuth: persistence currently no-ops because
  the session metadata row doesn't exist yet when the interrupt fires.
  Tracked separately; sidecar item or upsert pattern is the likely fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(connectors): add OAuth consent prompt component for authorization handling

* feat: enhance session metadata management and update handling

- Add functions to ensure session metadata existence and update session title and activity.
- Implement logic for handling session activity updates, including message count increments and preferences merging.
- Introduce deduplication for pending interrupts to prevent duplicate entries during session updates.
- Update frontend components to reflect changes in session management, including OAuth consent prompts and message handling.
- Refactor session service interfaces to use camelCase for consistency with backend responses.
- Enhance tests for session activity updates, pending interrupts, and session metadata handling.

* fix(connectors): durable OAuth resume across browser refresh

Resume after an OAuth-gated tool call only worked when the in-memory
agent cache still held the original turn. After a browser refresh the
frontend lost its request snapshot and the resume request landed with
no enabled_tools / model_id, so the inference API rebuilt a fresh agent
with an empty external-tool registry — the paused tool call had nothing
to resume against and the LLM responded that the tool wasn't available.

Resume contract now lives server-side. On pause, the stream coordinator
captures a ``PausedTurnSnapshot`` (enabled_tools, model_id, provider,
temperature, system_prompt, caching_enabled, max_tokens) onto the
session row alongside the existing ``pendingInterrupts``. On resume,
the inference API loads the snapshot and rebuilds the agent from it;
Strands' SessionManager then restores ``_interrupt_state`` from
AgentCore Memory, so the paused tool call picks up where it left off
regardless of cache hit/miss, refresh, or pod restart.

Frontend ``lastRequestObject`` snapshotting is gone — the resume
payload is now ``{ session_id, message: '', interrupt_responses }``.
Server-side snapshot has a 1h TTL; cleared on full turn completion
and at the start of any new (non-resume) turn.
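
A minimal sketch of the snapshot described above, assuming a plain dataclass and an epoch-seconds timestamp. The seven configuration fields are the ones named in this message; `created_at`, `is_expired`, and the constant name are hypothetical and only illustrate the 1h TTL.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional

SNAPSHOT_TTL_SECONDS = 3600  # the 1h TTL mentioned above


@dataclass
class PausedTurnSnapshot:
    enabled_tools: List[str]
    model_id: str
    provider: str
    temperature: float
    system_prompt: str
    caching_enabled: bool
    max_tokens: int
    # Hypothetical field: when the turn was paused, in epoch seconds.
    created_at: float = field(default_factory=time.time)

    def is_expired(self, now: Optional[float] = None) -> bool:
        """Return True once the snapshot is older than the TTL."""
        now = time.time() if now is None else now
        return now - self.created_at > SNAPSHOT_TTL_SECONDS
```

A resume handler built on this would refuse to rebuild the agent from an expired snapshot rather than resume against stale settings.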

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(connectors): pre-flight external MCP clients so one bad server can't fail a turn

Previously, ``load_external_tools`` cached newly-created MCP clients
without verifying the server was actually reachable. A single connector
that wasn't running locally (or whose endpoint was misconfigured) would
sit in the registry and fail the whole turn the first time Strands
called ``load_tools()`` on it.

Pre-flight each new client immediately after construction. On failure,
log a warning, skip the tool, and continue — the user keeps their other
tools. On success the call also primes the client's tool cache, so
Strands' later ``load_tools()`` becomes a no-op.
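
The pre-flight idea can be sketched as follows. This is an illustration only: `preflight_clients` and the zero-argument probe callables are hypothetical stand-ins, not the real MCP client or Strands API.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)


def preflight_clients(clients: Dict[str, Callable[[], List[str]]]) -> Dict[str, List[str]]:
    """Probe each client once; keep the ones that respond, skip the rest.

    `clients` maps a tool id to a hypothetical probe that lists the server's
    tools and raises if the server is unreachable.
    """
    healthy: Dict[str, List[str]] = {}
    for tool_id, probe in clients.items():
        try:
            # The successful result doubles as a primed tool cache.
            healthy[tool_id] = probe()
        except Exception as exc:
            # One bad server is logged and skipped; the loop continues.
            logger.warning("Skipping external tool %s: %s", tool_id, exc)
    return healthy
```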

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: Update oauth consent prompt styling

* test(sessions): unblock route tests on new pre-stream metadata hook

ensure_session_metadata_exists() now runs unconditionally on /invocations
and raises when DYNAMODB_SESSIONS_METADATA_TABLE_NAME is unset, breaking
route tests that mock the agent and skip DynamoDB. Stub it via an autouse
fixture so route tests exercise the route, not the persistence layer. Also
patch the new get_pending_interrupts call in the cloud-message tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(connectors): make OAuth disconnect intent durable across replicas

The disconnect flag lived in a module-level set inside the inference API
process, so a /disconnect on one replica was invisible to any other.
Under multi-replica deploys the user could see "Connected" on one
request and "needs consent" on the next, and the AfterToolCallEvent
401-retry path likewise lost its intent on replica fan-out.

Move the per-(user, provider) disconnect flag to a new
OAuthDisconnectRepository on the existing oauth-user-tokens DynamoDB
table (already provisioned, KMS-encrypted, with R/W IAM granted to the
inference API). The token cache stays as a hot-path L1 for tokens only;
the consent hook reads the disconnect repo on every BeforeToolCallEvent
so a disconnect anywhere is honored on the next tool run anywhere.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(connectors): validate OAuth callback URL header against CORS allowlist

The frontend posts an `OAuth2CallbackUrl` header on every consent-related
request, and the inference-api middleware was forwarding it verbatim into
`BedrockAgentCoreContext`. An authenticated user could pivot the OAuth
redirect to an attacker-controlled origin and capture the authorization
code on consent. Reuse `CORS_ORIGINS` as the trust boundary, pin the
path to `/oauth-complete`, and reject non-http(s) schemes, query strings,
and fragments.
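
A rough sketch of the checks listed above, using only the standard library. The function name and the shape of the allowlist (a collection of `scheme://host[:port]` strings) are assumptions; the real middleware may normalize origins differently.

```python
from typing import Iterable
from urllib.parse import urlsplit

CALLBACK_PATH = "/oauth-complete"


def is_allowed_callback_url(url: str, allowed_origins: Iterable[str]) -> bool:
    """Accept only http(s) URLs on an allowlisted origin with the pinned path."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    if parts.query or parts.fragment:
        return False
    if parts.path != CALLBACK_PATH:
        return False
    # Compare the full origin so userinfo tricks (user@evil.host) do not match.
    origin = f"{parts.scheme}://{parts.netloc}"
    return origin in set(allowed_origins)
```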

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(connectors): cap OAuth re-auth at one retry per provider per turn

A misconfigured provider (wrong scope, perma-401) would otherwise spawn
a fresh consent prompt on every tool call in a turn: the per-tool-use
retry guard reset for each new toolUseId, so the model could trigger
prompt-after-prompt with no upper bound. Track attempted providers on
the hook itself, reset on `BeforeInvocationEvent` (fires per turn,
including resume), so the user sees at most one consent prompt per
provider per turn before 401s flow through to the model.
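
The per-turn cap could look roughly like this. `ConsentRetryGuard` and its method names are hypothetical; the real hook subscribes to Strands events, which this sketch deliberately leaves out.

```python
from typing import Set


class ConsentRetryGuard:
    """Hypothetical stand-in for the hook's per-turn bookkeeping."""

    def __init__(self) -> None:
        self._attempted: Set[str] = set()

    def on_before_invocation(self) -> None:
        # A new turn (or a resume) starts with a clean slate.
        self._attempted.clear()

    def should_prompt(self, provider_id: str) -> bool:
        # The first 401 for a provider in this turn prompts; later ones do not,
        # no matter how many distinct tool calls hit the same provider.
        if provider_id in self._attempted:
            return False
        self._attempted.add(provider_id)
        return True
```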

Also clarify the `event.interrupt(name="oauth:{provider_id}")` comment:
the SDK's BeforeToolCallEvent._interrupt_id folds in `toolUseId`, so
parallel tool calls to the same provider already produce distinct
interrupt ids. New regression test pins that invariant.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(connectors): drop re-emitted oauth_required events by interrupt id

A stream replay after refresh, or a late server-side breadcrumb clear,
could fire the same `oauth_required` event again after a successful
consent or explicit dismissal — and the prompt would resurrect because
provider-keyed dedup re-added the entry. Track seen interrupt ids on
the consent service so already-resolved interrupts stay gone for the
session. New tool calls always carry a fresh interrupt id (Strands
generates it from `toolUseId`), so legitimate prompts are never
suppressed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(infra): correct stale "App API Stack" comments in inference-api stack

The referenced tables live in InfrastructureStack (moved there to break a
prior circular dep); update 9 SSM-read comments to match.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Remove the "Sync from Registry" admin feature in favor of DynamoDB as the
single source of truth for the tool catalog. Code-defined tools are now
seeded by the existing bootstrap script (expanded to cover calculator and
generate_diagram_and_validate); admins add everything else through the
"Add Tool" form. Also drops the in-memory fallback in ToolCatalogService
and removes the stale get_current_weather tool.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* fix: source ToolAccessService catalog from DynamoDB

ToolAccessService.filter_allowed_tools enumerated tools from the
legacy in-memory catalog, so MCP-external and A2A tools added via the
admin form (which only persist to DynamoDB) were silently filtered
out for wildcard-access users.

Wire the service to a new TTL-cached snapshot (freshness.get_all_tool_ids)
backed by the DynamoDB tool catalog. Gateway tools keep their prefix-
based bypass since they're loaded dynamically at runtime. Admin
create/update/delete invalidate the snapshot so changes are visible
on the next chat turn in-process.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: hold all-tool-ids snapshot in single-slot list

CodeQL flagged `_all_tool_ids_cache` as unused because its only writes
were `global` reassignments — flow analysis didn't connect them to the
reads. Switch to a one-element list so the slot is mutated in place,
matching the existing `_cache` dict pattern in this file.
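
A sketch of a TTL-cached snapshot held in a one-element list, under stated assumptions: the 60-second window, the `load` callback, and the `now` parameter are hypothetical, and the real `freshness.get_all_tool_ids` signature is not shown in this log.

```python
import time
from typing import Callable, FrozenSet, Iterable, List, Optional, Tuple

_TTL_SECONDS = 60.0  # hypothetical freshness window

# One-element list: the slot is mutated in place, so no `global` statement is
# needed and static analysis can connect the writes to the reads.
_all_tool_ids_cache: List[Optional[Tuple[float, FrozenSet[str]]]] = [None]


def get_all_tool_ids(load: Callable[[], Iterable[str]],
                     now: Optional[float] = None) -> FrozenSet[str]:
    """Return the cached id set, reloading once the entry is older than the TTL."""
    now = time.monotonic() if now is None else now
    entry = _all_tool_ids_cache[0]
    if entry is not None and now - entry[0] < _TTL_SECONDS:
        return entry[1]
    ids = frozenset(load())
    _all_tool_ids_cache[0] = (now, ids)
    return ids


def invalidate_all_tool_ids() -> None:
    """Called after admin create/update/delete so the next read reloads."""
    _all_tool_ids_cache[0] = None
```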

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…#179)

The first CreateOauth2CredentialProvider call in a region implicitly
provisions the `default` token vault, so the AppApi task role needs
`bedrock-agentcore:CreateTokenVault` in addition to the provider CRUD
actions. Without it, creating the very first connector returned a 500
with `AccessDeniedException` from bedrock-agentcore-control.

Also pass `DYNAMODB_OAUTH_PROVIDERS_TABLE_NAME` to the container env.
The IAM grant and SSM lookup were already in place; only the env wiring
was missing, which caused the OAuth provider repository to silently
disable itself and would have failed the DB write after AgentCore
succeeded — triggering the orphan-rollback path.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
philmerrell and others added 29 commits May 22, 2026 07:35
…#373)

Every file-source endpoint resolves an OAuth token server-side, so app-api
needs the `OAuth2CallbackUrl` header its AgentCoreContextMiddleware bridges
into BedrockAgentCoreContext. FileSourceService omitted it, so browsing a
connected source failed with CallbackUrlUnavailableError (503) right after
a successful connect. Add the header to every call, mirroring
UserConnectorsService.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…omParameters (#374)

The file-source browser surfaced a 409 "not connected" for connectors that
were in fact connected. AgentCore Identity factors `customParameters` into
whether `get_resource_oauth2_token` short-circuits to a vaulted token:
connector consent runs through `initiate_consent`, which sends Google
`prompt=consent`, but every retrieval path omitted it — so AgentCore treated
the read as a fresh request and reported consent-required despite a usable
vaulted token.

`resolve_file_source_token` (and `_is_connected`, which delegates to it) and
`connector_status` now build `customParameters` with `force_authentication=True`
to match the consent flow. The calls remain pure reads — `get_token_for_user`
itself stays `force_authentication=False`.

Frontend hardening so the dialog owns its error UX: file-source requests opt
out of the global error toast via a `SUPPRESS_ERROR_TOAST` HttpContext token,
and a 409 in the browser view now renders an actionable Connect button instead
of dead-end text.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Generated by the kaizen-research skill. Top 5 ideas appended to
docs/kaizen/review-queue.md for the kaizen-review-prep run later this morning.
Generated by kaizen-review-prep. Ranked agenda for the 10-15 min decision pass;
queue updated with three review-prep-surfaced friction items (kaizen-research
did not run on 2026-05-22, so no fresh research doc fed this review).
* docs(skills): capture list/form design conventions in tailwind-ui skill

Add references/app-conventions.md documenting the rounded-2xl list and
form page design language (border radius, list style, form sections,
button variants) so future list/form work in frontend/ai.client matches
the redesigned admin pages instead of the older boxed-card style.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(assistants): redesign assistant editor and file-connector UX

Restyle the assistant editor and the file-source browser modal to the
rounded-2xl list/form design language used by the redesigned admin
pages (manage-models, tools), and rework how documents are added:

- Editor form column restyled (rounded-2xl, text-sm/6, blue accent,
  flat border-t sections); uploaded documents rendered as a divide-y
  list; form column given a bg-gray-50 surface so inputs read clearly.
- File-source connectors are surfaced as buttons directly above the
  drop zone instead of behind a generic "Import from a connector"
  button — clicking one opens the browser dialog targeted at that
  connector, skipping the in-modal source picker.
- The drop zone collapses to a compact "Add files" control once a
  document exists or is uploading; device and connector uploads stay
  available.
- File-source browser modal restyled to the same conventions
  (rounded-2xl panel, divide-y lists, convention button variants).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
When a file-source connector is not connected, the editor button now
labels itself "Connect to {connector}" and kicks off the OAuth consent
popup in place — users no longer have to open the modal just to be
presented with a Connect prompt.

- Editor injects UserConnectorsService and OAuthConsentService, mirroring
  the dialog's connect flow.
- Button shows in-place busy states (Starting… / Awaiting consent) and
  disables itself while the popup is in flight.
- On consent success the file-source list is refreshed and the browser
  modal opens automatically into the newly-connected connector, so one
  click takes the user from "not connected" to picking files.
- If the user closes the popup without consenting, the button resets so
  they can try again.

Spec gains UserConnectorsService and OAuthConsentService mocks so the
component can still construct in tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…378)

Adds an "Add web content" flow alongside the existing connector imports in
the assistant editor. Single-page mode (default) and bounded BFS crawl mode
share one dialog; the backend writes extracted markdown to the documents
bucket so the existing S3-event ingestion Lambda chunks and embeds it
exactly as a device upload would.

Backend
- New `apis/app_api/web_sources/` package (models, routes, crawler, repo,
  url_utils). Endpoints under `/assistants/{id}/web-sources/`: `POST /crawl`,
  `GET /crawls?active=true`, `GET /crawls/{id}`. Uses
  `get_current_user_from_session` per the auth-dependency rule.
- BFS crawler: per-host jitter, bounded concurrency, robots.txt-respecting,
  same-domain, SSRF-guarded, 5 MB per-page cap, 15-minute crawl budget,
  always-finalize-on-exit. trafilatura → markdown with BS4 fallback.
- `CrawlJob` rows persisted in the assistants table via the adjacency-list
  pattern (`SK=CRAWL#{crawl_id}`). Floats coerced to `Decimal` before
  put_item (DynamoDB rejects bare floats). Terminal rows get a 30-day TTL
  and cascade-delete when the last web doc for that root is removed.
- Cleanup cascade: `cleanup_document_resources` now reaps orphaned terminal
  `CrawlJob` rows after deleting a web doc.
- Self-heal: `list_active_crawls` auto-finalizes any `running` row older
  than 20 minutes (mirrors the stale-doc auto-fail pattern), so a crashed
  process can't leave the SPA in perma-poll.
- Crawler holds strong refs to worker tasks; the route holds a module-level
  set of in-flight crawl tasks (Python's weak task tracking would otherwise
  GC them mid-execution).
- New deps: beautifulsoup4 4.13.5, trafilatura 2.0.0.
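
The float-to-`Decimal` coercion mentioned in the list above might be sketched like this. The helper name is hypothetical, and the sketch assumes items are plain dicts and lists of scalars.

```python
from decimal import Decimal
from typing import Any


def floats_to_decimal(value: Any) -> Any:
    """Recursively replace floats with Decimal before handing an item to put_item."""
    if isinstance(value, float):
        # Going through str() keeps the short repr (0.5 becomes Decimal('0.5'))
        # instead of the float's full binary expansion.
        return Decimal(str(value))
    if isinstance(value, dict):
        return {k: floats_to_decimal(v) for k, v in value.items()}
    if isinstance(value, list):
        return [floats_to_decimal(v) for v in value]
    return value
```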

Frontend
- New `WebSourceDialogComponent`: URL input + "Crawl linked pages" toggle
  revealing depth / max-pages / concurrency / delay sliders. Submit-and-watch
  UX — modal closes on Start, pages appear in the docs list as they're
  ingested. Style tokens match the file-source dialog.
- `WebSourceService` thin client for the three endpoints.
- Editor wiring: "Add web content" button next to the connector buttons,
  with an inline "Crawling…" badge while a crawl is in flight.
- Crawl watcher polls `/web-sources/crawls?active=true` every 5 s; new
  pages surface via an incremental discovery merge — no list-wide refresh.
- Document delete: optimistic UI removes the row immediately and rolls
  back on failure (no more wait-then-disappear). Stale-uploading docs can
  now be deleted regardless of polling state.

Tests
- `tests/apis/app_api/web_sources/` (60 tests): URL normalization,
  same-domain, SSRF guard, BFS bounds, robots, per-page failure handling,
  crawl finalization, route 202/404/422/401, CrawlJob put/get/list round
  trip with float delays, stale-row reaper, cascade cleanup, TTL on
  finalize.
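
For illustration, one piece of an SSRF guard like the one tested above is classifying an already-resolved address. This sketch covers only that step and assumes DNS resolution and redirect handling happen elsewhere; `is_public_ip` is a hypothetical name, not the crawler's actual helper.

```python
import ipaddress


def is_public_ip(address: str) -> bool:
    """Return True only for addresses that are safe to fetch from a crawler."""
    ip = ipaddress.ip_address(address)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )
```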

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…letons (#379)

Collapses the three separate "add knowledge" groups (Add files, Add web
content, connector buttons) into a single inline action row under the
renamed "Knowledge base" section. Order is Add files → Add web content →
connector chips so the most-common action sits first.

- New `fileSourcesLoading` signal renders width-matched skeleton chips
  while the connector catalog loads, so the row's final layout is
  previewed instead of buttons popping in after a network round-trip.
- Drop zone is preserved as the empty-state drag-drop affordance, but
  trimmed to drag-only (the inline "Add files" chip is the only
  click-to-pick entry point now).
- "Crawling…" badge moved up next to the section heading so the action
  row stays clean.
- Fixes a pre-existing duplicate `id="file-upload"` (the drop zone and
  the compact button each rendered their own input with the same id —
  invalid HTML + an a11y violation). Now a single hidden input shared by
  every label that needs it.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Adds a download button next to the delete button in the assistant
editor's uploaded documents list. Visible only for documents in the
`complete` status. Reuses the existing `GET /assistants/{id}/documents/
{docId}/download` endpoint and mirrors the citation-display download
pattern (presigned URL opened in a new tab).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
)

The assistant editor preview now hides voice mode and the settings button —
neither is wired to anything useful inside the editor — while exposing file
attachments so authors can test their assistant against the same inputs end
users will send. File uploads flow through the existing /chat/stream proxy
as file_upload_ids.

Splits the chat-input settings button into its own showSettingsControl gate
(previously coupled to showFileControls), adds a parallel showVoiceControl,
and threads both through ChatContainerConfig. Renames the chat-input file
input id to chat-input-file-upload to avoid colliding with the assistant
form's knowledge-base file input on the editor page.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
When a consumer chats with an Assistant (request carries rag_assistant_id),
the agent now runs with zero external tools and is steered to answer from
the knowledge base context that's already pre-stuffed into the prompt.
Editor-side connectors are unchanged — owners still curate the KB by pulling
in Drive docs, web crawls, and uploads.

The product split this codifies: the main agent is the general-purpose,
tool-using surface; assistants are the grounded, predictable, citation-
friendly surface. Two clear mental models instead of one muddy one.

Three layers:

* Inference API route (inference_api/chat/routes.py) forces
  input_data.enabled_tools = [] inside the rag_assistant_id branch before
  the agent is built, and warns if the client sent a non-empty list. This
  is the real enforcement chokepoint — covers SPA, API-key, and any future
  caller through /invocations.
* System prompt composition gains a "## Knowledge Base Grounding" section
  inserted between the base prompt and the owner's "## Assistant-Specific
  Instructions". Owner instructions still come last and take precedence;
  the directive just tells the model to ground in provided context and
  acknowledges no external tools are available. Applies in both the
  with-instructions and no-instructions paths.
* The SPA chat-request builder omits tools (sends enabled_tools: []) when
  assistantId is present. Cosmetic given backend enforcement but saves
  payload bytes and matches the existing preview-chat behavior.
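
The prompt ordering described in the second layer can be sketched as below. The two section headers come from this message; the directive wording and the function name are hypothetical.

```python
from typing import Optional

GROUNDING_HEADER = "## Knowledge Base Grounding"
OWNER_HEADER = "## Assistant-Specific Instructions"

# Hypothetical directive text; the real wording is not shown in this log.
GROUNDING_DIRECTIVE = (
    "Answer from the provided knowledge base context. "
    "No external tools are available in this conversation."
)


def compose_assistant_prompt(base_prompt: str, owner_instructions: Optional[str]) -> str:
    """Base prompt, then the grounding section, then owner instructions (if any)."""
    sections = [base_prompt.strip(), f"{GROUNDING_HEADER}\n{GROUNDING_DIRECTIVE}"]
    if owner_instructions and owner_instructions.strip():
        # Owner instructions come last so they take precedence.
        sections.append(f"{OWNER_HEADER}\n{owner_instructions.strip()}")
    return "\n\n".join(sections)
```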

MCP App UI events fall out for free: no tools means no tool_result events,
which means no ui_resource events to gate. Existing assistants don't need
migration, since assistants are still in beta. Voice and other consumer
surfaces will be addressed separately if/when they need to reach the same contract.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…113) (#383)

Extends sharing to support per-user permission levels so teams can co-edit
assistants. Owners can grant edit access to specific people; editors can
update settings, manage documents, and test-chat — but cannot delete the
assistant, change visibility, or manage the share list.
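
A minimal sketch of the permission split described above, assuming a simple role-to-actions table. The action names and the viewer's capability set are hypothetical labels, not the backend's actual identifiers.

```python
from enum import Enum


class Role(str, Enum):
    OWNER = "owner"
    EDITOR = "editor"
    VIEWER = "viewer"


# Action names are hypothetical labels for the capabilities listed above.
_EDITOR_ACTIONS = {"chat", "update_settings", "manage_documents", "test_chat"}
_ALLOWED = {
    Role.OWNER: _EDITOR_ACTIONS | {"delete", "change_visibility", "manage_shares"},
    Role.EDITOR: _EDITOR_ACTIONS,
    Role.VIEWER: {"chat"},
}


def can(role: Role, action: str) -> bool:
    """Return True if the role is allowed to perform the action."""
    return action in _ALLOWED[role]
```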

Backend only — the frontend share UI and per-assistant edit gating land in
a follow-up PR.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* feat(assistants): viewer/editor share permissions UI (#113)

Consumes the backend contract shipped in #383: surfaces a per-share
permission toggle in the share dialog, an "Editor" badge + Edit affordance
on shared-with-me cards, an editor banner on the form, and an owner-only
gate on the Share button.

- Dialog: per-row "Can view / Can edit" select on existing shares, a
  "permission for new people" toggle, and onSave delta-detection that
  distinguishes adds, removes, and permission changes on already-shared
  emails — each dispatched to the correct backend endpoint (POST /
  DELETE / PATCH).
- List: shared-with-me cards with userPermission='editor' now show an
  Editor badge and an Edit button alongside Chat.
- Form: surfaces "Shared by {owner}" banner for editors; Share button
  is owner-only.
- AssistantSharesResponse.sharedWith is now ShareEntry[] across the
  service / api / dialog (matches the backend's PR-1 shape change).
- Vitest: existing service spec migrated to the new shape; new dialog
  spec covers the delta algorithm (adds / removes / permission upgrades /
  mixed) via DI tokens per project convention.
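
The delta detection in the dialog is frontend code; as a language-neutral illustration, the same idea is sketched here in Python. `diff_shares` and `ShareDelta` are hypothetical names, and shares are assumed to be representable as an email-to-permission mapping.

```python
from typing import Dict, List, NamedTuple, Tuple


class ShareDelta(NamedTuple):
    adds: List[Tuple[str, str]]     # (email, permission), sent as POST
    removes: List[str]              # email, sent as DELETE
    changes: List[Tuple[str, str]]  # (email, new permission), sent as PATCH


def diff_shares(before: Dict[str, str], after: Dict[str, str]) -> ShareDelta:
    """Split the difference between two share maps into adds, removes, and changes."""
    adds = sorted((e, p) for e, p in after.items() if e not in before)
    removes = sorted(e for e in before if e not in after)
    changes = sorted(
        (e, p) for e, p in after.items() if e in before and before[e] != p
    )
    return ShareDelta(adds, removes, changes)
```

Unchanged entries fall into none of the three buckets, so saving without edits issues no requests.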

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(share-dialog): match redesign tokens + skeleton for user search

Brings the share dialog in line with the list/form design language
captured in .claude/skills/tailwind-ui/references/app-conventions.md:
rounded-2xl, blue accent (was indigo), focus:ring-2 focus:ring-blue-500,
dark:bg-gray-800 inputs, flat <section> blocks divided by border-t.

- Header avatar: blue-100 chip (was indigo).
- URL row + add-people + current-shares are three flat sections, no
  individually-bordered cards.
- "Currently shared with" became a single rounded-2xl divide-y ul
  (was a stack of bordered rows), with an empty-state when nothing
  is shared.
- Mode toggle is now a proper segmented tablist with role="tab" and
  aria-selected, accented in blue.
- Permission default-for-new-people moved into the section header so
  it's contextually attached to the add controls.
- Save/Cancel actions reorder cleanly on mobile + desktop and use the
  shared blue/ghost button tokens.
- Search results render skeleton rows (animate-pulse) while searching,
  with role="status" sr-only text — matches the connector-chip skeleton
  pattern already in the assistant editor.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(share-dialog): add skeleton for the currently-shared list

loadShares() runs in the constructor and shares() starts empty, so the
empty-state ("Not shared with anyone yet") was painting while the fetch
was in flight on dialogs for SHARED assistants. Renders a skeleton ul
matching the real row layout (email + permission select + delete) while
loadingShares() is true; suppresses the count chip until the fetch
resolves.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(share-dialog): single-row shares skeleton + stronger tab accent

- Shares skeleton: drop from 3 rows to 1; the loading state was visually
  heavier than the real list it was previewing.
- Tabs: selected state now flips font-medium → font-semibold alongside
  the existing blue underline + text color, so the active tab reads at a
  glance. -mb-px on each tab makes the active 2px underline overlap the
  container's 1px bottom border cleanly (no gray line peeking through).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(share-dialog): more right padding on permission selects

The native chevron sat too close to the rounded-2xl edge. Switch
px-2.5 → pl-2.5 pr-7 on both permission selects (the per-share row
select and the "default for new people" select) so the chevron has
breathing room without changing the left padding.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(share-dialog): make tab underline win the border-color cascade

The active tab used border-blue-600 alongside a base border-transparent.
Both are border-color utilities at the same specificity, so whichever
Tailwind emitted later in the stylesheet won — in practice the
transparent base, leaving no visible underline.

Switch to bottom-only border-b-transparent / border-b-blue-600 so the
active state targets border-bottom-color exclusively; no cascade
collision with the base.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(share-dialog): tab accent via aria-selected variant

Confirmed via devtools: border-b-blue-600 was on the DOM but computed
border-bottom-color was rgba(0,0,0,0). Same-specificity class collision
with border-b-transparent — Tailwind's emit order put the base last
and the conditional class lost the cascade.

Move all active styling onto aria-selected:* utilities so the active
selector becomes [aria-selected="true"], which has attribute-selector
specificity and beats the base. As a bonus the tabs now use one
declarative class string instead of four parallel [class.x] bindings.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(share-dialog): custom chevron on permission selects

Native select chevrons sit at a browser-fixed offset from the right edge
regardless of padding-right, so pr-7 / pr-8 just pushed the text away
from a chevron still crowded against the rounded-2xl corner.

Switch both permission selects to appearance-none + an overlaid
heroChevronDown icon. The wrapper handles positioning so the chevron
clears the rounded corner cleanly. pointer-events-none on the icon so
clicks still hit the native select.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(skills): capture tab-accent and select-chevron gotchas

Both fell out of the share-assistant-dialog redesign on PR #384 and
will bite future work the same way if undocumented:

- Tabs: conditional [class.border-b-blue-600] loses the cascade to a
  base border-b-transparent at the same specificity. Use the aria-selected:
  variant so the active rule has attribute-selector specificity.

- Selects: native chevrons sit at a browser-fixed offset from the right
  edge regardless of padding-right; with rounded-2xl they crowd the
  corner. appearance-none + overlaid heroChevronDown gives reliable
  positioning.

Adds a Tabs subsection, a Select example under Form pages, and a
"Common gotchas" section that names both failure modes and their
DevTools symptoms.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Adds LibreChat (danny-avila/LibreChat) as external source #12 in the
kaizen-research skill. Releases-first, 1-2 web requests/week, covers
four lenses: UI/UX patterns, comparable-platform choices, MCP
integration patterns, and release-only signal. Bumps the subagent
fan-out count to 14 categories and surfaces LibreChat in the skill
description so it shows up in trigger context.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Silences four NG8113 build warnings — RouterLink was imported and
listed in `imports:` on these admin pages but never referenced in
their templates.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…387)

Adopt the rounded-2xl / border / text-sm-6 / primary-500 focus vocabulary
used by the canonical admin list/form pages so the chat settings panel no
longer reads as a separate visual generation. Keeps primary-* as the accent
and leaves the slide-over chrome (backdrop, transform animation) untouched.

Also fixes the native <select> chevron crowding the rounded-2xl corner on
the effort enum by switching to appearance-none + an overlaid heroicon.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
The stream_processor's `_format_force_stop_message` already produces
friendly user-facing markdown for known Bedrock force_stop patterns
(document size, throttling, access denied) — fully formed with a "⚠️"
prefix and actionable guidance. The in-loop error handler in
stream_coordinator then ran that text through
`build_conversational_error_event`, which for `AGENT_ERROR` fell into
the generic else branch and wrapped it again in:

  ⚠️ Something went wrong.

  > {already-friendly text}

  Please try again.

Result: two ⚠️ markers, the friendly text trapped in a blockquote, and
a ceremonial "Please try again." appended.

Detect the already-classified case by the leading ⚠️ on AGENT_ERROR
messages and emit a `ConversationalErrorEvent` directly with the
classifier's text intact. The unclassified "Agent force-stopped: {raw}"
fallthrough has no warning prefix and still flows through the generic
wrapper, so unrecognized errors keep their friendly wrapper. Live SSE
display and refresh-hydration both read `conv_error_event.message`, so
they stay in sync.
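
The detection rule above can be illustrated with a small sketch. `format_agent_error` is a hypothetical helper that returns plain text; the real code emits a `ConversationalErrorEvent`, which is not modeled here.

```python
WARNING_SIGN = "\u26a0"  # base code point of the warning emoji


def format_agent_error(message: str) -> str:
    """Pass already-classified text through; wrap everything else generically."""
    if message.lstrip().startswith(WARNING_SIGN):
        # Already friendly: emit the classifier's text intact.
        return message
    return (
        f"{WARNING_SIGN}\ufe0f Something went wrong.\n\n"
        f"> {message}\n\n"
        "Please try again."
    )
```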

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…esh (#389)

* fix(streaming): persist synthetic error messages so they survive refresh

When a Bedrock ValidationException fired (e.g. gpt-oss-120b + an attached
document), users saw an error in the chat live but the message disappeared
on refresh. Two root causes, plus follow-on cleanup.

1. Always-false persistence guard.
   Three sites guarded create_message behind
   `hasattr(session_manager, "base_manager")`. The current SDK exposes
   create_message directly on AgentCoreMemorySessionManager — no nested
   wrapper — so the guard was always False and every synthetic write was
   silently skipped (no log, no exception). Extracted a single helper
   `persist_synthetic_messages` that asserts the real SDK contract and
   logs loudly when create_message is unavailable. Replaced the three
   copies in stream_coordinator.py (2x) and chat/routes.py (1x). The
   quota-exceeded path in chat/routes.py was broken the same way and is
   now working as a side effect.

2. Duplicate user-turn write.
   The streaming error paths re-persisted the user turn even though
   Strands' MessageAddedEvent hook already wrote it at turn start. The
   conflicting second write caused AgentCore Memory to reject (in
   practice) or duplicate (in theory), dropping the assistant error
   message along with it. The streaming paths now persist assistant-only,
   matching the documented MAX_TOKENS reasoning.

3. Misclassified error copy.
   `_format_force_stop_message` was matching
   `ValidationException + "document"` as a 4.5 MB size overflow — but the
   raw string `"This model doesn't support documents"` also contains
   "document", so unsupported-modality errors were getting the wrong
   message. Added explicit branches for "doesn't support document(s)"
   and "doesn't support image(s)" before the size check; narrowed the
   size matcher to size-specific markers.

4. User-facing copy revised.
   Dropped brand names ("Claude or Nova"), non-actionable UI suggestions
   ("remove the attachment"), specific UI affordance references ("gear
   icon next to the message input"), and the Spreadsheet Analysis hint
   (not guaranteed enabled across deployments). Tests now assert these
   things are NOT present so regressions fail loudly.
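
The branch ordering from item 3 can be sketched roughly as follows. This is not the body of `_format_force_stop_message`: the function name, the returned labels, and the `SIZE_MARKERS` tuple are assumptions made for illustration (the markers are borrowed from the later #403 commit message).

```python
# Assumed size-specific markers; the real matcher may use different wording.
SIZE_MARKERS = ("too large", "exceeds", "document size")


def classify_validation_error(raw: str) -> str:
    """Return a category label for a raw error string (hypothetical labels)."""
    text = raw.lower()
    if "validationexception" not in text:
        return "unclassified"
    # Unsupported-modality checks run first: their text also contains "document".
    if "doesn't support document" in text:
        return "unsupported_document"
    if "doesn't support image" in text:
        return "unsupported_image"
    # The size branch fires only on size-specific wording, never on the bare
    # word "document".
    if any(marker in text for marker in SIZE_MARKERS):
        return "document_too_large"
    return "unclassified"
```

Checking the more specific phrases before the broader one is what prevents the unsupported-modality error from being reported as a size overflow.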

Adds:
- agents/main_agent/session/persistence.py — single persist helper
- tests covering the helper, the classifier branches, and the persist
  contract end-to-end
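
For readers who have not opened persistence.py, a helper of this kind might look roughly like the sketch below. It is an assumption-laden illustration: the real `create_message` signature on AgentCoreMemorySessionManager is not shown in this message, so a single-argument call is assumed here, and the return value is invented for the example.

```python
import logging
from typing import Any, Iterable

logger = logging.getLogger(__name__)


def persist_synthetic_messages(session_manager: Any, messages: Iterable[Any]) -> int:
    """Write each message via session_manager.create_message; return the count written.

    Assumes create_message takes one positional argument (hypothetical).
    """
    create = getattr(session_manager, "create_message", None)
    if not callable(create):
        # Log loudly rather than skipping in silence, which is how the old
        # always-false hasattr guard behaved.
        logger.error(
            "create_message unavailable on %s; synthetic messages NOT persisted",
            type(session_manager).__name__,
        )
        return 0
    written = 0
    for message in messages:
        create(message)
        written += 1
    return written
```

Streaming callers would pass assistant-only messages, since the user turn is already written by the MessageAddedEvent hook.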

All 161 streaming + persistence tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(streaming): collapse redundant AGENT_ERROR persist branch

After merging develop's #388 (stop double-wrapping classified force_stop
errors), conv_error_event.message holds the un-wrapped friendly text for
classified AGENT_ERROR cases — same string the content_block_delta yields
to the live SSE stream. Collapsing the branch that picked between
error_message and conv_error_event.message based on error_code is
therefore a no-op for the classified path and a fix for the unclassified
path (the live "Agent force-stopped: …" wrapped template now also gets
persisted instead of the bare reason, keeping live and refresh-hydrated
views in sync).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
… outer except (#390)

Follow-up cleanup after #388 and #389 landed the persistence + double-wrap
fixes. Three small architectural smells in the error-handling layer:

1. Dead classifier in stream_coordinator's outer except.
   #389 added _format_force_stop_message to the path 3 except handler on
   the theory that Bedrock ValidationException (gpt-oss-120b + document)
   bypasses stream_processor's force_stop branch via a GeneratorExit and
   escapes here. The trace path in test_force_stop_persistence.py:217-235
   shows it actually doesn't — process_agent_stream's own outer except
   catches stream_async exceptions first and yields STREAM_ERROR events,
   which the in-loop handler picks up. Path 3 only fires for failures in
   the coordinator's own loop body (interrupt extraction, artifact
   lookup, metadata calc), which don't carry Bedrock-y patterns the
   classifier could match. Removed the classifier branch and dropped the
   now-unused import.

2. Tightened comment on the RuntimeError vs Exception split.
   The split in stream_processor's outer except is load-bearing — the
   "generator"/"async" matched branch silently swallows async-generator
   state errors (e.g. "asynchronous generator is already running",
   "generator ignored GeneratorExit") that shouldn't surface as
   user-visible error events. Normal shutdown goes through the
   GeneratorExit handler above. Kept the structure, made the
   load-bearing part obvious in the comment.

3. Canonical-reference pointer in persistence.py docstring.
   The "user turn already persisted by MessageAddedEvent hook" invariant
   is documented at the persist call sites and in the helper's messages
   argument docstring. Added a top-level pointer so future readers know
   where the single source of truth lives.
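
The load-bearing split from item 2 can be pictured with a small predicate. This is a sketch, not the code in stream_processor: `should_swallow` and `SWALLOWED_MARKERS` are hypothetical names, and the case-insensitive substring match is an assumption.

```python
# Substrings that identify async-generator state errors (assumed from the
# "generator"/"async" branch described above).
SWALLOWED_MARKERS = ("generator", "async")


def should_swallow(exc: BaseException) -> bool:
    """True when exc is a RuntimeError describing async-generator state."""
    if not isinstance(exc, RuntimeError):
        # Any other exception type should still surface as an error event.
        return False
    text = str(exc).lower()
    return any(marker in text for marker in SWALLOWED_MARKERS)
```

With a predicate like this, "asynchronous generator is already running" is dropped quietly, while an unrelated RuntimeError or a plain Exception still reaches the user-visible path.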

Tests: 161 streaming/persistence + 26 chat tests pass. No behavior
change for any covered path.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…nned (#391)

* feat(devcontainer): add reproducible dev container

Adds a single-stage development image with every toolchain needed to build,
test, lint, deploy, and end-to-end-test every stack in the monorepo:

  • Backend  — Python 3.13 (managed by uv 0.7.12) + pytest, ruff, mypy, black
  • Frontend — Node.js 22.22.3 LTS + npm 11.2.0 + Angular CLI (project-local)
  • E2E      — Playwright 1.59.x chromium runtime libs + xvfb + fonts
  • Infra    — AWS CDK 2.1120.0 (matches infrastructure/package.json)
  • Cloud    — AWS CLI v2.34.40 (sha256 + PGP) + Docker CLI 29.4.3 (static)

Reproducibility posture (matches backend/Dockerfile.* conventions):

  • Base image pinned by multi-arch sha256 OCI image-index digest
    (ubuntu:24.04@sha256:c4a8d5503dfb…)
  • Every artifact downloaded from the network is verified against a sha256
    embedded as a build ARG, OR via PGP signature (AWS CLI installer)
  • uv copied from ghcr.io/astral-sh/uv:0.7.12 by sha256 — matches
    backend/Dockerfile.app-api and backend/Dockerfile.inference-api
  • Multi-arch build: TARGETARCH selects amd64 vs arm64 SHAs

Also includes:
  • .devcontainer/devcontainer.json — VS Code Dev Containers config
  • .devcontainer/aws-cli-public-key.gpg — AWS CLI Team PGP public key
    (Key ID A6310ACC4672475C, valid until 2026-07-07)
  • .devcontainer/README.md — usage, upgrades, Docker-in-Docker caveats

Verification of the build was not possible in the authoring environment
(no docker/podman available); next step is a docker buildx build for both
linux/amd64 and linux/arm64 followed by the verification commands listed
in .devcontainer/README.md.

* docs(devcontainer): document docker GID quirk in steering, drop devcontainer.json

The Dockerfile bakes its docker group at GID 999 (Debian/Ubuntu default).
WSL2 with Docker Desktop uses 1001, which made docker-in-docker fail with
'permission denied' on /var/run/docker.sock the first time we tested it.

Updates .kiro/steering/dev-environment.md to:

  • Reference the new agentcore-devcontainer image and .devcontainer/Dockerfile
    instead of the previous personal docker-compose-based devcontainer
  • Document the docker GID gotcha with a host-by-host action table
  • Show the canonical 'docker run' line that auto-resolves the host's
    docker GID via "--group-add $(getent group docker | cut -d: -f3)"
  • Spell out the workspace-path map (WSL host vs nspawn vs container)
    so agents pass the right path to '-v' bind-mounts
  • Switch to inclusion: always so every agent in the repo always knows
    which environment to execute commands in

Updates .devcontainer/README.md the same way for human readers — adds a
'The Docker GID Gotcha' section, swaps the build/run examples for ones
that auto-resolve DOCKER_GID, drops the VS Code Dev Containers section.

Removes .devcontainer/devcontainer.json. The team isn't on VS Code, and
without a consumer the file was dead weight.

* docs(steering): switch dev-environment.md to manual inclusion

Not every contributor uses the dev container — some run toolchains
directly on their host. Always-include would push container-only
execution rules onto sessions where they don't apply. Manual inclusion
lets contributors who do use the container reference the doc when they
need it without affecting everyone else.

* feat(devcontainer): add docker buildx CLI plugin (v0.30.1)

Without the buildx plugin, Docker 23+ refuses to build any Dockerfile
that uses BuildKit-only syntax — and every project Dockerfile in this
repo uses 'RUN --mount=type=cache,target=/root/.cache/uv' for fast
uv-cached dependency installs.

Symptom before this change, when running scripts/stack-app-api/build.sh
inside the dev container:

    Step 7/25 : RUN --mount=type=cache,target=/root/.cache/uv ...
    the --mount option requires BuildKit. Refer to ...

And with DOCKER_BUILDKIT=1:

    ERROR: BuildKit is enabled but the buildx component is missing or broken.

The fix installs the upstream docker/buildx static binary as a CLI plugin
under /usr/libexec/docker/cli-plugins/docker-buildx, which 'docker build'
auto-discovers and uses on Docker 23+.

Pinned to v0.30.1 with sha256s for both linux-amd64 and linux-arm64,
sourced directly from the docker/buildx GitHub release. Same posture as
every other downloaded artifact in this Dockerfile.

Verified by running scripts/stack-app-api/build.sh end-to-end inside the
rebuilt dev container — 1m8s build, produced devtest-app-api:latest at
797 MB, image starts and imports fastapi/uvicorn/strands cleanly with
APP_VERSION=1.0.0-beta.27 baked in.

---------

Co-authored-by: Kiro <kiro@boisestate.ai>
Adds a manual GitHub Actions workflow and script to tear down all
infrastructure. Stacks are destroyed in parallel (Phase 1) with
InfrastructureStack destroyed last (Phase 2) since all others depend on it.

Safety: requires typing DESTROY to confirm, uses environment-scoped
credentials, and is manual-trigger only (workflow_dispatch).
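
The two-phase ordering can be illustrated with a short Python sketch. The actual workflow and script are not shown here and are likely not Python; `destroy_all`, the `destroy` callback, and every stack name other than InfrastructureStack are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def destroy_all(
    dependents: Sequence[str],
    base: str,
    destroy: Callable[[str], None],
    confirmation: str,
) -> List[str]:
    """Destroy dependents in parallel, then the base stack; return names in input order."""
    if confirmation != "DESTROY":
        raise ValueError("type DESTROY to confirm teardown")

    def destroy_and_report(name: str) -> str:
        destroy(name)
        return name

    destroyed: List[str] = []
    # Phase 1: dependents run concurrently. Iterating the map re-raises any
    # failure, so Phase 2 is skipped if a dependent could not be destroyed.
    with ThreadPoolExecutor(max_workers=max(1, len(dependents))) as pool:
        destroyed.extend(pool.map(destroy_and_report, dependents))
    # Phase 2: the base stack goes last because the others depend on it.
    destroy(base)
    destroyed.append(base)
    return destroyed
```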

Co-authored-by: Colin <colin@boisestate.edu>
* feat(admin): curated model catalog with provider logos

Adds a curated model catalog landing page so admins can one-click add
fully-configured Bedrock models (Claude Haiku/Sonnet/Opus 4.x) with
pricing, modalities, and per-param specs already filled in. The "Add"
flow opens a role-picker dialog for per-deployment role IDs before
POSTing; "Preview & customize" hands the template to the model form via
a prefill service. The list-page "Add model" CTA now routes through the
catalog. Each card carries a light/dark provider logo (Anthropic,
Amazon, Meta, OpenAI) for brand recognition; OpenAI/Gemini tabs render
a "Coming soon" empty state until curated entries land. Documents the
@angular/cdk/dialog conventions used here in the frontend CLAUDE.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(provider-logos): add dark and light SVG logos for OpenAI

* chore(model-catalog): sync OpenAI logo path to openai folder

The provider logo folder was renamed from open-aI to openai; update the
PROVIDER_LOGO_DIR mapping so the OpenAI tile resolves once curated
OpenAI entries land.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(model-catalog): finalize Opus 4.7 entry with real ID, pricing, params

Replaces the placeholder inference-profile ID with the real
`global.anthropic.claude-opus-4-7` alias and corrects pricing to the
actual $5/$25 per 1M tokens (cache $6.25 write / $0.5 read). Narrows
the supported-params surface to `max_tokens` + `effort` — Opus 4.7
exposes effort control instead of explicit thinking/temperature/top_*
knobs. Drops the now-obsolete TODO comment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…te (#394)

Three threads from the same session, all rooted in the Sonnet 4.6
catalog entry not adding cleanly:

1. SupportedParams._check_thinking_invariants rejected float budgets,
   but DynamoDB roundtrips ints through Decimal -> float, so any
   stored model with a `thinking.default` failed model_validate on
   read. The list endpoint silently skipped invalid rows while the
   create endpoint's GSI key check still found them, producing
   "already exists" on POST + an invisible row in the list.
   Validator now accepts whole-number floats (coerces to int) and
   applies the same tolerance to the max_tokens comparison.

2. The curated Sonnet 4.6 template had thinking.default == max_tokens
   default (both 8192), which failed the "budget < max_tokens"
   invariant on create. Dropped thinking.default to 4096 to match
   Haiku 4.5. Also bumped the inference profile to claude-sonnet-4-6
   and dropped temperature default to 0.7.

3. Manage-models page had no loading state and used native confirm()
   for delete. Added a spinner + error/retry block mirroring the
   connector-list pattern, and a new DeleteModelDialogComponent
   following the AddCuratedModelDialog token convention (alertdialog,
   red destructive action, Escape/backdrop/Cancel all converge).
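
A minimal sketch of the tolerance described in items 1 and 2, written as plain functions rather than the actual Pydantic validator. `coerce_whole_number` and `check_thinking_budget` are hypothetical names; only the rules (whole-number floats accepted, budget strictly below max_tokens) come from the text above.

```python
from typing import Union

Number = Union[int, float]


def coerce_whole_number(value: Number, field: str) -> int:
    """Accept ints and whole-number floats (e.g. 4096.0); reject everything else."""
    if isinstance(value, bool):
        raise ValueError(f"{field} must be a number, not a bool")
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer():
        # A stored int that comes back as a float is coerced, not rejected.
        return int(value)
    raise ValueError(f"{field} must be a whole number, got {value!r}")


def check_thinking_budget(thinking_default: Number, max_tokens_default: Number) -> int:
    """Return the budget as an int, enforcing budget < max_tokens."""
    budget = coerce_whole_number(thinking_default, "thinking.default")
    ceiling = coerce_whole_number(max_tokens_default, "max_tokens.default")
    if budget >= ceiling:
        raise ValueError("thinking budget must be below max_tokens")
    return budget
```

Under these rules the original Sonnet 4.6 template (8192 against 8192) is rejected, while the revised 4096 budget passes whether it arrives as `4096` or `4096.0`.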

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…395)

Match the redesigned list-page token set on the model edit/create form
(rounded-2xl, text-2xl/8 h1, ring-2 focus, dark:bg-gray-800 inputs) and
drop the heavy section cards in favor of flat sections divided by
border-t — same shape as the redesigned tool-form. Adds an isLoading
signal so the form shows a spinner card during the edit-mode fetch
instead of flashing an empty form.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…et downloads (#397)

Adds 25 MB hard-fail and 10 MB soft-warning thresholds (env-tunable via
ANALYZE_MAX_FILE_SIZE_BYTES / ANALYZE_WARN_FILE_SIZE_BYTES). The check
runs before _download_file using the size_bytes already on file_info, so
oversize files never hit S3 GetObject or base64. A logger.warning fires
at module load when the thresholds are misconfigured (warn >= max).
Soft warning is attached to both success and error responses for files
in the 10-25 MB range. Docstring updated with the new safety limit.
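
A rough sketch of the pre-download check, under stated assumptions: `check_file_size`, the message wording, the binary definition of MB (1024 * 1024), and the strict `>` comparisons are all illustrative choices; only the environment variable names and the 25 MB / 10 MB defaults come from this message.

```python
import logging
import os
from typing import Optional, Tuple

logger = logging.getLogger(__name__)

MB = 1024 * 1024  # assumption: binary megabytes
MAX_FILE_SIZE_BYTES = int(os.environ.get("ANALYZE_MAX_FILE_SIZE_BYTES", 25 * MB))
WARN_FILE_SIZE_BYTES = int(os.environ.get("ANALYZE_WARN_FILE_SIZE_BYTES", 10 * MB))

if WARN_FILE_SIZE_BYTES >= MAX_FILE_SIZE_BYTES:
    # Misconfiguration is reported once, at module load.
    logger.warning("ANALYZE warn threshold is not below the max threshold")


def check_file_size(
    size_bytes: int,
    max_bytes: int = MAX_FILE_SIZE_BYTES,
    warn_bytes: int = WARN_FILE_SIZE_BYTES,
) -> Tuple[bool, Optional[str]]:
    """Return (allowed, note). Runs on metadata only, before any download."""
    if size_bytes > max_bytes:
        return False, f"File is {size_bytes} bytes; the limit is {max_bytes} bytes."
    if size_bytes > warn_bytes:
        return True, f"File is {size_bytes} bytes; analysis may be slow."
    return True, None
```

A caller would run this against `file_info`'s `size_bytes` and return early on `allowed == False`, so the S3 GetObject call is never made for oversize files.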

Closes #258 (items 1+2). Streaming (item 3) tracked separately.
…d' (#398)

The denominator in the selection counter was always 20 (MAX_SELECTION)
regardless of how many conversations the user has. A prior fix introduced
selectableCount to cap it at the loaded session count, but with lazy
loading we never know the true total — the number jumps as the user
loads more, making it misleading.

Removes the denominator entirely. The counter now reads '2 items selected'
instead of '2 of 10 selected'. Also removes the now-unused selectableCount
computed signal.

Closes #163
…size error (#403)

* fix(file-upload): fix duplicate document name error misclassified as size error (#400)

- stream_processor.py: narrow size-limit classifier to require explicit
  size markers (too large, exceeds, document size); add specific
  'duplicate document name' branch above it so the correct message is
  shown instead of the misleading 'file too large' message

- errors.py: add override in build_conversational_error_event for the
  raw-exception path (Path B) so the clean actionable message is shown
  regardless of which error code path the exception takes

- turn_based_session_manager.py: strip document blocks from history
  during compaction (same treatment as images), preventing the
  underlying duplicate-name condition from occurring when a user returns
  to the same session and re-attaches a same-named file

- prompt_builder.py: deduplicate document names within a single turn
  using a counter suffix (_2, _3) as belt-and-suspenders

- routes.py: deduplicate files across the files + file_upload_ids merge
  paths before partitioning
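
The counter-suffix deduplication mentioned for prompt_builder.py could look something like this sketch. `dedupe_document_names` is a hypothetical name, and the collision handling for names that already end in a suffix is an assumption beyond what the message states.

```python
from typing import List, Set


def dedupe_document_names(names: List[str]) -> List[str]:
    """Make names unique within one turn by appending _2, _3, ... to repeats."""
    used: Set[str] = set()
    result: List[str] = []
    for name in names:
        candidate = name
        suffix = 2
        # Keep bumping the suffix until the candidate is unused, which also
        # covers inputs that already contain a suffixed name.
        while candidate in used:
            candidate = f"{name}_{suffix}"
            suffix += 1
        used.add(candidate)
        result.append(candidate)
    return result
```

For example, `dedupe_document_names(["report", "report", "notes"])` yields `["report", "report_2", "notes"]`.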

Fixes #400

* fix(file-upload): make document byte stripping unconditional, not compaction-gated

Extract document block byte-stripping from _truncate_tool_contents into
a dedicated _strip_document_bytes method called unconditionally from
initialize(), before the compaction-enabled gate.

_truncate_tool_contents is Stage 1 compaction and only runs when
AGENTCORE_MEMORY_COMPACTION_ENABLED=true. Stripping document bytes is a
correctness fix (prevents duplicate document name ValidationException),
not a compaction optimization — it should run regardless of whether
compaction is configured.
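
The reordering can be pictured as below. This is a sketch only: the message shape (a list of dicts with `content` blocks keyed by `text` or `document`), the text placeholder, and the `initialize_history` function are assumptions, and the real `_strip_document_bytes` may keep more of the block than shown.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def strip_document_bytes(messages: List[Message]) -> List[Message]:
    """Return a copy of the history with document blocks replaced by a text placeholder."""
    stripped: List[Message] = []
    for message in messages:
        blocks = []
        for block in message.get("content", []):
            if "document" in block:
                name = block["document"].get("name", "unnamed")
                # Assumed placeholder format; the original block is not mutated.
                blocks.append({"text": f"[document: {name}]"})
            else:
                blocks.append(block)
        stripped.append({**message, "content": blocks})
    return stripped


def initialize_history(
    messages: List[Message],
    compaction_enabled: bool,
    truncate_tool_contents: Callable[[List[Message]], List[Message]],
) -> List[Message]:
    # Correctness step: always runs, independent of the compaction setting.
    messages = strip_document_bytes(messages)
    # Optimization step: gated on the compaction flag.
    if compaction_enabled:
        messages = truncate_tool_contents(messages)
    return messages
```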
…on opt-in

Adds a "Conversation Modes" feature: admins manage a catalog of custom
system prompts ("Guided Learning", "Concise", "Caveman", etc.); users
opt in per conversation via the model-settings panel. The active mode
is appended to the base system prompt at invocation time.

Backend
- New apis.shared.system_prompts module: models / repository / service
  layered to mirror the user_menu_links convention. Snake_case wire
  format end-to-end. Optimistic concurrency on update so concurrent
  delete + edit can't resurrect a deleted prompt.
- New admin CRUD routes at /admin/system-prompts (full prompt_text
  visibility) and a user read endpoint at /system-prompts (name +
  description only - prompt_text is server-side only).
- New DynamoDB table provisioned by InfrastructureStack with PK/SK
  schema, point-in-time recovery, AWS-managed encryption. App API
  granted full CRUD; Inference API granted GetItem only.
- Inference path resolves the active prompt via a new
  system_prompt_resolver module. Gating skips on resume, continuation,
  preview, and assistant-attached turns (assistants are KB-grounded
  and a "mode" prompt could contradict their instructions).
- Selection precedence: request body first (so first-turn-of-new-session
  works without a metadata round-trip), session preferences as fallback
  (resume / refresh / new-device). Resolver mirrors the request-supplied
  id back onto session preferences via a focused
  set_selected_prompt_id helper that does a targeted SET (no SK
  rotation, no messageCount bump).
- UpdateSessionMetadataRequest uses null-as-clear semantics on
  selectedPromptId via Pydantic's model_fields_set, replacing the
  short-lived clearSelectedPrompt flag.
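
To make the gating and precedence rules concrete, here is a hedged sketch of what a resolver along these lines might do. It is not the system_prompt_resolver module: `TurnContext`, `Prompt`, `resolve_system_prompt`, the exact composition format after the `## Active Mode:` header, and the skipping of disabled prompts are all assumptions, and the persistence side effect is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Prompt:
    name: str
    prompt_text: str
    status: str = "enabled"


@dataclass
class TurnContext:
    is_resume: bool = False
    is_continuation: bool = False
    is_preview: bool = False
    has_assistant: bool = False
    request_prompt_id: Optional[str] = None   # id sent in the request body
    session_prompt_id: Optional[str] = None   # id stored in session preferences


def resolve_system_prompt(
    base_prompt: str,
    ctx: TurnContext,
    lookup: Callable[[str], Optional[Prompt]],
) -> str:
    """Return base_prompt, with the active mode appended when one applies."""
    # Gating: these turn types never receive a mode overlay.
    if ctx.is_resume or ctx.is_continuation or ctx.is_preview or ctx.has_assistant:
        return base_prompt
    # Precedence: request body first, session preferences as fallback.
    prompt_id = ctx.request_prompt_id or ctx.session_prompt_id
    if not prompt_id:
        return base_prompt
    try:
        prompt = lookup(prompt_id)
    except Exception:
        # A lookup failure should not break the turn; fall back to the base prompt.
        return base_prompt
    if prompt is None or prompt.status != "enabled":
        return base_prompt
    return f"{base_prompt}\n\n## Active Mode: {prompt.name}\n\n{prompt.prompt_text}"
```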

Frontend
- Lazy-loaded SystemPromptsService driven by BFF auth state; loader
  body wrapped in untracked() so the loader's signal traffic doesn't
  retrigger the auth effect (fixes the load loop seen on side-menu open).
- Active prompt is bound to a session id locally so cross-session
  navigation resets cleanly and home-page selections survive the
  transition to a brand-new session on first submit.
- Centralised setActivePrompt(sessionId, id|null) for both the
  settings panel and the chat-input chip; ToastService surfaces failed
  clears instead of swallowing them in the console.
- New admin pages for managing the catalog (list + form), and a
  per-conversation chip + radio group in the settings panel.
- Active prompt id is forwarded on every non-assistant submit so the
  inference path doesn't depend on metadata write timing.

Tests
- 32 system_prompts tests (model, repo, service, admin + user routes)
  covering snake_case wire format, Literal status validation, and
  TOCTOU-safe update.
- 5 sessions update tests including null-clear, omit-leaves-unchanged,
  and disabled-prompt rejection.
- 15 resolver tests covering gating, composition format, request-id
  precedence, persistence side-effect, and exception swallowing.
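
The null-clear and omit-leaves-unchanged cases exercised by the sessions update tests can be illustrated without Pydantic. In this sketch, key presence in a plain dict stands in for membership in `model_fields_set`; `apply_selected_prompt_update` is a hypothetical name, and validation of the id against the catalog is left out.

```python
from typing import Any, Dict


def apply_selected_prompt_update(
    preferences: Dict[str, Any],
    payload: Dict[str, Any],
) -> Dict[str, Any]:
    """Return updated preferences using null-as-clear semantics for selectedPromptId."""
    updated = dict(preferences)
    if "selectedPromptId" not in payload:
        # Field omitted: leave the stored selection unchanged.
        return updated
    value = payload["selectedPromptId"]
    if value is None:
        # Explicit null: clear the stored selection.
        updated.pop("selectedPromptId", None)
    else:
        updated["selectedPromptId"] = value
    return updated
```

Distinguishing "omitted" from "explicit null" is what lets a single field replace a separate clear flag.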
@DerrickF DerrickF requested a review from a team May 30, 2026 14:23