Conversation Modes - Admin Created System Prompts - Opt In#411
Open
DerrickF wants to merge 1246 commits into
…test skipping
- Enable compaction by default in CompactionConfig
- Increase protected_turns default from 2 to 3
- Add pytest marker to skip integration tests when AGENTCORE_MEMORY_ID is not set
- Fix import path for get_metadata_storage in cache savings tests from metadata_storage.get_metadata_storage to storage.get_metadata_storage
- Ensures integration tests only run in appropriate environments with required AWS credentials
… and cleanup
- Mock AgentCoreMemorySessionManager.initialize() to simulate SDK behavior
- Add _mock_sdk_initialize shim that loads messages and validates agent uniqueness
- Track active patches in fixture scope for proper cleanup on teardown
- Update fixture docstring to document initialize() mocking and message control
- Convert fixture to generator with yield to enable patch cleanup
- Allow tests to control loaded messages via mgr.read_agent and mgr.list_messages
…nsolidation, Trivy supply chain fix (#137)
⚠️ BREAKING CHANGE: Authentication replaced with AWS Cognito. The legacy generic OIDC implementation has been removed with no backward compatibility layer. Existing deployments must re-bootstrap.
Cognito First-Boot Authentication:
- Cognito User Pool, App Client, and Domain provisioned in Infrastructure stack
- CognitoJWTValidator replaces GenericOIDCJWTValidator
- New system/ module for first-boot setup, Cognito user/group management
- New cognito_idp_service for federated identity provider CRUD via Cognito IdP APIs
- First-boot page with admin account creation (race-condition-safe DynamoDB writes)
- Frontend auth flow rewritten for Cognito OAuth 2.0 + PKCE
- Runtime-provisioner and runtime-updater Lambda functions removed (2,800+ lines)
- Backend OIDC service, token exchange, and discovery endpoints removed (1,318 lines)
- 2,057 lines of new Cognito test coverage (IdP service, JWT validator, first-boot, system)
RBAC Consolidation:
- Single require_app_roles dependency replaces 6 role-checking functions/decorators
- User roles enriched from stored DynamoDB profile during token processing
- Profile cache invalidation on sync for immediate role updates
- JSON array parsing for custom:roles claim (Entra ID compatibility)
- jwt_role_mappings updates allowed on system_admin role
CORS Unification:
- buildCorsOrigins() shared helper across all 6 CDK stacks
- S3 CORS made conditional, ExposedHeaders→ExposeHeaders fix
- Python APIs read CORS_ORIGINS env var (replaces allow_origins=['*'])
Security:
- Trivy action upgraded v0.28.0→v0.35.0; old SHA was compromised in March 2026 supply chain attack (GHSA-69fq-xp46-6x23)
CI/CD:
- CDK_DOMAIN_NAME and CDK_CORS_ORIGINS added to all workflow jobs
- App API synth-cdk actually skipped on PRs (guard was missing despite beta.20 docs)
- SSM StringParameter creation guarded against empty values
Bootstrap:
- seed_bootstrap_data.py sole owner of RBAC role seeding (removed from app startup)
- system_admin role seeded with jwt_role_mappings=['system_admin']
- Additive JWT mapping seeding for existing deployments
Documentation:
- 54,665 lines of outdated specs and AI artifacts purged (121 files)
Dependencies:
- Python: fastapi 0.135.3, uvicorn 0.44.0, boto3 1.42.83, strands-agents 1.34.1, bedrock-agentcore 1.6.0, google-genai 1.70.0, ruff 0.15.9, mypy 1.20.0
- Frontend: Angular 21.2.7, katex 0.16.45, mermaid 11.14.0, Analog.js alpha.26
- Infrastructure: aws-cdk-lib 2.248.0, aws-cdk 2.1117.0, ts-jest 29.4.9
….py (#139)
Create agents/main_agent/config/constants.py with EnvVars, Defaults, and Prefixes classes. Update all 13 modules to import from the centralized constants instead of using inline os.getenv() with hardcoded strings. This eliminates scattered magic strings and provides a single reference for all configuration.
Zero behavior change: all values are identical. 543/543 tests passing.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
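A minimal sketch of the centralized-constants pattern this commit describes, not the actual contents of constants.py. The class names EnvVars, Defaults, and Prefixes and the AGENTCORE_MEMORY_ID variable come from the commit messages; AGENT_MODEL_ID, the default values, the prefix, and the get_model_id() helper are hypothetical and exist only for illustration.

```python
import os
from typing import Mapping


class EnvVars:
    """Names of environment variables, defined once instead of inline strings."""
    MEMORY_ID = "AGENTCORE_MEMORY_ID"   # named in the test-skipping commit
    MODEL_ID = "AGENT_MODEL_ID"         # hypothetical name


class Defaults:
    """Fallback values used when a variable is unset (hypothetical values)."""
    MODEL_ID = "example-model"


class Prefixes:
    """Shared string prefixes (hypothetical value)."""
    SESSION = "session-"


def get_model_id(environ: Mapping[str, str] = os.environ) -> str:
    # Callers reference EnvVars/Defaults rather than repeating
    # os.getenv("AGENT_MODEL_ID", "example-model") in every module.
    return environ.get(EnvVars.MODEL_ID, Defaults.MODEL_ID)
```

Passing the environment as a parameter keeps the helper easy to test without patching os.environ.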
* refactor: centralize env vars and magic strings into config/constants.py
Create agents/main_agent/config/constants.py with EnvVars, Defaults, and Prefixes classes. Update all 13 modules to import from the centralized constants instead of using inline os.getenv() with hardcoded strings. This eliminates scattered magic strings and provides a single reference for all configuration.
Zero behavior change: all values are identical. 543/543 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: extract BaseAgent ABC and ChatAgent from MainAgent
Split MainAgent into a three-tier hierarchy:
- BaseAgent (ABC): shared init for model config, tools, session, streaming
- ChatAgent(BaseAgent): Strands Agent creation and text streaming
- MainAgent(ChatAgent): backward-compatible alias (pass-through)
All existing callers continue to import and use MainAgent unchanged. The _build_filtered_tools() helper is extracted from _create_agent() for reuse by future agent types (SkillAgent, VoiceAgent).
543/543 tests passing, zero behavior change.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduce agent_types.py with a pluggable registry pattern:
- create_agent(agent_type, **kwargs) → BaseAgent subclass
- register_agent_type(name, cls) for dynamic registration
- ChatAgent registered as "chat" by default
Future agent types (skill, voice) will register themselves here. Existing code is unchanged: MainAgent still works as before.
552/552 tests passing (9 new factory tests).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
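The registry-plus-factory pattern named above could look roughly like the following. This is a sketch under assumptions, not the code in agent_types.py: the function names create_agent and register_agent_type and the "chat" key come from the commit message, while the stand-in BaseAgent and ChatAgent bodies, the describe() method, and the error handling are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class BaseAgent(ABC):
    """Stand-in for the real BaseAgent; only stores its keyword arguments."""

    def __init__(self, **kwargs):
        self.config = kwargs

    @abstractmethod
    def describe(self) -> str:
        ...


class ChatAgent(BaseAgent):
    def describe(self) -> str:
        return "chat"


# Module-level registry mapping a type name to an agent class.
_AGENT_TYPES: Dict[str, Type[BaseAgent]] = {}


def register_agent_type(name: str, cls: Type[BaseAgent]) -> None:
    if not issubclass(cls, BaseAgent):
        raise TypeError(f"{cls!r} is not a BaseAgent subclass")
    _AGENT_TYPES[name] = cls


def create_agent(agent_type: str = "chat", **kwargs) -> BaseAgent:
    try:
        cls = _AGENT_TYPES[agent_type]
    except KeyError:
        raise ValueError(f"Unknown agent type: {agent_type!r}") from None
    return cls(**kwargs)


# The default registration described in the commit message.
register_agent_type("chat", ChatAgent)
```

With this shape, a later agent type only needs to call register_agent_type() at import time; callers of create_agent() do not change.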
Implement three-level skill architecture adapted from sample-strands-agent:
- Level 1: Lightweight skill catalog injected into system prompt
- Level 2: SKILL.md instructions loaded on-demand via skill_dispatcher
- Level 3: Tool execution via skill_executor
New modules:
- skills/skill_registry.py: Discovers SKILL.md files, binds tools, serves catalog
- skills/skill_tools.py: skill_dispatcher + skill_executor Strands @tool functions
- skills/decorators.py: @Skill() decorator and register_skill() for tool tagging
- skill_agent.py: SkillAgent(ChatAgent) with progressive disclosure override
- skills/definitions/web-search/SKILL.md: Example skill definition
SkillAgent registered as "skill" in agent_types factory. Existing behavior completely unchanged: SkillAgent is additive only.
590/590 tests passing (38 new skill tests).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…143)
Implement VoiceAgent(BaseAgent) for bidirectional voice using Nova Sonic 2:
- BidiNovaSonicModel with configurable voice, sample rate, and model
- Voice-text continuity via _load_text_history() from text session
- Separate agent_id ("voice") to prevent session state conflicts
- Voice-optimized system prompt with conversational guidelines
- PyAudio mock for server-side (browser uses Web Audio API)
- Conditional registration: only available with strands-agents[bidi]
Add voice-related constants to config/constants.py (EnvVars + Defaults). Register "voice" type in agent_types factory.
606/606 tests passing (16 new voice tests).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement three approval hook categories following the sample-strands-agent pattern, all using Strands BeforeToolCallEvent:
- EmailApprovalHook: Gates send_email, delete_emails, forward_email, etc.
- ExternalWriteApprovalHook: Gates create_pull_request, deploy, push_code, etc.
- DangerousToolApprovalHook: Gates delete_file, drop_table, execute_sql, etc.
Hooks set _approval_required/_approval_message on the tool_use dict for the streaming layer to surface to the client for user confirmation. All hooks registered in BaseAgent._create_hooks() and inherited by all agent types (ChatAgent, SkillAgent, VoiceAgent).
618/618 tests passing (12 new approval hook tests).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
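The gating logic described above can be illustrated without the Strands SDK. The sketch below assumes the hook ultimately receives a tool_use dict with a "name" key and marks it in place; the _approval_required and _approval_message keys and the tool names are taken from the commit message, while GATED_TOOLS, mark_for_approval(), and the message wording are hypothetical. The real hooks are registered against BeforeToolCallEvent, which is not modeled here.

```python
from typing import Dict, Set

# Tool names per category, limited to those listed in the commit message.
GATED_TOOLS: Dict[str, Set[str]] = {
    "email": {"send_email", "delete_emails", "forward_email"},
    "external_write": {"create_pull_request", "deploy", "push_code"},
    "dangerous": {"delete_file", "drop_table", "execute_sql"},
}


def mark_for_approval(tool_use: dict) -> dict:
    """Flag a tool call for user confirmation if its name is gated."""
    name = tool_use.get("name", "")
    for category, names in GATED_TOOLS.items():
        if name in names:
            tool_use["_approval_required"] = True
            tool_use["_approval_message"] = (
                f"Tool '{name}' ({category}) requires user confirmation."
            )
            break
    return tool_use
```

A streaming layer could then check for the _approval_required key before executing the call and surface _approval_message to the client.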
) * feat: add bidi dependency, WebSocket voice route, and test client Wire up the VoiceAgent for end-to-end testing: - Add strands-agents[bidi] optional dependency group to pyproject.toml - Fix BidiAgent/BidiNovaSonicModel import paths (strands.experimental.bidi) - Create voice_routes.py with WebSocket endpoint at /voice/stream - JWT auth from query params (trusted decode, same as invocations) - Bidirectional protocol: audio/text input, agent event streaming - Debug endpoints: GET /voice/sessions, DELETE /voice/sessions/{id} - Register voice router in inference API main.py - Add test_voice_client.py script for manual WebSocket testing 632/632 tests passing (14 new voice route tests). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: handle CancelledError in VoiceAgent.stop() during teardown The BidiAgent's Nova Sonic stream teardown can raise CancelledError when pending AWS SDK futures are cancelled during shutdown. This is expected behavior, not an error. - VoiceAgent.stop(): catch CancelledError and Exception from BidiAgent - voice_routes.py finally block: catch BaseException (CancelledError is a BaseException in Python 3.12, escaping except Exception) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: pass session_id and agent_id to list_messages in voice history AgentCoreMemorySessionManager.list_messages() requires session_id and agent_id positional args. Pass session_id=self.session_id and agent_id="default" to read the text chat agent's history for voice-text continuity. Use the SDK's limit param instead of post-slicing. Update tests to verify the correct call signature. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use BidiAgent.receive() for voice event streaming BidiAgent uses receive() as its event source, not stream_async(). 
Audio/text input is sent via send_audio()/send_text() separately, and receive() yields typed events (BidiAudioStreamEvent, BidiTranscriptStreamEvent, etc.) asynchronously. - VoiceAgent.stream_async(): iterate BidiAgent.receive(), yield event.as_dict() for JSON-serializable dicts - voice_routes._send_to_client(): simplified to handle dicts directly since stream_async now yields dicts, not strings Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Angular voice components for Nova Sonic bidirectional audio Frontend voice support with three-layer architecture: New services (frontend/ai.client/src/app/session/services/voice/): - pcm-utils.ts: Pure PCM encoding/decoding (Float32↔Int16↔base64) - AudioRecorderService: Mic capture via Web Audio API → 16kHz PCM chunks - AudioPlayerService: Gapless base64 PCM playback with interruption support - VoiceChatService: WebSocket orchestration + state machine (idle → connecting → listening → speaking) Modified components: - chat-input: Voice toggle button with animated state indicators (pulsing red = listening, bouncing green = speaking, spinner = connecting) - chat-input template: Live transcript overlay during voice mode - session.page.ts: Wire voice response completions to message list - MessageMapService: addVoiceMessage() for finalized voice transcripts TypeScript compiles cleanly (tsc --noEmit). Angular build requires Node 20.19+ (current machine has 20.18.1). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: convert SessionMessage to dict for BidiAgent and fix TS2774 Backend: _load_text_history() now calls .to_dict() on SessionMessage objects before passing to BidiAgent. Nova Sonic expects plain dicts with {"role": "...", "content": [...]}, not SessionMessage objects. Frontend: Fix TS2774 in AudioRecorderService — use typeof check instead of truthiness check for getUserMedia function detection. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use to_message() instead of to_dict() for BidiAgent history SessionMessage.to_dict() wraps the message in metadata: {"message": {"role": ..., "content": [...]}, "message_id": 0, ...} SessionMessage.to_message() returns the plain message dict: {"role": "user", "content": [...]} Nova Sonic's _get_message_history_events expects the plain format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use BidiAgent.send() and receive() APIs correctly BidiAgent has send(dict) and receive() — not send_audio()/send_text() or stream_async(). Align VoiceAgent methods with the actual SDK: - send_audio(): calls self._bidi_agent.send({"type": "bidi_audio_input", ...}) - send_text(): calls self._bidi_agent.send({"type": "bidi_text_input", ...}) - receive_events(): wraps self._bidi_agent.receive() with as_dict() conversion - stream_async(): now a no-op stub (voice uses receive_events() instead) Update voice_routes._send_to_client to call receive_events() not stream_async(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Implement feature X to enhance user experience and optimize performance * feat: add voice overlay component for voice interactions - Implemented VoiceOverlayComponent with HTML, CSS, and TypeScript files. - Added styles for visualizer orb and status badges using Tailwind CSS. - Integrated voice status management and session handling in the component. - Enhanced voice chat service to support transcript entries and reveal logic. - Updated session page to handle voice overlay closure and persist transcripts as messages. - Introduced configuration constants for voice processing parameters. 
* feat: enhance voice agent with real-time cost calculation and metadata handling
* fix: refine token usage handling and improve message processing in voice components
* fix: sanitize user-provided values in log statements to prevent log injection
Addresses CodeQL alert #567 (py/log-injection). All user-provided values (session_id, user_id, msg_type, enabled_tools) are now passed through _sanitize_log(), which strips newline and carriage return characters before they are interpolated into log messages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: update WebSocket voice streaming endpoint for AgentCore compatibility * fix: ensure config message is required for WebSocket voice stream authentication
…ig (#156) * feat: update WebSocket voice streaming endpoint for AgentCore compatibility * fix: ensure config message is required for WebSocket voice stream authentication * feat: add protocol configuration for HTTP support in InferenceApiStack
…attern (#159)
* fix: align voice WebSocket with reference architecture accept-first pattern
Rewrites voice_stream to match the sample-strands-agent-with-agentcore reference architecture:
- Accept WebSocket immediately (AgentCore validates auth at proxy layer)
- Extract params via helper functions: custom header → query param → config message
- Config message always read to supplement missing params in cloud mode
- /voice/stream as main route, /ws as alias for AgentCore Runtime
- Frontend uses /voice/stream for local dev, /ws for AgentCore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add missing try block in voice_stream causing IndentationError
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
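The parameter lookup order listed above (custom header, then query param, then config message) can be sketched as a small precedence helper. Only the ordering is taken from the commit message; resolve_param(), the x-custom- header naming scheme, and the use of plain dicts in place of the WebSocket's headers and query parameters are hypothetical.

```python
from typing import Mapping, Optional


def resolve_param(
    name: str,
    headers: Mapping[str, str],
    query: Mapping[str, str],
    config: Mapping[str, str],
) -> Optional[str]:
    """Return the first non-empty value: header, then query param, then config."""
    # Hypothetical header naming: session_id -> x-custom-session-id
    header_key = f"x-custom-{name.replace('_', '-')}"
    for source, key in ((headers, header_key), (query, name), (config, name)):
        value = source.get(key)
        if value:
            return value
    return None
```

Because the config message is consulted last, it only supplements values that the header and query string did not provide, which matches the "supplement missing params" wording in the commit.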
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* basic e2e testing (not hooked up to nightly) * get rid of warnings * add all home page tests. * settings and assistants page tests * Rebase e2e testing branch (#164) * test(session): update compaction config defaults and fix integration test skipping - Enable compaction by default in CompactionConfig - Increase protected_turns default from 2 to 3 - Add pytest marker to skip integration tests when AGENTCORE_MEMORY_ID is not set - Fix import path for get_metadata_storage in cache savings tests from metadata_storage.get_metadata_storage to storage.get_metadata_storage - Ensures integration tests only run in appropriate environments with required AWS credentials * test(session): enhance session manager fixture with initialize() mock and cleanup - Mock AgentCoreMemorySessionManager.initialize() to simulate SDK behavior - Add _mock_sdk_initialize shim that loads messages and validates agent uniqueness - Track active patches in fixture scope for proper cleanup on teardown - Update fixture docstring to document initialize() mocking and message control - Convert fixture to generator with yield to enable patch cleanup - Allow tests to control loaded messages via mgr.read_agent and mgr.list_messages * Release 1.0.0-beta.22: Cognito-native auth, CORS unification, RBAC consolidation, Trivy supply chain fix (#137)⚠️ BREAKING CHANGE: Authentication replaced with AWS Cognito. The legacy generic OIDC implementation has been removed with no backward compatibility layer. Existing deployments must re-bootstrap. 
Cognito First-Boot Authentication: - Cognito User Pool, App Client, and Domain provisioned in Infrastructure stack - CognitoJWTValidator replaces GenericOIDCJWTValidator - New system/ module for first-boot setup, Cognito user/group management - New cognito_idp_service for federated identity provider CRUD via Cognito IdP APIs - First-boot page with admin account creation (race-condition-safe DynamoDB writes) - Frontend auth flow rewritten for Cognito OAuth 2.0 + PKCE - Runtime-provisioner and runtime-updater Lambda functions removed (2,800+ lines) - Backend OIDC service, token exchange, and discovery endpoints removed (1,318 lines) - 2,057 lines of new Cognito test coverage (IdP service, JWT validator, first-boot, system) RBAC Consolidation: - Single require_app_roles dependency replaces 6 role-checking functions/decorators - User roles enriched from stored DynamoDB profile during token processing - Profile cache invalidation on sync for immediate role updates - JSON array parsing for custom:roles claim (Entra ID compatibility) - jwt_role_mappings updates allowed on system_admin role CORS Unification: - buildCorsOrigins() shared helper across all 6 CDK stacks - S3 CORS made conditional, ExposedHeaders→ExposeHeaders fix - Python APIs read CORS_ORIGINS env var (replaces allow_origins=['*']) Security: - Trivy action upgraded v0.28.0→v0.35.0 — old SHA was compromised in March 2026 supply chain attack (GHSA-69fq-xp46-6x23) CI/CD: - CDK_DOMAIN_NAME and CDK_CORS_ORIGINS added to all workflow jobs - App API synth-cdk actually skipped on PRs (guard was missing despite beta.20 docs) - SSM StringParameter creation guarded against empty values Bootstrap: - seed_bootstrap_data.py sole owner of RBAC role seeding (removed from app startup) - system_admin role seeded with jwt_role_mappings=['system_admin'] - Additive JWT mapping seeding for existing deployments Documentation: - 54,665 lines of outdated specs and AI artifacts purged (121 files) Dependencies: - Python: fastapi 
0.135.3, uvicorn 0.44.0, boto3 1.42.83, strands-agents 1.34.1, bedrock-agentcore 1.6.0, google-genai 1.70.0, ruff 0.15.9, mypy 1.20.0 - Frontend: Angular 21.2.7, katex 0.16.45, mermaid 11.14.0, Analog.js alpha.26 - Infrastructure: aws-cdk-lib 2.248.0, aws-cdk 2.1117.0, ts-jest 29.4.9 * refactor: centralize env vars and magic strings into config/constants.py (#139) Create agents/main_agent/config/constants.py with EnvVars, Defaults, and Prefixes classes. Update all 13 modules to import from the centralized constants instead of using inline os.getenv() with hardcoded strings. This eliminates scattered magic strings and provides a single reference for all configuration. Zero behavior change — all values are identical. 543/543 tests passing. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: extract BaseAgent ABC and ChatAgent from MainAgent (#140) * refactor: centralize env vars and magic strings into config/constants.py Create agents/main_agent/config/constants.py with EnvVars, Defaults, and Prefixes classes. Update all 13 modules to import from the centralized constants instead of using inline os.getenv() with hardcoded strings. This eliminates scattered magic strings and provides a single reference for all configuration. Zero behavior change — all values are identical. 543/543 tests passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: extract BaseAgent ABC and ChatAgent from MainAgent Split MainAgent into a three-tier hierarchy: - BaseAgent (ABC): shared init for model config, tools, session, streaming - ChatAgent(BaseAgent): Strands Agent creation and text streaming - MainAgent(ChatAgent): backward-compatible alias (pass-through) All existing callers continue to import and use MainAgent unchanged. The _build_filtered_tools() helper is extracted from _create_agent() for reuse by future agent types (SkillAgent, VoiceAgent). 543/543 tests passing — zero behavior change. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add agent type registry and create_agent() factory (#141) Introduce agent_types.py with a pluggable registry pattern: - create_agent(agent_type, **kwargs) → BaseAgent subclass - register_agent_type(name, cls) for dynamic registration - ChatAgent registered as "chat" by default Future agent types (skill, voice) will register themselves here. Existing code is unchanged — MainAgent still works as before. 552/552 tests passing (9 new factory tests). Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add progressive skill disclosure system with SkillAgent (#142) Implement three-level skill architecture adapted from sample-strands-agent: - Level 1: Lightweight skill catalog injected into system prompt - Level 2: SKILL.md instructions loaded on-demand via skill_dispatcher - Level 3: Tool execution via skill_executor New modules: - skills/skill_registry.py: Discovers SKILL.md files, binds tools, serves catalog - skills/skill_tools.py: skill_dispatcher + skill_executor Strands @tool functions - skills/decorators.py: @Skill() decorator and register_skill() for tool tagging - skill_agent.py: SkillAgent(ChatAgent) with progressive disclosure override - skills/definitions/web-search/SKILL.md: Example skill definition SkillAgent registered as "skill" in agent_types factory. Existing behavior completely unchanged — SkillAgent is additive only. 590/590 tests passing (38 new skill tests). 
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add VoiceAgent with BidiAgent for speech-to-speech interaction (#143) Implement VoiceAgent(BaseAgent) for bidirectional voice using Nova Sonic 2: - BidiNovaSonicModel with configurable voice, sample rate, and model - Voice-text continuity via _load_text_history() from text session - Separate agent_id ("voice") to prevent session state conflicts - Voice-optimized system prompt with conversational guidelines - PyAudio mock for server-side (browser uses Web Audio API) - Conditional registration — only available with strands-agents[bidi] Add voice-related constants to config/constants.py (EnvVars + Defaults). Register "voice" type in agent_types factory. 606/606 tests passing (16 new voice tests). Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add approval hooks for gating dangerous tool operations (#144) Implement three approval hook categories following the sample-strands-agent pattern, all using Strands BeforeToolCallEvent: - EmailApprovalHook: Gates send_email, delete_emails, forward_email, etc. - ExternalWriteApprovalHook: Gates create_pull_request, deploy, push_code, etc. - DangerousToolApprovalHook: Gates delete_file, drop_table, execute_sql, etc. Hooks set _approval_required/_approval_message on the tool_use dict for the streaming layer to surface to the client for user confirmation. All hooks registered in BaseAgent._create_hooks() — inherited by all agent types (ChatAgent, SkillAgent, VoiceAgent). 618/618 tests passing (12 new approval hook tests). 
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add WebSocket voice route and bidi dependency for VoiceAgent (#145) * feat: add bidi dependency, WebSocket voice route, and test client Wire up the VoiceAgent for end-to-end testing: - Add strands-agents[bidi] optional dependency group to pyproject.toml - Fix BidiAgent/BidiNovaSonicModel import paths (strands.experimental.bidi) - Create voice_routes.py with WebSocket endpoint at /voice/stream - JWT auth from query params (trusted decode, same as invocations) - Bidirectional protocol: audio/text input, agent event streaming - Debug endpoints: GET /voice/sessions, DELETE /voice/sessions/{id} - Register voice router in inference API main.py - Add test_voice_client.py script for manual WebSocket testing 632/632 tests passing (14 new voice route tests). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: handle CancelledError in VoiceAgent.stop() during teardown The BidiAgent's Nova Sonic stream teardown can raise CancelledError when pending AWS SDK futures are cancelled during shutdown. This is expected behavior, not an error. - VoiceAgent.stop(): catch CancelledError and Exception from BidiAgent - voice_routes.py finally block: catch BaseException (CancelledError is a BaseException in Python 3.12, escaping except Exception) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: pass session_id and agent_id to list_messages in voice history AgentCoreMemorySessionManager.list_messages() requires session_id and agent_id positional args. Pass session_id=self.session_id and agent_id="default" to read the text chat agent's history for voice-text continuity. Use the SDK's limit param instead of post-slicing. Update tests to verify the correct call signature. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use BidiAgent.receive() for voice event streaming BidiAgent uses receive() as its event source, not stream_async(). 
Audio/text input is sent via send_audio()/send_text() separately, and receive() yields typed events (BidiAudioStreamEvent, BidiTranscriptStreamEvent, etc.) asynchronously. - VoiceAgent.stream_async(): iterate BidiAgent.receive(), yield event.as_dict() for JSON-serializable dicts - voice_routes._send_to_client(): simplified to handle dicts directly since stream_async now yields dicts, not strings Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Angular voice components for Nova Sonic bidirectional audio Frontend voice support with three-layer architecture: New services (frontend/ai.client/src/app/session/services/voice/): - pcm-utils.ts: Pure PCM encoding/decoding (Float32↔Int16↔base64) - AudioRecorderService: Mic capture via Web Audio API → 16kHz PCM chunks - AudioPlayerService: Gapless base64 PCM playback with interruption support - VoiceChatService: WebSocket orchestration + state machine (idle → connecting → listening → speaking) Modified components: - chat-input: Voice toggle button with animated state indicators (pulsing red = listening, bouncing green = speaking, spinner = connecting) - chat-input template: Live transcript overlay during voice mode - session.page.ts: Wire voice response completions to message list - MessageMapService: addVoiceMessage() for finalized voice transcripts TypeScript compiles cleanly (tsc --noEmit). Angular build requires Node 20.19+ (current machine has 20.18.1). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: convert SessionMessage to dict for BidiAgent and fix TS2774 Backend: _load_text_history() now calls .to_dict() on SessionMessage objects before passing to BidiAgent. Nova Sonic expects plain dicts with {"role": "...", "content": [...]}, not SessionMessage objects. Frontend: Fix TS2774 in AudioRecorderService — use typeof check instead of truthiness check for getUserMedia function detection. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use to_message() instead of to_dict() for BidiAgent history SessionMessage.to_dict() wraps the message in metadata: {"message": {"role": ..., "content": [...]}, "message_id": 0, ...} SessionMessage.to_message() returns the plain message dict: {"role": "user", "content": [...]} Nova Sonic's _get_message_history_events expects the plain format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use BidiAgent.send() and receive() APIs correctly BidiAgent has send(dict) and receive() — not send_audio()/send_text() or stream_async(). Align VoiceAgent methods with the actual SDK: - send_audio(): calls self._bidi_agent.send({"type": "bidi_audio_input", ...}) - send_text(): calls self._bidi_agent.send({"type": "bidi_text_input", ...}) - receive_events(): wraps self._bidi_agent.receive() with as_dict() conversion - stream_async(): now a no-op stub (voice uses receive_events() instead) Update voice_routes._send_to_client to call receive_events() not stream_async(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Implement feature X to enhance user experience and optimize performance * feat: add voice overlay component for voice interactions - Implemented VoiceOverlayComponent with HTML, CSS, and TypeScript files. - Added styles for visualizer orb and status badges using Tailwind CSS. - Integrated voice status management and session handling in the component. - Enhanced voice chat service to support transcript entries and reveal logic. - Updated session page to handle voice overlay closure and persist transcripts as messages. - Introduced configuration constants for voice processing parameters. 
* feat: enhance voice agent with real-time cost calculation and metadata handling * fix: refine token usage handling and improve message processing in voice components * fix: sanitize user-provided values in log statements to prevent log injection Addresses CodeQL alert #567 (py/log-injection). All user-provided values (session_id, user_id, msg_type, enabled_tools) are now passed through _sanitize_log() which strips newline and carriage return characters before being interpolated into log messages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: WebSocket voice streaming with AgentCore auth support (#155) * feat: update WebSocket voice streaming endpoint for AgentCore compatibility * fix: ensure config message is required for WebSocket voice stream authentication * feat: WebSocket voice streaming with AgentCore auth and protocol config (#156) * feat: update WebSocket voice streaming endpoint for AgentCore compatibility * fix: ensure config message is required for WebSocket voice stream authentication * feat: add protocol configuration for HTTP support in InferenceApiStack * fix: include bidi dependency in uv sync commands for Inference API Dockerfile (#157) * fix: improve AgentCore connection detection in voice stream handling (#158) * fix: align voice WebSocket with reference architecture accept-first pattern (#159) * fix: align voice WebSocket with reference architecture accept-first pattern Rewrites voice_stream to match the sample-strands-agent-with-agentcore reference architecture: - Accept WebSocket immediately (AgentCore validates auth at proxy layer) - Extract params via helper functions: custom header → query param → config message - Config message always read to supplement missing params in cloud mode - /voice/stream as main route, /ws as alias for AgentCore Runtime - Frontend uses /voice/stream for local dev, /ws for AgentCore Co-Authored-By: 
Claude Opus 4.6 <noreply@anthropic.com> * fix: add missing try block in voice_stream causing IndentationError Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * docs: add Voice Mode to Key Features in README (#160) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: colinmxs <colinmxs@users.noreply.github.com> Co-authored-by: Colin Smith <7762103+colinmxs@users.noreply.github.com> Co-authored-by: Phil Merrell <philmerrell@boisestate.edu> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * small testing fix * add e2e to nightly process * fix tests / warnings --------- Co-authored-by: Oscar Filson <OSCARFILSON@boisestate.edu> Co-authored-by: colinmxs <colinmxs@users.noreply.github.com> Co-authored-by: Colin Smith <7762103+colinmxs@users.noreply.github.com> Co-authored-by: Phil Merrell <philmerrell@boisestate.edu> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(connectors): add AgentCore Identity wrapper and Runtime context middleware
First phase of the Connectors refactor, which will eventually replace the
bespoke OAuth token store (OAuthTokenRepository, KMS-encrypted DynamoDB,
Secrets Manager client credentials, manual refresh) with AgentCore Identity's
managed token vault and credential providers.
- AgentCoreContextMiddleware copies the four Runtime headers
(WorkloadAccessToken, OAuth2CallbackUrl, session ID, request ID) into
BedrockAgentCoreContext on every invocation. Required because the Inference
API is a plain FastAPI app rather than BedrockAgentCoreApp, so the SDK does
not populate the context for us. No-op when headers are absent, so local
development and unit tests continue to work without mocks.
- AgentCoreIdentityClient wraps IdentityClient.get_token() with a narrower,
platform-friendly surface for USER_FEDERATION (3LO) flows. Surfaces the
"user consent required" case as a structured TokenResult(authorization_url=...)
rather than an exception, so it can flow through the existing SSE stream as
a new event type in a later phase.
Both modules are pure additions; no existing code path calls them yet.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
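The "structured result instead of an exception" idea in the commit above might look roughly like this. Only `TokenResult(authorization_url=...)` comes from the commit message; the `access_token` field, the `ConsentRequired` exception, and the `fetch_token` helper are hypothetical names used for illustration, and how the SDK really signals consent is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ConsentRequired(Exception):
    """Hypothetical stand-in for whatever 'user consent required' signal the SDK gives."""

    def __init__(self, authorization_url: str) -> None:
        super().__init__("user consent required")
        self.authorization_url = authorization_url


@dataclass(frozen=True)
class TokenResult:
    """Either a usable token or a URL the user must visit to grant consent."""

    access_token: Optional[str] = None
    authorization_url: Optional[str] = None

    @property
    def needs_consent(self) -> bool:
        return self.access_token is None and self.authorization_url is not None


def fetch_token(get_token: Callable[[], str]) -> TokenResult:
    """Call the underlying token getter and fold 'consent required' into data."""
    try:
        return TokenResult(access_token=get_token())
    except ConsentRequired as exc:
        return TokenResult(authorization_url=exc.authorization_url)
```

Returning data rather than raising lets a streaming caller forward the URL as an ordinary event without unwinding the request.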
* feat(connectors): route external MCP OAuth through AgentCore Identity
Wires the Runtime context middleware into the Inference API and swaps the
external MCP client's token source from the bespoke OAuthService to
AgentCore Identity's USER_FEDERATION flow.
- main.py: installs AgentCoreContextMiddleware so WorkloadAccessToken and
OAuth2CallbackUrl Runtime headers populate BedrockAgentCoreContext on every
invocation.
- external_mcp_client.py: _get_oauth_token now returns a TokenResult from
AgentCoreIdentityClient instead of a decrypted token string from
OAuthService. Scopes are read from the platform's OAuth provider record so
organizations can change them without code. When the SDK signals that user
consent is required, the authorization URL is stashed per-user for the
inference route to surface via an oauth_required SSE event (emitter to
follow in a subsequent commit). load_external_tools skips client creation
on consent-required rather than creating a client that would fail at the
first request.
- Convention: the platform's provider_id is used verbatim as the AgentCore
Identity credential-provider name. Admins register matching names via
CreateOauth2CredentialProvider during provider setup.
The OAuthService, token vault, and encryption layer are still referenced by
unrelated code paths (admin routes, connections UI) and will be removed in
Phase 3 once the AgentCore-backed flow is validated end-to-end.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(frontend): rename connections to connectors
Rebrand the user-facing OAuth UI from "connections" to "connectors" for
consistent vernacular across the product. Folders, classes, types, and
route paths all follow the new name; the /settings/connections URL
redirects to /settings/connectors. The backend /oauth/connections
endpoint is preserved as a stable contract and translated at the
service layer.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(connectors): add AgentCore credential-provider registrar service
Wraps bedrock-agentcore-control for admin-side OAuth2 credential provider
CRUD: create/update/delete/get with vendor mapping (Google/Microsoft/GitHub
to their native vendors; Canvas/Custom routed through CustomOauth2 via an
OIDC discovery URL or explicit authorization-server metadata). Domain
errors map 404/conflict/invalid-custom to typed exceptions so route
handlers can translate cleanly.
Update is intentionally non-partial: AgentCore's UpdateOauth2CredentialProvider
requires a full oauth2ProviderConfigInput and Get never returns the stored
client_secret, so credential rotation always re-submits both clientId and
clientSecret.
17 unit tests cover every vendor path, error mapping, and the Custom-only
discovery rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(connectors): grant IAM for credential-provider admin ops
Adds Create/Update/Delete/Get/List on bedrock-agentcore OAuth2 credential
providers to the app-api task role, scoped to the default token vault.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(connectors): retire in-house OAuth flow
Deletes the legacy 3LO dance that predates AgentCore Identity — the
per-user token vault, PKCE-based authorization service, encryption layer,
token cache, user-facing /oauth/* routes, and the tool-side OAuthToolService.
AgentCore Identity owns the token vault and consent flow now; the inference
path already routes through agentcore_identity.py via the recent external
MCP client refactor, so these modules had no live consumers.
Also slims shared/oauth/__init__.py to the surviving surface (provider model,
repository, registrar) and unwires the user-facing router from app_api/main.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(connectors): slim OAuth provider model to AgentCore shape
AgentCore Identity owns the clientId, clientSecret, endpoint config, and
callback URL. Our DynamoDB record keeps only the admin metadata (display
name, scopes, role gates, icon) plus cached pointers to AgentCore's record
(credential_provider_arn, callback_url) for convenience.
Drops authorization_endpoint, token_endpoint, authorization_params,
userinfo_endpoint, revocation_endpoint, pkce_required, OAuthUserToken, and
the user-side connection DTOs — all artifacts of the retired in-house flow.
Adds oauth_discovery_url and authorization_server_metadata for Custom/Canvas
providers, gated by a pydantic validator.
Repository surface tightens to put_provider + apply_metadata_update; the
Secrets Manager write/read path is gone. Admin routes (next commit) own
the AgentCore round-trip and hand a fully-formed record to the repo.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(connectors): route admin OAuth CRUD through AgentCore Identity
POST now calls the registrar first and, on success, upserts the metadata
record in DynamoDB. If the DB write fails after AgentCore has accepted
the credentials, we best-effort delete the AgentCore provider to avoid
orphans.
PATCH distinguishes metadata-only edits (scopes, roles, display name,
icon, enabled) from credential rotation. Rotation requires clientId +
clientSecret together — partial updates are rejected by AgentCore's
UpdateOauth2CredentialProvider contract.
DELETE removes the AgentCore provider first (which revokes every user
token stored in its vault), then the local record. Pre-existing connection-
count checks are dropped since per-user tokens no longer live in our DB.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(connectors): rewire frontend for AgentCore flow
Admin side:
- Rename admin/oauth-providers → admin/connectors (file + route); old
route path redirects for URL stability
- Rewrite the admin model to the AgentCore-owned shape: drop endpoint
fields, authorization_params, pkce_required, userinfo/revocation
endpoints. Add credential_provider_arn, callback_url, and
oauth_discovery_url / authorization_server_metadata for Custom vendors
- Rewrite the admin form: preset picker simplified to display metadata
only, Custom requires an OIDC discovery URL, credential rotation
requires clientId + clientSecret together (AgentCore's update API is
not partial), success screen after create displays the AgentCore
callback URL with a copy button so the admin can paste it into the
vendor console, edit mode shows the callback URL + ARN read-only
User-facing retirement:
- Delete settings/connectors (user "my connected accounts" page),
settings/oauth-callback (legacy 3LO return handler), and the sidebar
+ route entries for them. AgentCore Identity owns the consent flow
at runtime via the existing /oauth-complete landing page
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: gitignore .claude/scheduled_tasks.lock
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(connectors): emit oauth_required events + runtime consent UI
When an external MCP tool needs OAuth consent, AgentCore Identity returns
an authorization URL instead of a token. This wires that signal all the
way to the user:
Backend:
- Inference route drains pending consent URLs from the external MCP
integration after the agent stream finishes and emits one
oauth_required SSE event per provider before done
- IAM grants bedrock-agentcore:GetResourceOauth2Token on the runtime role
so the AgentCore Identity client can reach the token vault
- CLAUDE.MD + SSE_ERROR_MESSAGING.md document the new event
Frontend:
- Stream parser recognizes oauth_required and surfaces it as an
OAuthRequiredEvent
- New /oauth-complete landing page handles the AgentCore callback
redirect and postMessages consent completion to the opener tab
- OAuthConsentService orchestrates popup opening + postMessage receipt
- OAuthConsentBanner renders the Connect button inside the chat input
- chat-http and assistant preview pass OAuth2CallbackUrl header so
AgentCore Runtime knows where to return after consent
Also updates the admin Tool form reference from /admin/oauth-providers
to /admin/connectors to match the renamed admin surface.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(connectors): user-facing settings page + AgentCore consent finalizer
Adds the Settings → Connectors page so users can browse and connect
OAuth-backed external tools end-to-end:
- New /connectors routers on app-api (list user-visible providers via
RBAC) and inference-api (initiate-consent, complete-consent) — the
inference-api side runs under the AgentCore Runtime proxy where the
WorkloadAccessToken context is populated.
- AgentCoreIdentityClient gains a workload-token mint fallback for local
dev (GetWorkloadAccessTokenForUserId) and appends provider_id to the
callback URL so the landing page can dismiss the right banner.
- /oauth-complete page POSTs CompleteResourceTokenAuth back through the
inference-api before notifying the opener, fixing the "consent
finished but vault stayed empty" race. Uses BroadcastChannel to
bridge popup → opener under Chrome's COOP isolation.
- New connectors settings page with a Connect / Reconnect affordance
per provider, wired to the OAuthConsentService popup flow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(connectors): switch oauth gating from pre-flight to mid-turn interrupts
The agent used to pre-flight OAuth at tool-load time and abort the whole
turn if any provider needed consent — the user then had to retype the
prompt after authorizing. This switches to the Strands interrupt
protocol: the consent gate runs lazily before each tool call, pauses
the in-flight turn, and resumes it automatically once the user
finishes the popup.
Backend
- New OAuthConsentHook (BeforeToolCallEvent + AfterToolCallEvent).
- BeforeToolCall: looks up the OAuth provider for the selected
MCPAgentTool's MCPClient (no name coupling), checks the in-process
token cache, and either lets the tool run or calls
event.interrupt(...) with the consent URL when AgentCore Identity
reports consent required.
- AfterToolCall: detects 401-style failures from MCP tool results,
marks the (user, provider) for force_authentication on the next
fetch, and sets event.retry = True so the BeforeToolCall hook
re-fires and triggers a fresh consent. Closes the gap where a
provider-side revocation leaves a stale token in AgentCore's vault.
- New oauth_token_cache: per-(user, provider) tokens + force-reauth
flags; lifecycle-managed by the hook.
- ExternalMCPIntegration always loads MCP clients with a lazy
token_provider that reads from the cache; the pending_consent /
drain_pending_consent dict and the route's pre-LLM short-circuit
branch are gone.
- StreamCoordinator emits one oauth_required SSE event per pending
interrupt before the final done event, carrying interruptId so the
frontend can resume the same turn.
- ChatAgent.stream_async accepts interrupt_responses and forwards them
to Strands as the resume prompt; route accepts the same on
/invocations and skips quota + RAG augmentation on resume.
Frontend
- OAuthRequiredEvent type + validator gain interruptId; settings-page
consent path makes interruptId optional (no agent turn to resume).
- OAuthConsentService tracks the interruptId per request and invokes a
registered resume handler on broadcast success.
- ChatRequestService snapshots the last turn's payload and replays it
with interrupt_responses attached when a consent completes — the
user never retypes the prompt.
Smoke-tested end-to-end: Google revoke → whoami → 401 → AfterToolCall
detects + retries → fresh consent banner → popup → auto-resume → tool
returns greeting in the same turn.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(connectors): bind complete_consent to initiating user + tighten auth-failure regex
Hardens two gaps called out in review of the AgentCore OAuth flow.
- `/connectors/complete-consent` now verifies the submitted `session_uri`
was issued to the authenticated user at `initiate_consent`, rejecting
cross-user replay with 403 before ever calling AgentCore. Backed by a
thread-safe TTL cache (10 min, single-use). Soft-fails with a warning
when AgentCore's authorize URL doesn't carry a recognised session
parameter, so an SDK shape change logs rather than blocks.
- `_AUTH_FAILURE_PATTERN` tightened with word boundaries on every clause
and a non-path guard on `401` so tool errors containing `/v1/401/...`
no longer trigger a spurious force-reauth.
Also moves `import boto3`/`os` out of the `complete_consent` handler
body and caches the control-plane client via `lru_cache`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
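The "thread-safe TTL cache (10 min, single-use)" binding described in the commit above could be sketched as follows. The class and method names are assumptions for illustration; only the TTL, the single-use rule, and the user binding come from the commit message.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class SingleUseTTLCache:
    """Binds a session URI to the user who started it; each entry is claimable once."""

    def __init__(self, ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, session_uri: str, user_id: str) -> None:
        """Record that session_uri was issued to user_id (at initiate time)."""
        with self._lock:
            self._entries[session_uri] = (user_id, self._clock() + self._ttl)

    def claim(self, session_uri: str, user_id: str) -> bool:
        """True only if the entry exists, is unexpired, and belongs to user_id."""
        with self._lock:
            entry = self._entries.get(session_uri)
            if entry is None:
                return False
            owner, expires_at = entry
            if self._clock() >= expires_at:
                del self._entries[session_uri]
                return False
            if owner != user_id:
                return False  # keep the entry so the real owner can still claim it
            del self._entries[session_uri]  # single use
            return True
```

A route handler would call `put` when consent is initiated and answer 403 when `claim` returns False. Being in-process, a cache like this is only visible to the worker that created the entry.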
* fix(connectors): type-assert AgentCore responses + harden create rollback
Addresses the remaining two critical items from PR #174 review.
Registrar response parsing (`_info_from_response`): fails loudly on
contract violations rather than silently storing empty strings. Missing
`clientSecretArn` still tolerated (some vendors won't persist one) but
a wrong-shape `clientSecretArn` or absent `credentialProviderArn` now
raises TypeError so an AgentCore API change surfaces as a real error.
Admin create-provider rollback (`_rollback_orphaned_provider`): now
retries the AgentCore delete twice with backoff before giving up.
On exhaustion, emits a CloudWatch `Agentcore/OAuth::ProviderOrphaned`
custom metric so ops can alarm on stranded credential providers.
Secondary failures (CW down, registrar down after retries) never
shadow the admin's original 5xx — they only log. The subsequent
create attempt that hits `CredentialProviderConflictError` with no
DB record now returns an actionable 409 pointing at the AWS CLI
cleanup command instead of a bare "already exists".
App API task role grants `cloudwatch:PutMetricData` scoped to the
`Agentcore/OAuth` namespace via a condition key.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
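A rough sketch of the rollback behaviour described above: retry the delete with backoff, emit a metric on exhaustion, and never let a secondary failure replace the original error. The function and parameter names are illustrative, and the sketch assumes "retries twice" means one initial attempt plus two retries.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger(__name__)


def rollback_orphaned_provider(
    delete_provider: Callable[[], None],
    emit_orphan_metric: Callable[[], None],
    retries: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Best-effort delete; returns True if the provider was removed.

    Never raises, so the caller's original error is the one the admin sees.
    """
    attempts = 1 + retries  # assumption: initial attempt plus `retries` retries
    for attempt in range(attempts):
        try:
            delete_provider()
            return True
        except Exception:
            logger.warning("rollback delete failed (attempt %d/%d)",
                           attempt + 1, attempts, exc_info=True)
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))  # exponential backoff
    try:
        emit_orphan_metric()  # e.g. a CloudWatch custom metric
    except Exception:
        logger.warning("could not emit orphan metric", exc_info=True)
    return False
```

Injecting `delete_provider` and `emit_orphan_metric` as callables keeps the retry policy testable without touching AWS.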
* fix(connectors): harden oauth consent flow per code review
- Reject non-https authorizationUrls at both intake and open time so a
compromised backend can't smuggle javascript:/data: URIs into a user
click.
- Replace window.location.href hijack on popup-block with a blocked
signal; the banner renders an "Open in new tab" anchor instead of
tearing down the chat tab.
- Reject resume requests whose interruptIds aren't present in the cached
agent's _interrupt_state with 400, preventing silent acceptance after
cache eviction, process restart, or forged payloads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(connectors): drop provider_id from MCP tool load log
CodeQL flagged the provider_id interpolation as clear-text logging of
sensitive data — its taint analysis traces provider_id back through the
OAuth credential path. The provider ID itself isn't secret, but the log
line doesn't need it: tool_id already identifies the tool, and
"(OAuth)" alone confirms auth was wired up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(connectors): remove obsolete pre-flight oauth + required-message tests
Both tests codify behavior that commit b55653d intentionally retired:
- TestInvocationsOAuthRequired exercised drain_pending_consent and the
route-level oauth_required emission path. That path is gone — consent
URLs now flow through Strands' _interrupt_state inside
agent.stream_async (stream_coordinator.py:543), and the hook behavior
is covered by tests/agents/main_agent/session/test_oauth_consent_hook.py.
- test_missing_message_returns_422 expected message to be required, but
InvocationRequest.message now defaults to "" so resume requests can
reuse the original prompt from interrupt context.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(connectors): implement tool-config freshness cache and update related logic
* fix(connectors): resolve OAuth re-auth loop in local dev + tighten 401 detection
Fixes the constant Google re-auth bug: the consent hook was calling
AgentCore Identity with `callback_url=None` whenever the inference API
ran outside the Runtime proxy (every local-dev session). AgentCore then
issued an authorize URL whose redirect went somewhere other than
`/oauth-complete`, so consent never finalized and every request looped
back through the consent flow.
Adds a `CallbackUrlUnavailableError` and an `AGENTCORE_LOCAL_OAUTH_CALLBACK_URL`
env-var fallback in `_resolve_callback_url`, so the failure mode is now
loud instead of silent. Both the chat-triggered consent hook and the
settings-page `initiate-consent` route catch it and return 503 with
actionable guidance.
Also tightens the OAuth 401 detection regex to reduce false-positive
re-auth prompts: `\bunauthorized\b` now requires proximity to an
HTTP/status/code keyword (previously matched prose like "unauthorized
to view this calendar"), and adds high-confidence signals for OAuth
`invalid_grant` (refresh-token revocation) and Google's `UNAUTHENTICATED`
status / `invalid authentication credentials` message.
Drops the in-process `session_cache` defence-in-depth on
`complete-consent`: AgentCore's own `userIdentifier` ↔ `sessionUri`
binding already rejects mismatched completions, and the local cache
cost real operational pain (multi-worker / restart / `--reload` would
break legitimate consent flows with a confusing 403). Trust the
JWT-derived `current_user` plus AgentCore's binding instead.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
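To make the tightened 401 detection concrete, here is one way such a pattern could be written. This is a sketch reconstructed from the commit descriptions (word boundaries, a non-path guard on `401`, a proximity requirement for "unauthorized", and the `invalid_grant` / `UNAUTHENTICATED` signals); the actual `_AUTH_FAILURE_PATTERN` is not shown in the log, and the 20-character proximity window is an assumption.

```python
import re

# Illustrative pattern only; not the project's real _AUTH_FAILURE_PATTERN.
AUTH_FAILURE_PATTERN = re.compile(
    r"""
      (?<![/\w.])401(?![/\w])                              # bare 401, not a path segment or longer number
    | \b(?:http|status|code)\b[^\n]{0,20}\bunauthorized\b  # keyword shortly before the word
    | \bunauthorized\b[^\n]{0,20}\b(?:http|status|code)\b  # keyword shortly after the word
    | \binvalid_grant\b                                    # OAuth refresh-token revocation
    | \bunauthenticated\b                                  # Google's UNAUTHENTICATED status
    | \binvalid\ authentication\ credentials\b
    """,
    re.IGNORECASE | re.VERBOSE,
)


def is_auth_failure(text: str) -> bool:
    """Return True when a tool error looks like an authentication failure."""
    return AUTH_FAILURE_PATTERN.search(text) is not None
```

Matching is case-insensitive throughout in this sketch, which is broader than matching only the upper-case `UNAUTHENTICATED` status.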
* feat(connectors): user disconnect, status endpoint, and OAuth UX polish
Several user-facing connector improvements that share a foundation
(per-user `force_reauth` lifecycle in the in-process token cache):
- New `GET /connectors/{id}/status`: side-effect-free read that the
settings page uses to render a "Connected" badge without committing
the user to a consent flow (initiate-consent always triggers a
server-side pending session). Honors the `force_reauth` flag — a
just-disconnected user is reported as not connected even if the vault
still holds an unexpired token.
- New `DELETE /connectors/{id}/connection`: best-effort disconnect that
flips the local `force_reauth` flag (AgentCore exposes no per-user
vault-delete API). The next status check returns `connected: false`,
the next initiate-consent passes `force_authentication=True`, and the
user re-authorizes from scratch. complete-consent clears the flag on
success so the UI flips back to connected without waiting on the agent
loop to warm the cache.
- Frontend Disconnect button on connected rows. Confirmation dialog uses
the existing `ConfirmationDialogComponent` (CDK Dialog, destructive
styling) — also swapped the admin connector-list delete from native
`confirm()` to the same component for visual consistency.
- Closed-popup recovery in `OAuthConsentService`: poll `popup.closed`
after open and drop the provider from `inFlight` if the user dismisses
without completing consent. The pending request stays so the chat
banner re-offers Connect; the settings page resets `awaiting` →
`idle` via the new `inFlightProviders` signal.
- Settings page: loading skeleton in the row's action area while the
status probe resolves, dropped the misleading "Reconnect" button
(clicking it just hit `initiate-consent` and toasted "already
connected"), and removed the scope-list display under each connector.
- Forward Google's `access_type=offline` (per AgentCore Identity docs)
via a new vendor-baseline helper, plumbed through both the
chat-triggered consent hook and the settings/initiate-consent /
status routes via two new optional lookups on `OAuthConsentHook`
(`provider_type_lookup`, `custom_parameters_lookup`). Without this,
Google issues a 1-hour access token with no refresh path and the
vault entry becomes unrefreshable.
- Admin-configurable `custom_parameters` field on the OAuth provider
record (DynamoDB `customParameters` map, Pydantic Create/Update/
Response, admin form `key=value` textarea with parse/serialize
helpers). Merged with the vendor baseline at request time — baseline
wins on conflict so admins cannot accidentally turn off documented
vendor requirements.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
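The `key=value` parsing and "baseline wins on conflict" merge described above can be sketched as below. The helper names and the shape of the baseline table are assumptions; the only baseline value taken from the commit is Google's `access_type=offline`.

```python
from typing import Dict, Mapping

# Assumed baseline table; the commit only names Google's access_type=offline.
VENDOR_BASELINE: Dict[str, Dict[str, str]] = {
    "google": {"access_type": "offline"},
}


def parse_custom_parameters(text: str) -> Dict[str, str]:
    """Parse a key=value-per-line textarea; blank or malformed lines are skipped."""
    params: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key:
            params[key] = value.strip()
    return params


def merge_custom_parameters(provider_type: str,
                            admin_params: Mapping[str, str]) -> Dict[str, str]:
    """Merge admin-supplied parameters with the vendor baseline; baseline wins."""
    merged = dict(admin_params)
    merged.update(VENDOR_BASELINE.get(provider_type, {}))
    return merged
```

Applying the baseline last is what keeps an admin-entered `access_type=online` from overriding the documented vendor requirement.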
* feat(connectors): add Slack/Salesforce/Zoom presets + dynamic form placeholders
Per the AgentCore Identity supported-providers docs, Slack, Salesforce,
and Zoom are first-class vendors with pre-configured endpoints — admins
only need to supply credentials. Verified the exact `credentialProviderVendor`
strings and `oauth2ProviderConfigInput` keys against the SDK shape
(`Oauth2ProviderConfigInput.members`):
- Slack → SlackOauth2 / slackOauth2ProviderConfig
- Salesforce → SalesforceOauth2 / salesforceOauth2ProviderConfig
- Zoom → ZoomOauth2 / includedOauth2ProviderConfig
(shared key for simpler vendors)
Backend additions: `SLACK`, `SALESFORCE`, `ZOOM` on `OAuthProviderType`;
vendor + config-key entries on the registrar. The existing discovery-URL
guard correctly rejects discovery URLs for these new types.
Frontend additions: matching `ConnectorType` literals; preset entries
with sensible default scopes and vendor-relevant placeholder hints (e.g.
Salesforce `api, refresh_token, offline_access, id, openid`); icon
class branches for the new tiles (Slack fuchsia + chat bubble,
Salesforce sky + cloud, Zoom blue + video camera).
Form polish:
- `scopesPlaceholder` / `customParametersPlaceholder` on each preset.
Form binds them via computed signals so the hints update as the admin
switches between providers.
- Selecting a preset seeds `customParameters` only when the preset
declares `defaultCustomParameters` — avoids clobbering user-typed
content for presets that have only a hint.
- Dropped the Google `defaultScopes`. The OIDC-only
`openid email profile` set doesn't actually let an agent do anything
useful with Google APIs (Calendar/Gmail/Drive each need different
scopes), so the form lands empty and the placeholder shows the URL
format as a hint.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(connectors): add support for optional base64 icon uploads and validation
* feat(connectors): inline OAuth consent prompt + persistence-backed restore
Replaces the floating OAuth banner with an inline prompt anchored to the
assistant turn that triggered consent, and persists pending interrupts to
session metadata so a browser refresh rediscovers them instead of leaving
the tool call orphaned in `pending` forever.
Backend
- New `PendingInterrupt` model on `apis.shared.sessions.models`; included
on `MessagesListResponse` and `SessionMetadata`.
- `metadata.add_pending_interrupt` / `remove_pending_interrupts` /
`get_pending_interrupts` helpers using GSI lookup + targeted UpdateExpression.
- `StreamCoordinator._extract_oauth_required_events` is now async and
persists each interrupt before yielding the SSE event; failures log but
never break the live stream.
- `get_messages_from_cloud` fetches pending interrupts in parallel.
- `/invocations` resume path clears resolved interrupts from metadata
after `agent.stream_async` completes.
- New `DELETE /sessions/{sid}/pending-interrupts/{iid:path}` endpoint
for explicit dismiss; colon-bearing Strands ids preserved via `:path`.
Frontend
- New `OAuthConsentPromptComponent` with a refined inline card design,
connector icon (admin base64 wins over heroicon, falls back to
providerType default), eyebrow/lock motif, primary gradient action
button, hover-revealed dismiss, fade+slide entrance.
- `MessageMapService.loadMessagesForSession` hydrates pending interrupts
on session load; anchors to triggering message id when present, else
the most recent assistant message.
- `OAuthConsentService.openConsentPopup` is async; lazy-fetches a fresh
authorization URL via `initiate-consent` when the stored one is absent
or expired (handles "already consented in another tab" by auto-resuming).
- `OAuthConsentService.dismiss` syncs to backend by default; completion
flow opts out so the resume path's own cleanup isn't double-fired.
- `MessageListComponent` renders unanchored interrupts at end-of-list as
a fallback for the "partial assistant message wasn't persisted" case.
- `awaiting_auth` derived tool status renders as a primary-blue ring on
the tool-rail dot instead of an indefinite amber spinner.
- `ChatRequestService.resumeFromOAuthConsent` accepts a fallback session
id (post-refresh case where `lastRequestObject` is null) and surfaces
400 `Unknown or expired interrupt ids` as a conversational error.
- Old floating `OAuthConsentBannerComponent` removed.
Known follow-up
- First-turn-of-a-new-session OAuth: persistence currently no-ops because
the session metadata row doesn't exist yet when the interrupt fires.
Tracked separately; sidecar item or upsert pattern is the likely fix.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(connectors): add OAuth consent prompt component for authorization handling
* feat: enhance session metadata management and update handling
- Add functions to ensure session metadata existence and update session title and activity.
- Implement logic for handling session activity updates, including message count increments and preferences merging.
- Introduce deduplication for pending interrupts to prevent duplicate entries during session updates.
- Update frontend components to reflect changes in session management, including OAuth consent prompts and message handling.
- Refactor session service interfaces to use camelCase for consistency with backend responses.
- Enhance tests for session activity updates, pending interrupts, and ensure proper handling of session metadata.
* fix(connectors): durable OAuth resume across browser refresh
Resume after an OAuth-gated tool call only worked when the in-memory
agent cache still held the original turn. After a browser refresh the
frontend lost its request snapshot and the resume request landed with
no enabled_tools / model_id, so the inference API rebuilt a fresh agent
with an empty external-tool registry — the paused tool call had nothing
to resume against and the LLM responded that the tool wasn't available.
Resume contract now lives server-side. On pause, the stream coordinator
captures a ``PausedTurnSnapshot`` (enabled_tools, model_id, provider,
temperature, system_prompt, caching_enabled, max_tokens) onto the
session row alongside the existing ``pendingInterrupts``. On resume,
the inference API loads the snapshot and rebuilds the agent from it;
Strands' SessionManager then restores ``_interrupt_state`` from
AgentCore Memory, so the paused tool call picks up where it left off
regardless of cache hit/miss, refresh, or pod restart.
Frontend ``lastRequestObject`` snapshotting is gone — the resume
payload is now ``{ session_id, message: '', interrupt_responses }``.
Server-side snapshot has a 1h TTL; cleared on full turn completion
and at the start of any new (non-resume) turn.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
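A possible shape for the server-side snapshot and the slimmed resume payload is sketched below. The field names come from the commit message; the field types, defaults, the `created_at` bookkeeping, and the helper function are assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

SNAPSHOT_TTL_SECONDS = 3600  # the commit states a 1h TTL


@dataclass
class PausedTurnSnapshot:
    """Settings needed to rebuild the agent for a paused turn (types assumed)."""

    enabled_tools: List[str]
    model_id: str
    provider: str
    temperature: Optional[float] = None
    system_prompt: Optional[str] = None
    caching_enabled: bool = True
    max_tokens: Optional[int] = None
    created_at: float = field(default_factory=time.time)

    def is_expired(self, now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        return current - self.created_at >= SNAPSHOT_TTL_SECONDS


def build_resume_payload(session_id: str,
                         interrupt_responses: List[Any]) -> Dict[str, Any]:
    """The client sends only identifiers; everything else is restored server-side."""
    return {
        "session_id": session_id,
        "message": "",
        "interrupt_responses": interrupt_responses,
    }
```

Keeping the snapshot on the server means a refreshed browser tab needs nothing beyond the session id and the interrupt responses to resume.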
* fix(connectors): pre-flight external MCP clients so one bad server can't fail a turn
Previously, ``load_external_tools`` cached newly-created MCP clients
without verifying the server was actually reachable. A single connector
that wasn't running locally (or whose endpoint was misconfigured) would
sit in the registry and fail the whole turn the first time Strands
called ``load_tools()`` on it.
Pre-flight each new client immediately after construction. On failure,
log a warning, skip the tool, and continue — the user keeps their other
tools. On success the call also primes the client's tool cache, so
Strands' later ``load_tools()`` becomes a no-op.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
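The pre-flight pattern above amounts to "construct, probe, skip on failure". A generic sketch follows, with the client factory and probe passed in as callables; these names are hypothetical and the real code would probe by listing the MCP server's tools.

```python
import logging
from typing import Callable, Dict, Mapping

logger = logging.getLogger(__name__)


def preflight_clients(
    candidates: Mapping[str, Callable[[], object]],
    probe: Callable[[object], object],
) -> Dict[str, object]:
    """Build each client and probe it once; keep only the ones that respond."""
    healthy: Dict[str, object] = {}
    for tool_id, factory in candidates.items():
        try:
            client = factory()
            probe(client)  # also a natural place to prime a tool cache
        except Exception as exc:
            logger.warning("skipping external tool %s: %s", tool_id, exc)
            continue
        healthy[tool_id] = client
    return healthy
```

One unreachable server then costs the user a single tool rather than the whole turn.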
* fix: Update oauth consent prompt styling
* test(sessions): unblock route tests on new pre-stream metadata hook
ensure_session_metadata_exists() now runs unconditionally on /invocations
and raises when DYNAMODB_SESSIONS_METADATA_TABLE_NAME is unset, breaking
route tests that mock the agent and skip DynamoDB. Stub it via an autouse
fixture so route tests exercise the route, not the persistence layer. Also
patch the new get_pending_interrupts call in the cloud-message tests.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(connectors): make OAuth disconnect intent durable across replicas
The disconnect flag lived in a module-level set inside the inference API
process, so a /disconnect on one replica was invisible to any other.
Under multi-replica deploys the user could see "Connected" on one
request and "needs consent" on the next, and the AfterToolCallEvent
401-retry path likewise lost its intent on replica fan-out.
Move the per-(user, provider) disconnect flag to a new
OAuthDisconnectRepository on the existing oauth-user-tokens DynamoDB
table (already provisioned, KMS-encrypted, with R/W IAM granted to the
inference API). The token cache stays as a hot-path L1 for tokens only;
the consent hook reads the disconnect repo on every BeforeToolCallEvent
so a disconnect anywhere is honored on the next tool run anywhere.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(connectors): validate OAuth callback URL header against CORS allowlist
The frontend posts an `OAuth2CallbackUrl` header on every consent-related
request, and the inference-api middleware was forwarding it verbatim into
`BedrockAgentCoreContext`. An authenticated user could pivot the OAuth
redirect to an attacker-controlled origin and capture the authorization
code on consent. Reuse `CORS_ORIGINS` as the trust boundary, pin the
path to `/oauth-complete`, and reject non-http(s) schemes, query strings,
and fragments.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
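The validation rules listed above (origin must be in the CORS allowlist, path pinned to `/oauth-complete`, no non-http(s) schemes, query strings, or fragments) could be implemented along these lines. The function name is an assumption, and the sketch treats the allowlist as a list of exact origins; wildcard entries are not handled.

```python
from typing import Iterable
from urllib.parse import urlsplit

REQUIRED_PATH = "/oauth-complete"


def is_allowed_callback_url(url: str, allowed_origins: Iterable[str]) -> bool:
    """Return True only for http(s)://<allowed origin>/oauth-complete, nothing more."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False
    # Reject query strings and fragments, including empty ones like a trailing "?".
    if parts.query or parts.fragment or "?" in url or "#" in url:
        return False
    if parts.path != REQUIRED_PATH:
        return False
    origin = f"{parts.scheme}://{parts.netloc}".lower()
    allowed = {o.rstrip("/").lower() for o in allowed_origins}
    return origin in allowed
```

Comparing the full `netloc` means a URL such as `https://app.example.com@evil.example/oauth-complete` fails, because its origin string includes the userinfo part.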
* fix(connectors): cap OAuth re-auth at one retry per provider per turn
A misconfigured provider (wrong scope, perma-401) would otherwise spawn
a fresh consent prompt on every tool call in a turn: the per-tool-use
retry guard reset for each new toolUseId, so the model could trigger
prompt-after-prompt with no upper bound. Track attempted providers on
the hook itself, reset on `BeforeInvocationEvent` (fires per turn,
including resume), so the user sees at most one consent prompt per
provider per turn before 401s flow through to the model.
Also clarify the `event.interrupt(name="oauth:{provider_id}")` comment:
the SDK's BeforeToolCallEvent._interrupt_id folds in `toolUseId`, so
parallel tool calls to the same provider already produce distinct
interrupt ids. New regression test pins that invariant.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
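The per-turn cap described above reduces to a small piece of state on the hook: a set of providers that have already been prompted, cleared at the start of each turn. A minimal sketch, with a hypothetical class name and no dependency on the Strands event types:

```python
from typing import Set


class ReauthBudget:
    """Allows at most one re-auth attempt per provider per turn."""

    def __init__(self) -> None:
        self._attempted: Set[str] = set()

    def reset(self) -> None:
        """Call at the start of every turn, including resumed turns."""
        self._attempted.clear()

    def try_consume(self, provider_id: str) -> bool:
        """True the first time a provider requests re-auth this turn, False after."""
        if provider_id in self._attempted:
            return False
        self._attempted.add(provider_id)
        return True
```

When `try_consume` returns False, the hook would let the 401 flow through to the model instead of raising another consent prompt.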
* fix(connectors): drop re-emitted oauth_required events by interrupt id
A stream replay after refresh, or a late server-side breadcrumb clear,
could fire the same `oauth_required` event again after a successful
consent or explicit dismissal — and the prompt would resurrect because
provider-keyed dedup re-added the entry. Track seen interrupt ids on
the consent service so already-resolved interrupts stay gone for the
session. New tool calls always carry a fresh interrupt id (Strands
generates it from `toolUseId`), so legitimate prompts are never
suppressed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
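A language-neutral illustration of the interrupt-id dedup, written in Python although the real consent service lives in the frontend. `ConsentPromptTracker` and its members are hypothetical names; the only behavior taken from the commit is that an interrupt id seen once is never shown again in the session.

```python
class ConsentPromptTracker:
    """Keeps provider-keyed prompts but refuses replayed interrupt ids."""

    def __init__(self) -> None:
        self.pending: dict[str, str] = {}  # provider_id -> interrupt_id
        self._seen_interrupts: set[str] = set()

    def on_oauth_required(self, provider_id: str, interrupt_id: str) -> bool:
        """Register a prompt; return False if this interrupt was already handled."""
        if interrupt_id in self._seen_interrupts:
            return False  # stream replay or late breadcrumb: drop it
        self._seen_interrupts.add(interrupt_id)
        self.pending[provider_id] = interrupt_id
        return True

    def resolve(self, provider_id: str) -> None:
        """Called on successful consent or explicit dismissal."""
        self.pending.pop(provider_id, None)
```

Because the seen-set is never pruned on `resolve`, a replayed event cannot re-add the provider entry, while a genuinely new tool call arrives with a fresh id and passes.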
* chore(infra): correct stale "App API Stack" comments in inference-api stack
The referenced tables live in InfrastructureStack (moved there to break a
prior circular dep); update 9 SSM-read comments to match.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Remove the "Sync from Registry" admin feature in favor of DynamoDB as the single source of truth for the tool catalog. Code-defined tools are now seeded by the existing bootstrap script (expanded to cover calculator and generate_diagram_and_validate); admins add everything else through the "Add Tool" form. Also drops the in-memory fallback in ToolCatalogService and removes the stale get_current_weather tool. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* fix: source ToolAccessService catalog from DynamoDB
ToolAccessService.filter_allowed_tools enumerated tools from the legacy
in-memory catalog, so MCP-external and A2A tools added via the admin form
(which only persist to DynamoDB) were silently filtered out for
wildcard-access users. Wire the service to a new TTL-cached snapshot
(freshness.get_all_tool_ids) backed by the DynamoDB tool catalog. Gateway
tools keep their prefix-based bypass since they're loaded dynamically at
runtime. Admin create/update/delete invalidate the snapshot so changes are
visible on the next chat turn in-process.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore: hold all-tool-ids snapshot in single-slot list
CodeQL flagged `_all_tool_ids_cache` as unused because its only writes were
`global` reassignments; flow analysis didn't connect them to the reads.
Switch to a one-element list so the slot is mutated in place, matching the
existing `_cache` dict pattern in this file.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
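A rough sketch of a TTL-cached snapshot held in a single-slot list, as the two commits above describe. The signature, the 30-second TTL, and the `loader` callback are assumptions; only the name `get_all_tool_ids` and the mutate-in-place idea come from the commit messages.

```python
import time
from typing import Callable, Iterable

_TTL_SECONDS = 30.0  # assumed value, for illustration only

# One-element list: the slot is mutated in place, so no `global` statement
# is needed and static analysis can connect the writes to the reads.
_snapshot_slot: list = [None]  # holds (expires_at, frozenset_of_ids) or None


def get_all_tool_ids(
    loader: Callable[[], Iterable[str]],
    now: Callable[[], float] = time.monotonic,
) -> frozenset[str]:
    """Return cached tool ids, reloading via `loader` when the TTL lapses."""
    entry = _snapshot_slot[0]
    if entry is not None and entry[0] > now():
        return entry[1]
    ids = frozenset(loader())  # e.g. a scan of the DynamoDB tool catalog
    _snapshot_slot[0] = (now() + _TTL_SECONDS, ids)
    return ids


def invalidate_all_tool_ids() -> None:
    """Called after admin create/update/delete so the next read reloads."""
    _snapshot_slot[0] = None
```

Invalidation here is in-process only, which matches the commit's note that changes are visible on the next chat turn in the same process; other processes would wait for the TTL.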
…#179) The first CreateOauth2CredentialProvider call in a region implicitly provisions the `default` token vault, so the AppApi task role needs `bedrock-agentcore:CreateTokenVault` in addition to the provider CRUD actions. Without it, creating the very first connector returned a 500 with `AccessDeniedException` from bedrock-agentcore-control. Also pass `DYNAMODB_OAUTH_PROVIDERS_TABLE_NAME` to the container env. The IAM grant and SSM lookup were already in place; only the env wiring was missing, which caused the OAuth provider repository to silently disable itself and would have failed the DB write after AgentCore succeeded — triggering the orphan-rollback path. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…#373) Every file-source endpoint resolves an OAuth token server-side, so app-api needs the `OAuth2CallbackUrl` header its AgentCoreContextMiddleware bridges into BedrockAgentCoreContext. FileSourceService omitted it, so browsing a connected source failed with CallbackUrlUnavailableError (503) right after a successful connect. Add the header to every call, mirroring UserConnectorsService. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…omParameters (#374) The file-source browser surfaced a 409 "not connected" for connectors that were in fact connected. AgentCore Identity factors `customParameters` into whether `get_resource_oauth2_token` short-circuits to a vaulted token: connector consent runs through `initiate_consent`, which sends Google `prompt=consent`, but every retrieval path omitted it — so AgentCore treated the read as a fresh request and reported consent-required despite a usable vaulted token. `resolve_file_source_token` (and `_is_connected`, which delegates to it) and `connector_status` now build `customParameters` with `force_authentication=True` to match the consent flow. The calls remain pure reads — `get_token_for_user` itself stays `force_authentication=False`. Frontend hardening so the dialog owns its error UX: file-source requests opt out of the global error toast via a `SUPPRESS_ERROR_TOAST` HttpContext token, and a 409 in the browser view now renders an actionable Connect button instead of dead-end text. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Generated by the kaizen-research skill. Top 5 ideas appended to docs/kaizen/review-queue.md for the kaizen-review-prep run later this morning.
Generated by kaizen-review-prep. Ranked agenda for the 10-15 min decision pass; queue updated with three review-prep-surfaced friction items (kaizen-research did not run 2026-05-22, so no fresh research doc fed this review).
* docs(skills): capture list/form design conventions in tailwind-ui skill Add references/app-conventions.md documenting the rounded-2xl list and form page design language (border radius, list style, form sections, button variants) so future list/form work in frontend/ai.client matches the redesigned admin pages instead of the older boxed-card style. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(assistants): redesign assistant editor and file-connector UX Restyle the assistant editor and the file-source browser modal to the rounded-2xl list/form design language used by the redesigned admin pages (manage-models, tools), and rework how documents are added: - Editor form column restyled (rounded-2xl, text-sm/6, blue accent, flat border-t sections); uploaded documents rendered as a divide-y list; form column given a bg-gray-50 surface so inputs read clearly. - File-source connectors are surfaced as buttons directly above the drop zone instead of behind a generic "Import from a connector" button — clicking one opens the browser dialog targeted at that connector, skipping the in-modal source picker. - The drop zone collapses to a compact "Add files" control once a document exists or is uploading; device and connector uploads stay available. - File-source browser modal restyled to the same conventions (rounded-2xl panel, divide-y lists, convention button variants). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
When a file-source connector is not connected, the editor button now
labels itself "Connect to {connector}" and kicks off the OAuth consent
popup in place — users no longer have to open the modal just to be
presented with a Connect prompt.
- Editor injects UserConnectorsService and OAuthConsentService, mirroring
the dialog's connect flow.
- Button shows in-place busy states (Starting… / Awaiting consent) and
disables itself while the popup is in flight.
- On consent success the file-source list is refreshed and the browser
modal opens automatically into the newly-connected connector, so one
click takes the user from "not connected" to picking files.
- If the user closes the popup without consenting, the button resets so
they can try again.
Spec gains UserConnectorsService and OAuthConsentService mocks so the
component can still construct in tests.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…378)
Adds an "Add web content" flow alongside the existing connector imports in
the assistant editor. Single-page mode (default) and bounded BFS crawl mode
share one dialog; the backend writes extracted markdown to the documents
bucket so the existing S3-event ingestion Lambda chunks and embeds it
exactly as a device upload would.
Backend
- New `apis/app_api/web_sources/` package (models, routes, crawler, repo,
url_utils). Endpoints under `/assistants/{id}/web-sources/`: `POST /crawl`,
`GET /crawls?active=true`, `GET /crawls/{id}`. Uses
`get_current_user_from_session` per the auth-dependency rule.
- BFS crawler: per-host jitter, bounded concurrency, robots.txt-respecting,
same-domain, SSRF-guarded, 5 MB per-page cap, 15-minute crawl budget,
always-finalize-on-exit. trafilatura → markdown with BS4 fallback.
- `CrawlJob` rows persisted in the assistants table via the adjacency-list
pattern (`SK=CRAWL#{crawl_id}`). Floats coerced to `Decimal` before
put_item (DynamoDB rejects bare floats). Terminal rows get a 30-day TTL and
cascade-delete when the last web doc for that root is removed.
- Cleanup cascade: `cleanup_document_resources` now reaps orphaned terminal
`CrawlJob` rows after deleting a web doc.
- Self-heal: `list_active_crawls` auto-finalizes any `running` row older
than 20 minutes (mirrors the stale-doc auto-fail pattern), so a crashed
process can't leave the SPA in perma-poll.
- Crawler holds strong refs to worker tasks; the route holds a module-level
set of in-flight crawl tasks (Python's weak task tracking would otherwise
GC them mid-execution).
- New deps: beautifulsoup4 4.13.5, trafilatura 2.0.0.
Frontend
- New `WebSourceDialogComponent`: URL input + "Crawl linked pages" toggle
revealing depth / max-pages / concurrency / delay sliders. Submit-and-watch
UX: the modal closes on Start and pages appear in the docs list as they're
ingested. Style tokens match the file-source dialog.
- `WebSourceService` thin client for the three endpoints.
- Editor wiring: "Add web content" button next to the connector buttons,
with an inline "Crawling…" badge while a crawl is in flight.
- Crawl watcher polls `/web-sources/crawls?active=true` every 5 s; new pages
surface via an incremental discovery merge, with no list-wide refresh.
- Document delete: optimistic UI removes the row immediately and rolls back
on failure (no more wait-then-disappear). Stale-uploading docs can now be
deleted regardless of polling state.
Tests
- `tests/apis/app_api/web_sources/` (60 tests): URL normalization,
same-domain, SSRF guard, BFS bounds, robots, per-page failure handling,
crawl finalization, route 202/404/422/401, CrawlJob put/get/list round trip
with float delays, stale-row reaper, cascade cleanup, TTL on finalize.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…letons (#379) Collapses the three separate "add knowledge" groups (Add files, Add web content, connector buttons) into a single inline action row under the renamed "Knowledge base" section. Order is Add files → Add web content → connector chips so the most-common action sits first. - New `fileSourcesLoading` signal renders width-matched skeleton chips while the connector catalog loads, so the row's final layout is previewed instead of buttons popping in after a network round-trip. - Drop zone is preserved as the empty-state drag-drop affordance, but trimmed to drag-only (the inline "Add files" chip is the only click-to-pick entry point now). - "Crawling…" badge moved up next to the section heading so the action row stays clean. - Fixes a pre-existing duplicate `id="file-upload"` (the drop zone and the compact button each rendered their own input with the same id — invalid HTML + an a11y violation). Now a single hidden input shared by every label that needs it. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Adds a download button next to the delete button in the assistant
editor's uploaded documents list. Visible only for documents in the
`complete` status. Reuses the existing
`GET /assistants/{id}/documents/{docId}/download` endpoint and mirrors the
citation-display download pattern (presigned URL opened in a new tab).
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
) The assistant editor preview now hides voice mode and the settings button — neither is wired to anything useful inside the editor — while exposing file attachments so authors can test their assistant against the same inputs end users will send. File uploads flow through the existing /chat/stream proxy as file_upload_ids. Splits the chat-input settings button into its own showSettingsControl gate (previously coupled to showFileControls), adds a parallel showVoiceControl, and threads both through ChatContainerConfig. Renames the chat-input file input id to chat-input-file-upload to avoid colliding with the assistant form's knowledge-base file input on the editor page. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
When a consumer chats with an Assistant (request carries rag_assistant_id), the agent now runs with zero external tools and is steered to answer from the knowledge base context that's already pre-stuffed into the prompt. Editor-side connectors are unchanged: owners still curate the KB by pulling in Drive docs, web crawls, and uploads.
The product split this codifies: the main agent is the general-purpose, tool-using surface; assistants are the grounded, predictable, citation-friendly surface. Two clear mental models instead of one muddy one.
Three layers:
* Inference API route (inference_api/chat/routes.py) forces input_data.enabled_tools = [] inside the rag_assistant_id branch before the agent is built, and warns if the client sent a non-empty list. This is the real enforcement chokepoint: it covers SPA, API-key, and any future caller through /invocations.
* System prompt composition gains a "## Knowledge Base Grounding" section inserted between the base prompt and the owner's "## Assistant-Specific Instructions". Owner instructions still come last and take precedence; the directive just tells the model to ground in provided context and acknowledges no external tools are available. Applies in both the with-instructions and no-instructions paths.
* The SPA chat-request builder omits tools (sends enabled_tools: []) when assistantId is present. Cosmetic given backend enforcement, but saves payload bytes and matches the existing preview-chat behavior.
MCP App UI events fall out for free: no tools means no tool_result events, which means no ui_resource events to gate. Existing assistants don't need migration (beta). Voice and other consumer surfaces will be addressed separately if/when they need to reach the same contract.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…113) (#383) Extends sharing to support per-user permission levels so teams can co-edit assistants. Owners can grant edit access to specific people; editors can update settings, manage documents, and test-chat — but cannot delete the assistant, change visibility, or manage the share list. Backend only — the frontend share UI and per-assistant edit gating land in a follow-up PR. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* feat(assistants): viewer/editor share permissions UI (#113) Consumes the backend contract shipped in #383: surfaces a per-share permission toggle in the share dialog, an "Editor" badge + Edit affordance on shared-with-me cards, an editor banner on the form, and an owner-only gate on the Share button. - Dialog: per-row "Can view / Can edit" select on existing shares, a "permission for new people" toggle, and onSave delta-detection that distinguishes adds, removes, and permission changes on already-shared emails — each dispatched to the correct backend endpoint (POST / DELETE / PATCH). - List: shared-with-me cards with userPermission='editor' now show an Editor badge and an Edit button alongside Chat. - Form: surfaces "Shared by {owner}" banner for editors; Share button is owner-only. - AssistantSharesResponse.sharedWith is now ShareEntry[] across the service / api / dialog (matches the backend's PR-1 shape change). - Vitest: existing service spec migrated to the new shape; new dialog spec covers the delta algorithm (adds / removes / permission upgrades / mixed) via DI tokens per project convention. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style(share-dialog): match redesign tokens + skeleton for user search Brings the share dialog in line with the list/form design language captured in .claude/skills/tailwind-ui/references/app-conventions.md: rounded-2xl, blue accent (was indigo), focus:ring-2 focus:ring-blue-500, dark:bg-gray-800 inputs, flat <section> blocks divided by border-t. - Header avatar: blue-100 chip (was indigo). - URL row + add-people + current-shares are three flat sections, no individually-bordered cards. - "Currently shared with" became a single rounded-2xl divide-y ul (was a stack of bordered rows), with an empty-state when nothing is shared. - Mode toggle is now a proper segmented tablist with role="tab" and aria-selected, accented in blue. 
- Permission default-for-new-people moved into the section header so it's contextually attached to the add controls. - Save/Cancel actions reorder cleanly on mobile + desktop and use the shared blue/ghost button tokens. - Search results render skeleton rows (animate-pulse) while searching, with role="status" sr-only text — matches the connector-chip skeleton pattern already in the assistant editor. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style(share-dialog): add skeleton for the currently-shared list loadShares() runs in the constructor and shares() starts empty, so the empty-state ("Not shared with anyone yet") was painting while the fetch was in flight on dialogs for SHARED assistants. Renders a skeleton ul matching the real row layout (email + permission select + delete) while loadingShares() is true; suppresses the count chip until the fetch resolves. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style(share-dialog): single-row shares skeleton + stronger tab accent - Shares skeleton: drop from 3 rows to 1; the loading state was visually heavier than the real list it was previewing. - Tabs: selected state now flips font-medium → font-semibold alongside the existing blue underline + text color, so the active tab reads at a glance. -mb-px on each tab makes the active 2px underline overlap the container's 1px bottom border cleanly (no gray line peeking through). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style(share-dialog): more right padding on permission selects The native chevron sat too close to the rounded-2xl edge. Switch px-2.5 → pl-2.5 pr-7 on both permission selects (the per-share row select and the "default for new people" select) so the chevron has breathing room without changing the left padding. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(share-dialog): make tab underline win the border-color cascade The active tab used border-blue-600 alongside a base border-transparent. 
Both are border-color utilities at the same specificity, so whichever Tailwind emitted later in the stylesheet won — in practice the transparent base, leaving no visible underline. Switch to bottom-only border-b-transparent / border-b-blue-600 so the active state targets border-bottom-color exclusively; no cascade collision with the base. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(share-dialog): tab accent via aria-selected variant Confirmed via devtools: border-b-blue-600 was on the DOM but computed border-bottom-color was rgba(0,0,0,0). Same-specificity class collision with border-b-transparent — Tailwind's emit order put the base last and the conditional class lost the cascade. Move all active styling onto aria-selected:* utilities so the active selector becomes [aria-selected="true"], which has attribute-selector specificity and beats the base. As a bonus the tabs now use one declarative class string instead of four parallel [class.x] bindings. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(share-dialog): custom chevron on permission selects Native select chevrons sit at a browser-fixed offset from the right edge regardless of padding-right, so pr-7 / pr-8 just pushed the text away from a chevron still crowded against the rounded-2xl corner. Switch both permission selects to appearance-none + an overlaid heroChevronDown icon. The wrapper handles positioning so the chevron clears the rounded corner cleanly. pointer-events-none on the icon so clicks still hit the native select. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(skills): capture tab-accent and select-chevron gotchas Both fell out of the share-assistant-dialog redesign on PR #384 and will bite future work the same way if undocumented: - Tabs: conditional [class.border-b-blue-600] loses the cascade to a base border-b-transparent at the same specificity. Use the aria-selected: variant so the active rule has attribute-selector specificity. 
- Selects: native chevrons sit at a browser-fixed offset from the right edge regardless of padding-right; with rounded-2xl they crowd the corner. appearance-none + overlaid heroChevronDown gives reliable positioning. Adds a Tabs subsection, a Select example under Form pages, and a "Common gotchas" section that names both failure modes and their DevTools symptoms. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
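The share dialog's onSave delta detection (adds, removes, and permission changes, each sent to POST / DELETE / PATCH) is implemented in the frontend; the Python below is only an illustrative sketch of the comparison, with a hypothetical `compute_share_delta` helper and shares modeled as an email-to-permission mapping.

```python
def compute_share_delta(
    before: dict[str, str], after: dict[str, str]
) -> tuple[dict[str, str], list[str], dict[str, str]]:
    """Compare saved shares with edited shares.

    Returns (adds, removes, changes):
      adds    -> new emails with their permission   (would map to POST)
      removes -> emails no longer shared            (would map to DELETE)
      changes -> existing emails whose permission
                 differs from the saved value       (would map to PATCH)
    """
    adds = {email: perm for email, perm in after.items() if email not in before}
    removes = sorted(email for email in before if email not in after)
    changes = {
        email: perm
        for email, perm in after.items()
        if email in before and before[email] != perm
    }
    return adds, removes, changes
```

Unchanged entries appear in none of the three results, so saving a dialog with no edits issues no requests.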
Adds LibreChat (danny-avila/LibreChat) as external source #12 in the kaizen-research skill. Releases-first, 1-2 web requests/week, covers four lenses: UI/UX patterns, comparable-platform choices, MCP integration patterns, and release-only signal. Bumps the subagent fan-out count to 14 categories and surfaces LibreChat in the skill description so it shows up in trigger context. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Silences four NG8113 build warnings — RouterLink was imported and listed in `imports:` on these admin pages but never referenced in their templates. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…387) Adopt the rounded-2xl / border / text-sm-6 / primary-500 focus vocabulary used by the canonical admin list/form pages so the chat settings panel no longer reads as a separate visual generation. Keeps primary-* as the accent and leaves the slide-over chrome (backdrop, transform animation) untouched. Also fixes the native <select> chevron crowding the rounded-2xl corner on the effort enum by switching to appearance-none + an overlaid heroicon. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
The stream_processor's `_format_force_stop_message` already produces friendly user-facing markdown for known Bedrock force_stop patterns (document size, throttling, access denied), fully formed with a "⚠️ " prefix and actionable guidance. The in-loop error handler in stream_coordinator then ran that text through `build_conversational_error_event`, which for `AGENT_ERROR` fell into the generic else branch and wrapped it again in:
    ⚠️ Something went wrong.
    > {already-friendly text}
    Please try again.
Result: two ⚠️ markers, the friendly text trapped in a blockquote, and a ceremonial "Please try again." appended.
Detect the already-classified case by the leading ⚠️ on AGENT_ERROR messages and emit a `ConversationalErrorEvent` directly with the classifier's text intact. The unclassified "Agent force-stopped: {raw}" fallthrough has no warning prefix and still flows through the generic wrapper, so unrecognized errors keep their friendly wrapper. Live SSE display and refresh-hydration both read `conv_error_event.message`, so they stay in sync.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
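A minimal sketch of the prefix check described above. `build_error_text` is a hypothetical stand-in that returns a plain string; the real code emits a `ConversationalErrorEvent`, and the exact wrapper template is assumed from the commit message.

```python
WARNING_SIGN = "\u26a0"  # the warning sign, with or without the emoji selector


def build_error_text(error_code: str, message: str) -> str:
    """Return user-facing text, wrapping only messages not yet classified."""
    already_classified = (
        error_code == "AGENT_ERROR"
        and message.lstrip().startswith(WARNING_SIGN)
    )
    if already_classified:
        # The classifier's text is already friendly; pass it through intact.
        return message
    # Generic wrapper for everything else (template assumed for illustration).
    return (
        f"{WARNING_SIGN}\ufe0f Something went wrong.\n\n"
        f"> {message}\n\n"
        "Please try again."
    )
```

Matching on the bare warning sign rather than the full emoji sequence means the check still passes if the variation selector is stripped somewhere upstream.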
…esh (#389) * fix(streaming): persist synthetic error messages so they survive refresh When a Bedrock ValidationException fired (e.g. gpt-oss-120b + an attached document), users saw an error in the chat live but the message disappeared on refresh. Two root causes, plus follow-on cleanup. 1. Always-false persistence guard. Three sites guarded create_message behind `hasattr(session_manager, "base_manager")`. The current SDK exposes create_message directly on AgentCoreMemorySessionManager — no nested wrapper — so the guard was always False and every synthetic write was silently skipped (no log, no exception). Extracted a single helper `persist_synthetic_messages` that asserts the real SDK contract and logs loudly when create_message is unavailable. Replaced the three copies in stream_coordinator.py (2x) and chat/routes.py (1x). The quota-exceeded path in chat/routes.py was broken the same way and is now working as a side effect. 2. Duplicate user-turn write. The streaming error paths re-persisted the user turn even though Strands' MessageAddedEvent hook already wrote it at turn start. The conflicting second write caused AgentCore Memory to reject (in practice) or duplicate (in theory), dropping the assistant error message along with it. The streaming paths now persist assistant-only, matching the documented MAX_TOKENS reasoning. 3. Misclassified error copy. `_format_force_stop_message` was matching `ValidationException + "document"` as a 4.5 MB size overflow — but the raw string `"This model doesn't support documents"` also contains "document", so unsupported-modality errors were getting the wrong message. Added explicit branches for "doesn't support document(s)" and "doesn't support image(s)" before the size check; narrowed the size matcher to size-specific markers. 4. User-facing copy revised. 
Dropped brand names ("Claude or Nova"), non-actionable UI suggestions ("remove the attachment"), specific UI affordance references ("gear icon next to the message input"), and the Spreadsheet Analysis hint (not guaranteed enabled across deployments). Tests now assert these things are NOT present so regressions fail loudly. Adds: - agents/main_agent/session/persistence.py — single persist helper - tests covering the helper, the classifier branches, and the persist contract end-to-end All 161 streaming + persistence tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(streaming): collapse redundant AGENT_ERROR persist branch After merging develop's #388 (stop double-wrapping classified force_stop errors), conv_error_event.message holds the un-wrapped friendly text for classified AGENT_ERROR cases — same string the content_block_delta yields to the live SSE stream. The branch picking between error_message and conv_error_event.message based on error_code is now functionally a no-op for the classified path and a fix for the unclassified path (the live "Agent force-stopped: …" wrapped template now also gets persisted instead of the bare reason, keeping live and refresh-hydrated views in sync). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
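The classifier-ordering fix in item 3 can be illustrated with the sketch below. The function name, the returned labels, and the `SIZE_MARKERS` tuple are all hypothetical; the point taken from the commit is that the explicit "doesn't support" branches run before a narrowed size check.

```python
# Hypothetical size-specific markers; the real matcher's strings may differ.
SIZE_MARKERS = ("too large", "exceeds", "size limit")


def classify_validation_error(raw: str) -> str:
    """Classify a raw ValidationException string into a coarse category."""
    text = raw.lower()
    # Unsupported-modality checks come first: these messages also contain
    # the word "document" and used to be mistaken for size overflows.
    if "doesn't support document" in text:
        return "unsupported_document"
    if "doesn't support image" in text:
        return "unsupported_image"
    # Size check now keys on size-specific markers, not on "document".
    if any(marker in text for marker in SIZE_MARKERS):
        return "document_too_large"
    return "unclassified"
```

With the bare substring match on "document" gone, an unrelated message that merely mentions a document falls through to the unclassified path instead of getting the size-overflow copy.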
… outer except (#390) Follow-up cleanup after #388 and #389 landed the persistence + double-wrap fixes. Three small architectural smells in the error-handling layer: 1. Dead classifier in stream_coordinator's outer except. #389 added _format_force_stop_message to the path 3 except handler on the theory that Bedrock ValidationException (gpt-oss-120b + document) bypasses stream_processor's force_stop branch via a GeneratorExit and escapes here. The trace path in test_force_stop_persistence.py:217-235 shows it actually doesn't — process_agent_stream's own outer except catches stream_async exceptions first and yields STREAM_ERROR events, which the in-loop handler picks up. Path 3 only fires for failures in the coordinator's own loop body (interrupt extraction, artifact lookup, metadata calc), which don't carry Bedrock-y patterns the classifier could match. Removed the classifier branch and dropped the now-unused import. 2. Tightened comment on the RuntimeError vs Exception split. The split in stream_processor's outer except is load-bearing — the "generator"/"async" matched branch silently swallows async-generator state errors (e.g. "asynchronous generator is already running", "generator ignored GeneratorExit") that shouldn't surface as user-visible error events. Normal shutdown goes through the GeneratorExit handler above. Kept the structure, made the load-bearing part obvious in the comment. 3. Canonical-reference pointer in persistence.py docstring. The "user turn already persisted by MessageAddedEvent hook" invariant is documented at the persist call sites and in the helper's messages argument docstring. Added a top-level pointer so future readers know where the single source of truth lives. Tests: 161 streaming/persistence + 26 chat tests pass. No behavior change for any covered path. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…nned (#391) * feat(devcontainer): add reproducible dev container Adds a single-stage development image with every toolchain needed to build, test, lint, deploy, and end-to-end-test every stack in the monorepo: • Backend — Python 3.13 (managed by uv 0.7.12) + pytest, ruff, mypy, black • Frontend — Node.js 22.22.3 LTS + npm 11.2.0 + Angular CLI (project-local) • E2E — Playwright 1.59.x chromium runtime libs + xvfb + fonts • Infra — AWS CDK 2.1120.0 (matches infrastructure/package.json) • Cloud — AWS CLI v2.34.40 (sha256 + PGP) + Docker CLI 29.4.3 (static) Reproducibility posture (matches backend/Dockerfile.* conventions): • Base image pinned by multi-arch sha256 OCI image-index digest (ubuntu:24.04@sha256:c4a8d5503dfb…) • Every artifact downloaded from the network is verified against a sha256 embedded as a build ARG, OR via PGP signature (AWS CLI installer) • uv copied from ghcr.io/astral-sh/uv:0.7.12 by sha256 — matches backend/Dockerfile.app-api and backend/Dockerfile.inference-api • Multi-arch build: TARGETARCH selects amd64 vs arm64 SHAs Also includes: • .devcontainer/devcontainer.json — VS Code Dev Containers config • .devcontainer/aws-cli-public-key.gpg — AWS CLI Team PGP public key (Key ID A6310ACC4672475C, valid until 2026-07-07) • .devcontainer/README.md — usage, upgrades, Docker-in-Docker caveats Verification of the build was not possible in the authoring environment (no docker/podman available); next step is a docker buildx build for both linux/amd64 and linux/arm64 followed by the verification commands listed in .devcontainer/README.md. * docs(devcontainer): document docker GID quirk in steering, drop devcontainer.json The Dockerfile bakes its docker group at GID 999 (Debian/Ubuntu default). WSL2 with Docker Desktop uses 1001, which made docker-in-docker fail with 'permission denied' on /var/run/docker.sock the first time we tested it. 
Updates .kiro/steering/dev-environment.md to: • Reference the new agentcore-devcontainer image and .devcontainer/Dockerfile instead of the previous personal docker-compose-based devcontainer • Document the docker GID gotcha with a host-by-host action table • Show the canonical 'docker run' line that auto-resolves the host's docker GID via "--group-add $(getent group docker | cut -d: -f3)" • Spell out the workspace-path map (WSL host vs nspawn vs container) so agents pass the right path to '-v' bind-mounts • Switch to inclusion: always so every agent in the repo always knows which environment to execute commands in Updates .devcontainer/README.md the same way for human readers — adds a 'The Docker GID Gotcha' section, swaps the build/run examples for ones that auto-resolve DOCKER_GID, drops the VS Code Dev Containers section. Removes .devcontainer/devcontainer.json. The team isn't on VS Code, and without a consumer the file was dead weight. * docs(steering): switch dev-environment.md to manual inclusion Not every contributor uses the dev container — some run toolchains directly on their host. Always-include would push container-only execution rules onto sessions where they don't apply. Manual inclusion lets contributors who do use the container reference the doc when they need it without affecting everyone else. * feat(devcontainer): add docker buildx CLI plugin (v0.30.1) Without the buildx plugin, Docker 23+ refuses to build any Dockerfile that uses BuildKit-only syntax — and every project Dockerfile in this repo uses 'RUN --mount=type=cache,target=/root/.cache/uv' for fast uv-cached dependency installs. Symptom before this change, when running scripts/stack-app-api/build.sh inside the dev container: Step 7/25 : RUN --mount=type=cache,target=/root/.cache/uv ... the --mount option requires BuildKit. Refer to ... And with DOCKER_BUILDKIT=1: ERROR: BuildKit is enabled but the buildx component is missing or broken. 
The fix installs the upstream docker/buildx static binary as a CLI plugin under /usr/libexec/docker/cli-plugins/docker-buildx, which 'docker build' auto-discovers and uses on Docker 23+. Pinned to v0.30.1 with sha256s for both linux-amd64 and linux-arm64, sourced directly from the docker/buildx GitHub release. Same posture as every other downloaded artifact in this Dockerfile. Verified by running scripts/stack-app-api/build.sh end-to-end inside the rebuilt dev container — 1m8s build, produced devtest-app-api:latest at 797 MB, image starts and imports fastapi/uvicorn/strands cleanly with APP_VERSION=1.0.0-beta.27 baked in. --------- Co-authored-by: Kiro <kiro@boisestate.ai>
Adds a manual GitHub Actions workflow and script to tear down all infrastructure. Stacks are destroyed in parallel (Phase 1) with InfrastructureStack destroyed last (Phase 2) since all others depend on it. Safety: requires typing DESTROY to confirm, uses environment-scoped credentials, and is manual-trigger only (workflow_dispatch). Co-authored-by: Colin <colin@boisestate.edu>
* feat(admin): curated model catalog with provider logos
Adds a curated model catalog landing page so admins can one-click add fully-configured Bedrock models (Claude Haiku/Sonnet/Opus 4.x) with pricing, modalities, and per-param specs already filled in. The "Add" flow opens a role-picker dialog for per-deployment role IDs before POSTing; "Preview & customize" hands the template to the model form via a prefill service. The list-page "Add model" CTA now routes through the catalog.
Each card carries a light/dark provider logo (Anthropic, Amazon, Meta, OpenAI) for brand recognition; OpenAI/Gemini tabs render a "Coming soon" empty state until curated entries land.
Documents the @angular/cdk/dialog conventions used here in the frontend CLAUDE.md.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(provider-logos): add dark and light SVG logos for OpenAI
* chore(model-catalog): sync OpenAI logo path to openai folder
The provider logo folder was renamed from open-aI to openai; update the PROVIDER_LOGO_DIR mapping so the OpenAI tile resolves once curated OpenAI entries land.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(model-catalog): finalize Opus 4.7 entry with real ID, pricing, params
Replaces the placeholder inference-profile ID with the real `global.anthropic.claude-opus-4-7` alias and corrects pricing to the actual $5/$25 per 1M tokens (cache $6.25 write / $0.5 read). Narrows the supported-params surface to `max_tokens` + `effort`; Opus 4.7 exposes effort control instead of explicit thinking/temperature/top_* knobs. Drops the now-obsolete TODO comment.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…te (#394)
Three threads from the same session, all rooted in the Sonnet 4.6 catalog entry not adding cleanly:
1. SupportedParams._check_thinking_invariants rejected float budgets, but DynamoDB roundtrips ints through Decimal -> float, so any stored model with a `thinking.default` failed model_validate on read. The list endpoint silently skipped invalid rows while the create endpoint's GSI key check still found them, producing "already exists" on POST + an invisible row in the list. Validator now accepts whole-number floats (coerces to int) and applies the same tolerance to the max_tokens comparison.
2. The curated Sonnet 4.6 template had thinking.default == max_tokens default (both 8192), which failed the "budget < max_tokens" invariant on create. Dropped thinking.default to 4096 to match Haiku 4.5. Also bumped the inference profile to claude-sonnet-4-6 and dropped temperature default to 0.7.
3. Manage-models page had no loading state and used native confirm() for delete. Added a spinner + error/retry block mirroring the connector-list pattern, and a new DeleteModelDialogComponent following the AddCuratedModelDialog token convention (alertdialog, red destructive action, Escape/backdrop/Cancel all converge).
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
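The whole-number tolerance described in item 1 can be sketched roughly as follows. This is an illustration only, not the actual SupportedParams validator: coerce_whole_number and check_thinking_budget are hypothetical names, and the real invariant checks may cover more fields.

```python
from decimal import Decimal
from typing import Union

Number = Union[int, float, Decimal]


def coerce_whole_number(value: Number, field: str) -> int:
    """Return value as an int, accepting floats/Decimals with no fractional part.

    Hypothetical helper: models what a validator might do after a DynamoDB
    round-trip turns 4096 into 4096.0.
    """
    if isinstance(value, bool):
        raise ValueError(f"{field} must be a number, got bool")
    if isinstance(value, int):
        return value
    if value != int(value):
        raise ValueError(f"{field} must be a whole number, got {value!r}")
    return int(value)


def check_thinking_budget(thinking_default: Number, max_tokens_default: Number) -> int:
    """Enforce the 'budget < max_tokens' invariant after coercing both sides."""
    budget = coerce_whole_number(thinking_default, "thinking.default")
    limit = coerce_whole_number(max_tokens_default, "max_tokens.default")
    if budget >= limit:
        raise ValueError("thinking budget must be smaller than max_tokens")
    return budget
```

Under these assumptions, check_thinking_budget(4096.0, 8192) returns the int 4096, while equal defaults (8192 and 8192) are rejected, matching the failure described in item 2.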
…395) Match the redesigned list-page token set on the model edit/create form (rounded-2xl, text-2xl/8 h1, ring-2 focus, dark:bg-gray-800 inputs) and drop the heavy section cards in favor of flat sections divided by border-t — same shape as the redesigned tool-form. Adds an isLoading signal so the form shows a spinner card during the edit-mode fetch instead of flashing an empty form. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…et downloads (#397) Adds 25 MB hard-fail and 10 MB soft-warning thresholds (env-tunable via ANALYZE_MAX_FILE_SIZE_BYTES / ANALYZE_WARN_FILE_SIZE_BYTES). The check runs before _download_file using the size_bytes already on file_info, so oversize files never hit S3 GetObject or base64. A logger.warning fires at module load when the thresholds are misconfigured (warn >= max). Soft warning is attached to both success and error responses for files in the 10-25 MB range. Docstring updated with the new safety limit. Closes #258 (items 1+2). Streaming (item 3) tracked separately.
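A pre-download size gate of this kind might look like the sketch below. Only the two environment variable names and the 25 MB / 10 MB defaults come from the commit message; check_file_size, the message wording, the 1024-based megabyte, and the strict ">" comparisons are assumptions.

```python
import os
from typing import Optional, Tuple

MB = 1024 * 1024  # assumption: binary megabytes

# Env-tunable thresholds, falling back to the defaults named in the commit.
MAX_BYTES = int(os.environ.get("ANALYZE_MAX_FILE_SIZE_BYTES", 25 * MB))
WARN_BYTES = int(os.environ.get("ANALYZE_WARN_FILE_SIZE_BYTES", 10 * MB))


def check_file_size(
    size_bytes: int,
    max_bytes: int = MAX_BYTES,
    warn_bytes: int = WARN_BYTES,
) -> Tuple[bool, Optional[str]]:
    """Return (allowed, message) using only the size already known from metadata.

    Runs before any download, so a rejected file never triggers a fetch.
    """
    if size_bytes > max_bytes:
        return False, f"File is {size_bytes} bytes; the limit is {max_bytes} bytes."
    if size_bytes > warn_bytes:
        return True, f"File is {size_bytes} bytes; analysis may be slow."
    return True, None
```

A caller would check the first element before downloading and attach the second element, when present, to whatever response it builds.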
…d' (#398) The denominator in the selection counter was always 20 (MAX_SELECTION) regardless of how many conversations the user has. A prior fix introduced selectableCount to cap it at the loaded session count, but with lazy loading we never know the true total — the number jumps as the user loads more, making it misleading. Removes the denominator entirely. The counter now reads '2 items selected' instead of '2 of 10 selected'. Also removes the now-unused selectableCount computed signal. Closes #163
…size error (#403)
* fix(file-upload): fix duplicate document name error misclassified as size error (#400)
- stream_processor.py: narrow size-limit classifier to require explicit size markers (too large, exceeds, document size); add specific 'duplicate document name' branch above it so the correct message is shown instead of the misleading 'file too large' message
- errors.py: add override in build_conversational_error_event for the raw-exception path (Path B) so the clean actionable message is shown regardless of which error code path the exception takes
- turn_based_session_manager.py: strip document blocks from history during compaction (same treatment as images), preventing the underlying duplicate-name condition from occurring when a user returns to the same session and re-attaches a same-named file
- prompt_builder.py: deduplicate document names within a single turn using a counter suffix (_2, _3) as belt-and-suspenders
- routes.py: deduplicate files across the files + file_upload_ids merge paths before partitioning
Fixes #400
* fix(file-upload): make document byte stripping unconditional, not compaction-gated
Extract document block byte-stripping from _truncate_tool_contents into a dedicated _strip_document_bytes method called unconditionally from initialize(), before the compaction-enabled gate.
_truncate_tool_contents is Stage 1 compaction and only runs when AGENTCORE_MEMORY_COMPACTION_ENABLED=true. Stripping document bytes is a correctness fix (prevents duplicate document name ValidationException), not a compaction optimization; it should run regardless of whether compaction is configured.
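The counter-suffix deduplication mentioned for prompt_builder.py could be sketched like this. dedupe_document_names is a hypothetical name, and the extra collision check (for a file that is literally named "report_2") is an assumption beyond what the commit message states.

```python
from typing import Dict, List, Set


def dedupe_document_names(names: List[str]) -> List[str]:
    """Make names unique within one turn by appending _2, _3, ... to repeats."""
    seen: Dict[str, int] = {}   # base name -> highest counter used so far
    used: Set[str] = set()      # every name already emitted
    result: List[str] = []
    for name in names:
        count = seen.get(name, 0) + 1
        candidate = name if count == 1 else f"{name}_{count}"
        # Skip suffixes that collide with a name already emitted.
        while candidate in used:
            count += 1
            candidate = f"{name}_{count}"
        seen[name] = count
        used.add(candidate)
        result.append(candidate)
    return result
```

The first occurrence keeps its original name, so a turn with no duplicates passes through unchanged.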
…on opt-in
Adds a "Conversation Modes" feature: admins manage a catalog of custom
system prompts ("Guided Learning", "Concise", "Caveman", etc.); users
opt in per conversation via the model-settings panel. The active mode
is appended to the base system prompt at invocation time.
Backend
- New apis.shared.system_prompts module: models / repository / service
layered to mirror the user_menu_links convention. Snake_case wire
format end-to-end. Optimistic concurrency on update so concurrent
delete + edit can't resurrect a deleted prompt.
- New admin CRUD routes at /admin/system-prompts (full prompt_text
visibility) and a user read endpoint at /system-prompts (name +
description only - prompt_text is server-side only).
- New DynamoDB table provisioned by InfrastructureStack with PK/SK
schema, point-in-time recovery, AWS-managed encryption. App API
granted full CRUD; Inference API granted GetItem only.
- Inference path resolves the active prompt via a new
system_prompt_resolver module. Gating skips on resume, continuation,
preview, and assistant-attached turns (assistants are KB-grounded
and a "mode" prompt could contradict their instructions).
- Selection precedence: request body first (so first-turn-of-new-session
works without a metadata round-trip), session preferences as fallback
(resume / refresh / new-device). Resolver mirrors the request-supplied
id back onto session preferences via a focused
set_selected_prompt_id helper that does a targeted SET (no SK
rotation, no messageCount bump).
- UpdateSessionMetadataRequest uses null-as-clear semantics on
selectedPromptId via Pydantic's model_fields_set, replacing the
short-lived clearSelectedPrompt flag.
Frontend
- Lazy-loaded SystemPromptsService driven by BFF auth state; loader
body wrapped in untracked() so the loader's signal traffic doesn't
retrigger the auth effect (fixes the load loop seen on side-menu open).
- Active prompt is bound to a session id locally so cross-session
navigation resets cleanly and home-page selections survive the
transition to a brand-new session on first submit.
- Centralised setActivePrompt(sessionId, id|null) for both the
settings panel and the chat-input chip; ToastService surfaces failed
clears instead of swallowing them in the console.
- New admin pages for managing the catalog (list + form), and a
per-conversation chip + radio group in the settings panel.
- Active prompt id is forwarded on every non-assistant submit so the
inference path doesn't depend on metadata write timing.
Tests
- 32 system_prompts tests (model, repo, service, admin + user routes)
covering snake_case wire format, Literal status validation, and
TOCTOU-safe update.
- 5 sessions update tests including null-clear, omit-leaves-unchanged,
and disabled-prompt rejection.
- 15 resolver tests covering gating, composition format, request-id
precedence, persistence side-effect, and exception swallowing.
Summary
Adds Conversation Modes: admins manage a catalog of custom system prompts (Guided Learning, Concise, Plan-First, Devil's Advocate, Citation-Strict, Caveman, etc.), and users opt in per conversation from the model-settings panel. The active mode is appended to the base system prompt at invocation time, and the selection persists for that conversation across resumes, refreshes, and devices.
Why
Assistants cover the "scoped to a body of knowledge" use case. They don't cover behavioural overlays — how the assistant works on whatever topic you're already discussing. Conversation modes fill that gap:
Topic-agnostic; one mode works across any conversation.
Mid-conversation toggle without losing context.
Admin-curated so the catalogue stays consistent across the institution.
Users see name + description only — prompt text is server-side, never exposed.
What's in the box
For admins: /admin/system-prompts CRUD UI to create, edit, enable/disable, and delete prompts. Status defaults to enabled; disabled prompts are hidden from users but kept for audit.
For users: a new "Conversation Mode" radio group in the model-settings panel and a dismissible chip in the chat input. Selection persists per-conversation and survives refresh.
For the inference path: a new system_prompt_resolver module that skips resume / continuation / preview / assistant-attached turns and otherwise appends the prompt under a consistent ## Active Mode: header.
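As a rough illustration of the composition step, the sketch below appends a mode under the ## Active Mode: header. Only the header text comes from this description; compose_system_prompt, the placement of the mode name after the header, and the blank-line spacing are assumptions.

```python
from typing import Optional


def compose_system_prompt(
    base_prompt: str,
    mode_name: Optional[str],
    mode_text: Optional[str],
) -> str:
    """Append the active mode to the base prompt, or return the base unchanged."""
    if not mode_name or not mode_text:
        # No active mode (or a gated turn): the base prompt is used as-is.
        return base_prompt
    section = f"## Active Mode: {mode_name}\n\n{mode_text.strip()}"
    return f"{base_prompt.rstrip()}\n\n{section}"
```

Keeping the base prompt as an untouched prefix means a turn with no mode produces exactly the same system prompt as before the feature existed.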
How it works
admin → /admin/system-prompts (CRUD)
        └─ DynamoDB: PROMPT# / METADATA
user  → /system-prompts (read; enabled-only, name+description only)
      → settings panel selection
        └─ frontend signal _activePromptId (bound to session id)
        └─ on submit: forwarded as selected_prompt_id on InvocationRequest
inference path
      → resolver: request id > session metadata > none
      → mirrors id back onto session preferences for resume / refresh
      → appends prompt to base system prompt
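The user read step in the flow above (enabled-only, prompt text withheld) can be sketched as a projection over stored records. The SystemPrompt dataclass, its field names, and to_user_view are hypothetical; the sketch also assumes the response carries an id so the client can reference its selection.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SystemPrompt:
    """Hypothetical stored record; the real model may differ."""
    prompt_id: str
    name: str
    description: str
    prompt_text: str
    status: str  # "enabled" or "disabled"


def to_user_view(prompts: List[SystemPrompt]) -> List[Dict[str, str]]:
    """Project records for the user endpoint: enabled only, no prompt_text."""
    return [
        {"prompt_id": p.prompt_id, "name": p.name, "description": p.description}
        for p in prompts
        if p.status == "enabled"
    ]
```

Building the response from an explicit allow-list of fields, rather than deleting prompt_text from a full record, means a field added to the model later is not exposed by accident.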
Selection precedence
Request body (current turn's choice) — handles the first turn of a brand-new session before any metadata row exists.
Session preferences (persisted) — handles resume, refresh, and new-device flows.
Skipped on resume (snapshot owns the prompt), continuation (would invalidate prompt caching), preview (live form edits drive it), and assistant-attached turns (assistants are KB-grounded with their own instructions).
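The precedence and gating rules above can be sketched as a small pure function. This is a minimal illustration, not the actual system_prompt_resolver code: TurnContext, its flag names, and resolve_prompt_id are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TurnContext:
    """Hypothetical flags describing the kind of turn being invoked."""
    is_resume: bool = False
    is_continuation: bool = False
    is_preview: bool = False
    has_assistant: bool = False


def resolve_prompt_id(
    ctx: TurnContext,
    request_prompt_id: Optional[str],
    session_prompt_id: Optional[str],
) -> Optional[str]:
    """Pick the prompt id to apply, or None when no mode should be applied."""
    # Gating: these turn types never receive a mode prompt.
    if ctx.is_resume or ctx.is_continuation or ctx.is_preview or ctx.has_assistant:
        return None
    # Precedence: the current request wins, persisted preferences are the fallback.
    if request_prompt_id is not None:
        return request_prompt_id
    return session_prompt_id
```

For example, a first turn that carries an id in the request resolves to that id even when nothing is persisted yet, and an assistant-attached turn resolves to None regardless of either source.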
Wire format
Snake_case end-to-end, matching the user_menu_links convention. selected_prompt_id uses null-as-clear semantics on the BFF metadata PUT, leveraging Pydantic's model_fields_set to distinguish "explicitly cleared" from "field omitted — leave unchanged".
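Pydantic's model_fields_set records which fields the caller explicitly supplied, which is what separates "sent as null" from "not sent". The same rule can be shown in plain Python on the raw JSON payload; apply_metadata_update is a hypothetical helper used only to illustrate the semantics, not the actual request handler.

```python
import json
from typing import Any, Dict, Optional

FIELD = "selected_prompt_id"


def apply_metadata_update(current: Optional[str], payload: Dict[str, Any]) -> Optional[str]:
    """Return the new selected prompt id after applying a metadata PUT payload.

    - key absent        -> leave the stored value unchanged
    - key present, null -> clear the selection
    - key present, id   -> replace the selection
    """
    if FIELD not in payload:
        return current
    return payload[FIELD]


# The three cases, starting from a stored selection of "p1".
unchanged = apply_metadata_update("p1", json.loads("{}"))
cleared = apply_metadata_update("p1", json.loads('{"selected_prompt_id": null}'))
replaced = apply_metadata_update("p1", json.loads('{"selected_prompt_id": "p2"}'))
```

Checking only for a None value would merge the first two cases, which is the ambiguity that a separate clear flag would otherwise have to resolve.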
Infrastructure changes
New DynamoDB table provisioned by InfrastructureStack: -system-prompts, PAY_PER_REQUEST, point-in-time recovery, AWS-managed encryption.
SSM parameters //admin/system-prompts-table-{name,arn}.
App API IAM: full CRUD on the table.
Inference API IAM: dynamodb:GetItem only.
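The difference between the two grants can be pictured as IAM policy statements. The sketch below builds them as plain dictionaries; the table ARN is a placeholder, table_statement is a hypothetical helper, and the exact action list behind "full CRUD" is an assumption (the CDK grant used in the stack may include more or fewer actions).

```python
from typing import Dict, List

# Assumption: one plausible reading of "full CRUD" on a table.
APP_API_ACTIONS: List[str] = [
    "dynamodb:GetItem",
    "dynamodb:PutItem",
    "dynamodb:UpdateItem",
    "dynamodb:DeleteItem",
    "dynamodb:Query",
    "dynamodb:Scan",
]

# From the description: the inference path only reads single items.
INFERENCE_API_ACTIONS: List[str] = ["dynamodb:GetItem"]


def table_statement(table_arn: str, actions: List[str]) -> Dict[str, object]:
    """Build one Allow statement scoped to a single table."""
    return {"Effect": "Allow", "Action": list(actions), "Resource": table_arn}


# Placeholder ARN for illustration only.
EXAMPLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/example-system-prompts"
app_api_statement = table_statement(EXAMPLE_ARN, APP_API_ACTIONS)
inference_statement = table_statement(EXAMPLE_ARN, INFERENCE_API_ACTIONS)
```

Restricting the inference role to GetItem means the inference path can read prompt text but cannot alter the catalog.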
Tests
32 system_prompts tests — model, repository, service, admin + user routes; covers snake_case wire format, Literal status validation, TOCTOU-safe update.
5 sessions update tests — null-clear, omit-leaves-unchanged, disabled-prompt rejection.
15 resolver tests — gating matrix, composition format, request-id precedence, persistence side-effect, exception swallowing.
666 backend tests pass. Frontend typecheck and Angular build clean.
Manual verification
Create / edit / disable / delete prompts as admin
Pick a mode on the home page; submit; verify prompt is applied to first turn (no metadata round-trip needed)
Pick a mode on /s/; refresh; verify chip rehydrates from persisted preferences
Switch sessions; verify chip resets and doesn't leak across conversations
Dismiss chip; verify next turn doesn't apply the prompt
Disable a prompt admin-side; verify a user with that prompt selected silently falls back (no error)
Known limitations / follow-ups
Concurrent-write window on set_selected_prompt_id: same Read-Modify-Write on the preferences map that update_session_activity already has. Documented in code. Proper fix is nested-attribute SET, which requires pre-creating the preferences map everywhere first — tracked separately.
Sub-200ms clear race: a user who clears a prompt and submits within the BFF persist round-trip can have one stale "mode applied" turn. Bounded blast radius, self-healing, deliberately not fixed by widening the wire protocol.
Pre-existing enabled_tools falsy bug (unrelated): the truthiness check "if request.enabled_tools" treats [] as "don't update," so a user who disables all tools never has that choice persisted. Worth a separate ticket.
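For the first limitation above, the nested-attribute SET that would replace the read-modify-write might be shaped like the sketch below. It only builds the keyword arguments for a boto3 Table.update_item call and sends nothing; the attribute names inside the preferences map and the key passed in are assumptions. The condition reflects the stated prerequisite: a nested SET cannot create the parent map, so the map has to exist already.

```python
from typing import Any, Dict


def build_nested_set_kwargs(key: Dict[str, Any], prompt_id: str) -> Dict[str, Any]:
    """Build update_item kwargs that set one nested attribute in place.

    Only preferences.selected_prompt_id is written, so concurrent writers
    touching other keys of the map are not overwritten.
    """
    return {
        "Key": key,
        "UpdateExpression": "SET #prefs.#sel = :pid",
        # The parent map must already exist for a nested SET to succeed.
        "ConditionExpression": "attribute_exists(#prefs)",
        "ExpressionAttributeNames": {
            "#prefs": "preferences",         # assumed attribute name
            "#sel": "selected_prompt_id",    # assumed nested key name
        },
        "ExpressionAttributeValues": {":pid": prompt_id},
    }
```

Because the expression names a single nested path, two writers updating different preference keys no longer race on the whole map, which is the window the current read-modify-write leaves open.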
Operational follow-up after deploy
The temp DynamoDB table created manually before CDK provisioned the real one needs to be deleted post-deploy. Steps:
Confirm -system-prompts exists post-deploy and SSM params resolve.
Smoke-test admin CRUD round-trip end-to-end.
Migrate any seeded prompts from the temp table (small enough to re-create through the UI).
aws dynamodb delete-table --table-name .
Out of scope
Mode stacking (Concise + Code Reviewer, etc.) — single active mode per session for v1.
Role-scoped modes (visible_to_roles) — would let modes be scoped to faculty / students / staff. Worth adding once role data is everywhere it needs to be.
Department-level catalogues — central catalogue only for v1; could grow into per-college / per-department admin scoping later.