Open
81 commits
c558d89
docs(spec): add stack-architecture-simplification spec
colinmxs May 20, 2026
f485c26
chore(specs): remove stack-architecture-simplification spec
colinmxs May 20, 2026
ca29168
test(infra): capture legacy 9-stack synth baseline + normalization ut…
colinmxs May 20, 2026
75fc2b2
test(infra): scaffold constructs/ tree + equivalence harness skeleton
colinmxs May 20, 2026
3a61fa3
refactor(infra): lift InfrastructureStack into 18 reusable constructs
colinmxs May 20, 2026
f9bb0ab
refactor(infra): lift FrontendStack into 3 reusable constructs
colinmxs May 20, 2026
13222e0
refactor(infra): lift ArtifactsStack and McpSandboxStack into 5 const…
colinmxs May 20, 2026
608c4eb
refactor(infra): lift RAG and Fine-Tuning data resources into constructs
colinmxs May 20, 2026
58ef3dc
refactor(infra): extract AssistantsTable into reusable construct
colinmxs May 20, 2026
290c0fd
refactor(infra): extract AgentCoreGateway into reusable construct
colinmxs May 20, 2026
f032a91
refactor(infra): lift remaining compute into Backend constructs
colinmxs May 20, 2026
3fd8cdd
feat(infra): implement PlatformStack + BackendStack (Phase 3)
colinmxs May 21, 2026
604fda7
feat(infra): Phase 4 — build pipeline, workflows, legacy deletion
colinmxs May 21, 2026
8ac0067
chore(infra): remove feature flags — deploy everything always
colinmxs May 21, 2026
c128580
feat: add restore-data tool + workflow
colinmxs May 21, 2026
638f8e6
merge: integrate origin/develop + port dynamic CSP to construct
colinmxs May 22, 2026
c0b73c0
refactor(infra): extract App API IAM grants into focused module
colinmxs May 22, 2026
902f5de
refactor(infra): decompose app-api monolith — 1736 → 268 lines
colinmxs May 22, 2026
3b3b6f2
refactor(infra): decompose inference-api monolith — 1513 → 794 lines
colinmxs May 22, 2026
0715472
test(infra): add 135 tests — PlatformStack, BackendStack, per-constru…
colinmxs May 22, 2026
f6ebfb1
test(infra): add 303 tests — total 391, 12 suites, all green
colinmxs May 22, 2026
33f1c23
fix(workflows): pass aws-region + credentials to configure-aws-creden…
colinmxs May 25, 2026
f034077
fix(workflows+scripts): align with devops steering doc config flow
colinmxs May 25, 2026
46ce9a5
fix(infra): move artifacts CloudFront distribution to BackendStack
colinmxs May 25, 2026
32efbdf
Merge upstream develop into main
colinmxs May 26, 2026
62dc363
fix(infra): use prefixed construct IDs for Platform/Backend stacks
colinmxs May 26, 2026
870c05c
Merge upstream develop into main
colinmxs May 26, 2026
992ad0f
fix(teardown): isolate parallel cdk destroys to avoid synth lock race
colinmxs May 26, 2026
b23b123
fix(teardown): use aws cloudformation delete-stack instead of cdk des…
colinmxs May 26, 2026
13ef29f
docs(steering): make dev container resource caps mandatory
colinmxs May 26, 2026
5160241
fix(infra): wire same-stack PlatformStack refs directly instead of vi…
colinmxs May 26, 2026
05a2e0d
fix(infra): wire same-stack BackendStack refs directly instead of via…
colinmxs May 26, 2026
e525283
feat(build): content-hash-aware Docker build pipeline
colinmxs May 26, 2026
6bac02b
style(build): align new build scripts with repo conventions
colinmxs May 26, 2026
1eb9bd4
refactor(infra): decommission 'enabled' flags and dead image-tag plum…
colinmxs May 26, 2026
05107e0
refactor(infra): purge unused config fields and dead interfaces
colinmxs May 26, 2026
6055816
feat(build): per-image jobs + tighten rag-ingestion content hash
colinmxs May 26, 2026
e05e4ed
ci: gate every deploy on its relevant test suites
colinmxs May 26, 2026
d7833e6
fix(infra): finish same-stack SSM purge + add --exclusively to deploy
colinmxs May 26, 2026
7ba3584
ci: gate every backend docker build on test-backend passing
colinmxs May 26, 2026
f38890b
test(infra): fix nightly chain assertion for new array-form needs
colinmxs May 26, 2026
0870524
fix(supply-chain): satisfy every property test in tests/supply_chain/
colinmxs May 26, 2026
c977e04
fix(infra): decommission the dead AssistantsTableConstruct
colinmxs May 27, 2026
c1cc4cf
fix(infra): drop deterministic log group + function names
colinmxs May 27, 2026
51170cd
refactor(infra): hoist AgentCore Memory + CI + Browser to PlatformStack
colinmxs May 27, 2026
0b1b806
refactor(infra): hoist AgentCore Gateway construct to PlatformStack
colinmxs May 27, 2026
f43f611
refactor(infra): hoist artifact render Lambda + distribution to Platf…
colinmxs May 27, 2026
8a5841f
refactor(infra): hoist RAG ingestion Lambda to PlatformStack with boo…
colinmxs May 27, 2026
f9923be
refactor(infra): collapse the two stacks into one
colinmxs May 27, 2026
5b3bd15
docs: rewrite devops steering doc + add cutover guide
colinmxs May 27, 2026
261aee3
refactor(infra): hoist AgentCore Runtime to bootstrap-container pattern
colinmxs May 27, 2026
df2127c
refactor(infra): hoist App API ECS to bootstrap-container pattern (Ph…
colinmxs May 27, 2026
0857b46
fix(build): rm empty mktemp file before zip writes to it
colinmxs May 27, 2026
aa2ecb5
fix(restore): five bugs found by first live-fire dry-run
colinmxs May 27, 2026
a227281
fix(restore): unwrap cognito wrapped-list JSON files
colinmxs May 27, 2026
3b0f531
fix(deploy): backslash escapes inside f-string broke task-def mutator
colinmxs May 27, 2026
6c2d5e0
fix(deploy): pass full config to update-agent-runtime, not just artifact
colinmxs May 27, 2026
1efda7f
ci: bump astral-sh/setup-uv from v6.8.0 to v8.1.0
colinmxs May 27, 2026
4b124ca
fix(ci): let frontend build skip CDK_AWS_ACCOUNT validation
colinmxs May 27, 2026
d9535d3
fix(frontend): sync from dist/ai.client/browser/ where index.html lives
colinmxs May 27, 2026
b6ebaa7
fix(restore): drop Cognito-immutable attrs before AdminCreateUser
colinmxs May 27, 2026
7e5ad1d
feat(restore): cross-pool Cognito sub remapping for DDB + S3
colinmxs May 27, 2026
e360b62
docs: update README, ACTIONS readme, and infrastructure README for si…
colinmxs May 27, 2026
850e045
docs: add upgrade guide from multi-stack to single-stack architecture
colinmxs May 28, 2026
554698e
docs: update all steering docs + claude skills for single-stack archi…
colinmxs May 28, 2026
90ad955
ci: temporarily disable auto-push triggers on backend/platform/frontend
colinmxs May 28, 2026
d9f3b30
test+fix: backup/restore coverage verification (reflection test)
colinmxs May 28, 2026
7c4ec93
fix(restore): transition restored users to CONFIRMED state
colinmxs May 28, 2026
0494892
feat(restore): replay AgentCore Memory events with sub remap
colinmxs May 29, 2026
aee0b75
fix(iam): use real AgentCore Memory action names in role policies
colinmxs May 29, 2026
638f851
fix(iam): replace InvokeBrowser + non-custom resource ARNs
colinmxs May 29, 2026
08de06e
refactor(infra): kill same-stack SSM publish-then-read deadlock
colinmxs May 29, 2026
d479081
fix: stop tracking manifest.json scratch file
colinmxs May 29, 2026
5e14563
fix(env): correct App API container env var mapping mismatches
colinmxs May 29, 2026
7e5d004
test(supply-chain): assert CDK env vars match what Python reads
colinmxs May 29, 2026
11513cd
refactor(infra): inline SPA distribution; strip refactor-history comm…
colinmxs May 29, 2026
4bbff4f
test+ci: synth-time SSM safety check + composite build action
colinmxs May 29, 2026
92c87e0
fix(iam): restore OAuth/connector grants dropped during decomposition
colinmxs May 29, 2026
435bab3
fix(infra): SSM-resolved compute image URIs to close bootstrap-revert…
colinmxs May 29, 2026
9932805
review: docs refresh + restored security tests + 5 cleanups
colinmxs May 29, 2026
c6aaa2f
Merge upstream/develop into feature/stack-architecture-simplification
colinmxs Jun 1, 2026
88 changes: 44 additions & 44 deletions .claude/skills/cdk-infrastructure/SKILL.md
@@ -1,6 +1,6 @@
---
name: cdk-infrastructure
description: AWS CDK infrastructure development with TypeScript. Use when creating or modifying CDK stacks, constructs, DynamoDB tables, ECS/Fargate services, Lambda functions, S3 buckets, networking, IAM roles, or any CloudFormation resources. Covers configuration patterns, cross-stack references via SSM, naming conventions, and Bedrock AgentCore integration.
description: AWS CDK infrastructure development with TypeScript. Use when creating or modifying CDK constructs, DynamoDB tables, ECS/Fargate services, Lambda functions, S3 buckets, networking, IAM roles, or any CloudFormation resources. Covers configuration patterns, single-stack architecture, naming conventions, and Bedrock AgentCore integration.
---

# AWS CDK Infrastructure Best Practices
@@ -11,41 +11,47 @@ description: AWS CDK infrastructure development with TypeScript. Use when creati
- Import from `aws-cdk-lib` and `constructs`
- Use L2 constructs when available, L1 (Cfn*) when necessary

## Stack Organization
## Architecture — Single Stack

The entire application is provisioned by **one CDK stack** (`PlatformStack`). Application code is shipped out-of-band via AWS APIs (ECR push → ECS service update / Lambda code update / AgentCore Runtime update).

```
infrastructure/
├── bin/infrastructure.ts # App entrypoint
├── bin/infrastructure.ts # App entrypoint (instantiates PlatformStack)
├── lib/
│ ├── config.ts # Configuration loader
│ ├── infrastructure-stack.ts # Network resources (deploy first)
│ ├── app-api-stack.ts # Backend services
│ └── my-new-stack.ts # New stacks go here
└── cdk.context.json # Configuration
│ ├── platform-stack.ts # The one stack — all infrastructure
│ ├── config.ts # Configuration loader & validator
│ └── constructs/ # 39 reusable CDK constructs
│ ├── network/ # VPC, ALB, ECS cluster
│ ├── identity/ # Cognito, secrets, KMS, OAuth
│ ├── data/ # DynamoDB tables, file uploads
│ ├── rag/ # RAG documents, vectors
│ ├── rag-ingestion/ # RAG ingestion Lambda
│ ├── artifacts/ # Artifact rendering pipeline
│ ├── mcp-sandbox/ # MCP Apps sandbox proxy
│ ├── agentcore/ # Memory, Code Interpreter, Browser, Gateway
│ ├── inference-api/ # AgentCore Runtime
│ ├── app-api/ # Fargate service
│ ├── fine-tuning/ # SageMaker IAM
│ ├── spa/ # SPA CloudFront distribution
│ └── zones/ # Route53, ALB DNS
└── cdk.context.json # Configuration defaults
```

**Deployment Order:**
1. `InfrastructureStack` - VPC, ALB, ECS Cluster (always first)
2. Other stacks import network resources via SSM
**Key principle:** CDK deploys are rare (infrastructure changes only). Day-to-day code changes deploy via `backend.yml` (AWS API calls, no CDK).

## Configuration

Use the centralized config system:

```typescript
import { loadConfig, getResourceName, getStackEnv, applyStandardTags } from './config';
```

export class MyStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    const config = loadConfig(scope);
    super(scope, id, {
      ...props,
      env: getStackEnv(config),
      stackName: getResourceName(config, 'my-stack'),
    });
    applyStandardTags(this, config);
  }
}
PlatformStack receives config via props:
```typescript
const config = loadConfig(app);
new PlatformStack(app, `${config.projectPrefix}-PlatformStack`, { config, env });
```

For configuration patterns, see [references/configuration.md](references/configuration.md).
@@ -57,31 +63,25 @@ For configuration patterns, see [references/configuration.md](references/configu
getResourceName(config, 'user-quotas') // "bsu-agentcore-user-quotas"
```

**SSM Parameters:** Hierarchical naming:
**SSM Parameters:** Hierarchical naming for runtime consumption:
```
/{projectPrefix}/{category}/{resource-type}
```

Categories: `/network/`, `/quota/`, `/cost-tracking/`, `/auth/`, `/frontend/`, `/gateway/`
Categories: `/network/`, `/quota/`, `/cost-tracking/`, `/auth/`, `/frontend/`, `/gateway/`, `/rag/`, `/artifacts/`
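
The two naming schemes above can be sketched as plain string helpers. This is a minimal illustration, not the repo's actual `getResourceName`: the `NamingConfig` shape and the `resourceName` and `ssmParameterName` functions are hypothetical, and the `bsu-agentcore` prefix is taken from the example comment above.

```typescript
// Hypothetical config shape: only the field needed for naming.
interface NamingConfig {
  projectPrefix: string;
}

// Sketch of a resource-name helper: "<projectPrefix>-<name>".
function resourceName(config: NamingConfig, name: string): string {
  return `${config.projectPrefix}-${name}`;
}

// Sketch of the hierarchical SSM path: /{projectPrefix}/{category}/{resource-type}.
// Leading and trailing slashes on each segment are trimmed so that both
// "network" and "/network/" produce the same path.
function ssmParameterName(
  config: NamingConfig,
  category: string,
  resourceType: string,
): string {
  const trim = (segment: string): string => segment.replace(/^\/+|\/+$/g, '');
  return `/${config.projectPrefix}/${trim(category)}/${trim(resourceType)}`;
}

const namingConfig: NamingConfig = { projectPrefix: 'bsu-agentcore' };
console.log(resourceName(namingConfig, 'user-quotas'));
console.log(ssmParameterName(namingConfig, '/network/', 'vpc-id'));
```

Keeping both helpers keyed on the same `projectPrefix` means that a resource and the parameter describing it can be located from the prefix alone.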

## Cross-Stack References
## Cross-Construct References

**Export:**
```typescript
new ssm.StringParameter(this, 'VpcIdParam', {
  parameterName: `/${config.projectPrefix}/network/vpc-id`,
  stringValue: vpc.vpcId,
});
```
Since everything is in one stack, use **typed props** — not SSM:

**Import:**
```typescript
const vpcId = ssm.StringParameter.valueForStringParameter(
  this,
  `/${config.projectPrefix}/network/vpc-id`
);
// In PlatformStack:
const network = new NetworkConstruct(this, 'Network', { config });
new AlbConstruct(this, 'Alb', { config, vpc: network.vpc });
```

SSM parameters are published **only for runtime consumption** by ECS tasks and Lambdas — never for CDK-to-CDK references within the same stack.
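
The typed-props idea can be shown without CDK at all. The sketch below uses stand-in types so it runs on its own; `Vpc`, `AppConfig`, `NetworkSketch`, and `AlbSketch` are hypothetical placeholders for the real constructs, which are not reproduced here.

```typescript
// Stand-in types so the sketch runs without aws-cdk-lib. In the real stack
// these would be CDK constructs and a VPC object, not plain records.
interface Vpc {
  vpcId: string;
}

interface AppConfig {
  projectPrefix: string;
}

// Producer: creates the shared resource and exposes it as a typed property.
class NetworkSketch {
  readonly vpc: Vpc;

  constructor(props: { config: AppConfig }) {
    this.vpc = { vpcId: `${props.config.projectPrefix}-vpc` };
  }
}

// Consumer: declares its dependency in the props interface.
interface AlbSketchProps {
  config: AppConfig;
  vpc: Vpc;
}

class AlbSketch {
  readonly vpcId: string;

  constructor(props: AlbSketchProps) {
    // The dependency arrives as a typed value; no parameter lookup is involved.
    this.vpcId = props.vpc.vpcId;
  }
}

const sketchConfig: AppConfig = { projectPrefix: 'demo' };
const networkSketch = new NetworkSketch({ config: sketchConfig });
const albSketch = new AlbSketch({ config: sketchConfig, vpc: networkSketch.vpc });
console.log(albSketch.vpcId);
```

With this shape, leaving out `vpc` fails type checking, whereas a mistyped parameter path in a string lookup would typically only surface when the stack is deployed.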

## DynamoDB Tables

- Always use PK + SK for flexibility
@@ -93,10 +93,11 @@ For table patterns, see [references/dynamodb.md](references/dynamodb.md).

## ECS/Fargate

- Import cluster from SSM
- Cluster created by NetworkConstruct, referenced via typed prop
- Health checks mandatory
- Auto-scaling with CPU/memory targets
- Circuit breaker for rollback
- Bootstrap container pattern: CDK creates the service with a placeholder image; the backend workflow pushes the real image via `update-service`

For service patterns, see [references/ecs-fargate.md](references/ecs-fargate.md).
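
The image swap behind the bootstrap container pattern can be sketched as a pure function over container definitions. The workflow's actual mutator is not shown in this diff, so everything here is an assumption for illustration: `ContainerDefinitionSketch`, `withImage`, the container names, and the registry URI are all hypothetical.

```typescript
// Minimal stand-in for an ECS container definition: only the two fields
// the sketch needs.
interface ContainerDefinitionSketch {
  name: string;
  image: string;
}

// Returns a new list in which the named container points at the new image.
// The input list is left untouched so the previous revision stays inspectable.
function withImage(
  containers: ContainerDefinitionSketch[],
  containerName: string,
  imageUri: string,
): ContainerDefinitionSketch[] {
  let found = false;
  const updated = containers.map((container) => {
    if (container.name !== containerName) {
      return container;
    }
    found = true;
    return { ...container, image: imageUri };
  });
  if (!found) {
    throw new Error(`container "${containerName}" not found in task definition`);
  }
  return updated;
}

const bootstrapContainers: ContainerDefinitionSketch[] = [
  { name: 'app-api', image: 'placeholder' },
  { name: 'sidecar', image: 'sidecar:1' },
];
const deployedContainers = withImage(
  bootstrapContainers,
  'app-api',
  'registry.example/app-api:abc123',
);
console.log(deployedContainers[0].image);
```

Failing loudly when the container name is missing guards against a deploy that silently leaves the placeholder image in place.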

@@ -105,6 +106,7 @@ For service patterns, see [references/ecs-fargate.md](references/ecs-fargate.md)
- Use ARM64 architecture (cost optimization)
- Role with least privilege
- Secrets Manager access requires wildcard suffix
- Bootstrap pattern: CDK creates the function with placeholder code; the backend workflow pushes real code via `update-function-code`

For Lambda patterns, see [references/lambda.md](references/lambda.md).

@@ -138,19 +140,17 @@ name: getResourceName(config, 'memory').replace(/-/g, '_')
resources: [`${secret.secretArn}*`]
```

**Environment Removal Policy:**
**Removal Policy:**
```typescript
removalPolicy: config.environment === 'prod'
  ? cdk.RemovalPolicy.RETAIN
  : cdk.RemovalPolicy.DESTROY
removalPolicy: getRemovalPolicy(config) // RETAIN in prod, DESTROY in dev
```
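
The body of `getRemovalPolicy` is not shown in this diff. A minimal sketch, assuming it keeps the same `environment === 'prod'` check as the inline expression it replaces, could look like this; `RemovalPolicySketch`, `EnvConfig`, and `getRemovalPolicySketch` are hypothetical stand-ins so the example runs without `aws-cdk-lib`.

```typescript
// Stand-in for cdk.RemovalPolicy so the sketch has no CDK dependency.
type RemovalPolicySketch = 'RETAIN' | 'DESTROY';

// Hypothetical config shape: only the field the decision depends on.
interface EnvConfig {
  environment: string;
}

// Assumed behavior: keep data in prod, clean up everywhere else.
function getRemovalPolicySketch(config: EnvConfig): RemovalPolicySketch {
  return config.environment === 'prod' ? 'RETAIN' : 'DESTROY';
}

console.log(getRemovalPolicySketch({ environment: 'prod' }));
console.log(getRemovalPolicySketch({ environment: 'dev' }));
```

Centralizing the check in one helper means that a future change, such as retaining data in a staging environment, touches one function instead of every construct.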

## CDK Commands

```bash
cd infrastructure
npm install # Install dependencies
npm ci # Install dependencies
npx cdk synth # Synthesize CloudFormation
npx cdk deploy --all # Deploy all stacks
npx cdk deploy {prefix}-PlatformStack # Deploy
npx cdk diff # Preview changes
```
80 changes: 72 additions & 8 deletions .github/README-ACTIONS.md
@@ -6,19 +6,60 @@ Deploy a production-ready multi-agent AI platform to your AWS account in about 4
>
> ### 👉 [Start here — Step 1: Prerequisites](./docs/deploy/step-01-prerequisites.md)

## Architecture Overview

The platform uses a **single-stack architecture**:

- **One CDK stack** (`PlatformStack`) provisions all AWS infrastructure — VPC, ALB, DynamoDB, S3, Cognito, CloudFront, AgentCore, ECS, Lambdas.
- **Application code ships out-of-band** via AWS APIs — not via CDK deploys. This means infrastructure changes (rare) and code changes (frequent) are completely decoupled.

## What You'll Deploy

### Infrastructure (via `platform.yml`)

| Component | Description |
|-----------|-------------|
| **VPC + ALB + ECS** | Networking, load balancer, and container orchestration |
| **Fine-Tuning** *(optional)* | SageMaker training/inference infrastructure, S3 artifact storage, DynamoDB job tracking |
| **Artifacts** *(optional)* | Iframe-isolated artifact rendering (DDB metadata, S3 content, CloudFront at `artifacts.{domain}`, Lambda render service) |
| **RAG Ingestion** | Document ingestion pipeline for retrieval-augmented generation |
| **Inference API** | Strands Agent runtime powered by AWS Bedrock AgentCore |
| **App API** | Backend REST API for chat, sessions, admin, and auth |
| **Frontend** | Angular SPA served via CloudFront CDN |
| **Gateway** | Lambda-based MCP tool endpoints (Wikipedia, ArXiv, Finance, etc.) |
| **Bootstrap Data** | Auth provider config, default models, roles, and tools |
| **DynamoDB** | ~24 tables for all application state |
| **S3** | 6 buckets (file uploads, RAG, artifacts, SPA, MCP sandbox, fine-tuning) |
| **Cognito** | User pool, identity providers, BFF app client |
| **CloudFront** | SPA distribution + artifacts iframe origin + MCP sandbox proxy |
| **AgentCore** | Memory, Code Interpreter, Browser, Gateway, Runtime |
| **SageMaker IAM** | Execution role for fine-tuning jobs |

### Application Code (via `backend.yml`)

| Service | Deploy Method |
|---------|--------------|
| **App API** | ECR push → ECS service update |
| **Inference API** | ECR push → AgentCore Runtime update |
| **RAG Ingestion** | ECR push → Lambda update-function-code |
| **Artifact Render** | Zip → Lambda update-function-code |

### Frontend (via `frontend-deploy.yml`)

| Component | Deploy Method |
|-----------|--------------|
| **Angular SPA** | S3 sync + CloudFront invalidation |

### Bootstrap Data (via `bootstrap-data-seeding.yml`)

| Component | Description |
|-----------|-------------|
| **Seed data** | Auth provider config, default models, roles, and tools |

---

## Workflows

| Workflow | Trigger | What it does |
|---------|---------|--------------|
| `platform.yml` | Infra code changes / manual | `cdk deploy` — provisions or updates all AWS resources |
| `backend.yml` | Backend code changes / manual | Builds Docker images (content-hash skip), pushes to ECR, updates ECS/Lambda/Runtime |
| `frontend-deploy.yml` | Frontend code changes / manual | Builds Angular SPA, syncs to S3, invalidates CloudFront |
| `bootstrap-data-seeding.yml` | Manual | Seeds DynamoDB with default config (first deploy only) |
| `teardown.yml` | Manual | Destroys the CDK stack (for cleanup) |
| `nightly-deploy-pipeline.yml` | Nightly / manual | Full end-to-end: platform → backend → frontend |

---

@@ -39,6 +80,29 @@ Follow each step in order. Click a step to open its guide.

---

## Deploy Order (First Time)

```
1. platform.yml → provisions all AWS infrastructure (~15 min)
2. backend.yml → builds + deploys all container images (~5 min)
3. frontend-deploy.yml → builds + deploys the Angular SPA (~2 min)
4. bootstrap-data-seeding.yml → seeds default config data (~1 min)
```

After the first deploy, `platform.yml` only needs to run when infrastructure changes. Day-to-day pushes trigger `backend.yml` and/or `frontend-deploy.yml` automatically.

---

## Content-Hash Docker Builds

The `backend.yml` workflow uses **content-hash tagging** — each image is tagged with a SHA-256 hash of its source inputs (Dockerfile + source tree + dependency manifests). If ECR already has an image with that tag, the build is skipped entirely. This means:

- Pushing a frontend-only change doesn't rebuild any Docker images
- Pushing a change to one service only rebuilds that service's image
- Unchanged services deploy in seconds (just verifies the tag exists)
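
The skip logic can be sketched as follows. This is an illustration under stated assumptions, not the contents of `scripts/build/build-one.sh`: `BuildInput`, `contentHashTag`, and `shouldBuild` are hypothetical names, and the set of existing tags stands in for a registry lookup.

```typescript
import { createHash } from 'node:crypto';

// One source input that influences the image: a path and its contents.
interface BuildInput {
  path: string;
  content: string;
}

// Derives a deterministic tag from all inputs. Inputs are sorted by path so
// the tag does not depend on the order in which files were collected.
function contentHashTag(inputs: BuildInput[]): string {
  const hash = createHash('sha256');
  const sorted = [...inputs].sort((a, b) =>
    a.path < b.path ? -1 : a.path > b.path ? 1 : 0,
  );
  for (const input of sorted) {
    // NUL separators keep "ab" + "c" distinct from "a" + "bc".
    hash.update(input.path);
    hash.update('\0');
    hash.update(input.content);
    hash.update('\0');
  }
  return hash.digest('hex');
}

// A build is needed only when the registry does not already hold the tag.
function shouldBuild(tag: string, existingTags: Set<string>): boolean {
  return !existingTags.has(tag);
}

const hashInputs: BuildInput[] = [
  { path: 'Dockerfile', content: 'FROM python:3.12-slim' },
  { path: 'src/main.py', content: 'print("hello")' },
];
const hashTag = contentHashTag(hashInputs);
console.log(shouldBuild(hashTag, new Set<string>()));
console.log(shouldBuild(hashTag, new Set<string>([hashTag])));
```

Because the tag depends only on the inputs, two commits with identical service sources resolve to the same tag, which is what allows the build to be skipped.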

---

## Quick Links

- [Troubleshooting](./docs/deploy/troubleshooting.md) — common issues and fixes
52 changes: 52 additions & 0 deletions .github/actions/build-and-push-image/action.yml
@@ -0,0 +1,52 @@
name: 'Build and push ECR image'
description: |
  Run scripts/build/build-one.sh for one of the project's images,
  with conditional QEMU + buildx setup for arm64 builds (AgentCore
  Runtime). Wraps the configure-aws-credentials composite + the
  build script call so per-image jobs in backend.yml stay short.
inputs:
  image-name:
    description: 'Image to build (app-api, inference-api, rag-ingestion)'
    required: true
  platform:
    description: 'Build platform (linux/amd64 or linux/arm64). arm64 sets up QEMU + buildx.'
    required: false
    default: 'linux/amd64'
  aws-region:
    description: 'AWS region for ECR push'
    required: true
  aws-role-arn:
    description: 'AWS role ARN for OIDC auth (preferred)'
    required: false
  aws-access-key-id:
    description: 'AWS access key ID fallback'
    required: false
  aws-secret-access-key:
    description: 'AWS secret access key fallback'
    required: false
outputs:
  image_tag:
    description: 'Content-hash image tag pushed to ECR'
    value: ${{ steps.build.outputs.image_tag }}
runs:
  using: 'composite'
  steps:
    - uses: ./.github/actions/configure-aws-credentials
      with:
        aws-region: ${{ inputs.aws-region }}
        aws-role-arn: ${{ inputs.aws-role-arn }}
        aws-access-key-id: ${{ inputs.aws-access-key-id }}
        aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
    # Inference API runs on AgentCore Runtime (linux/arm64). On the
    # amd64 GitHub-hosted runner we need QEMU + buildx emulation so
    # `docker build --platform linux/arm64` works.
    - if: inputs.platform == 'linux/arm64'
      uses: docker/setup-qemu-action@ce360397dd3f832beb865e1373c09c0e9f86d70a # v4.0.0
      with:
        platforms: arm64
    - if: inputs.platform == 'linux/arm64'
      uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4.0.0
    - name: Build & push ${{ inputs.image-name }}
      id: build
      shell: bash
      run: bash scripts/build/build-one.sh ${{ inputs.image-name }}
13 changes: 7 additions & 6 deletions .github/copilot-instructions.md
@@ -28,17 +28,18 @@ npx eslint src/ && npx prettier --check src/

### Infrastructure (`cd infrastructure`)
```bash
npm ci && npm run build
npx cdk synth # validates stacks
npx cdk deploy --all
npm test -- test/stack-dependencies.test.ts # verifies new stacks are registered
npm ci && npx tsc --noEmit
npx jest # CDK construct + stack tests
npx cdk synth # validates the stack
npx cdk deploy {prefix}-PlatformStack # deploy the single stack
```

## Architecture — the big picture

- **Three independent backend consumers** of `apis.shared`: `app_api`, `inference_api`, and `agents/`. They must **never import from each other** — only from `apis.shared`. Enforced by `backend/tests/architecture/test_import_boundaries.py`.
- **Inference API runs inside an AgentCore Runtime container.** The runtime data plane only proxies `POST /invocations` and `GET /ping` — any other route returns 404 in cloud (works locally because `localhost:8001` bypasses the gateway). User-facing CRUD endpoints **belong in app-api**, not inference-api. To get workload context on app-api, use the `AGENTCORE_RUNTIME_WORKLOAD_NAME` mint fallback in `apis/shared/oauth/agentcore_identity.py`.
- **Deploy order** (cross-stack SSM references): Infrastructure → (Gateway, RAG Ingestion, SageMaker Fine-Tuning, Artifacts, MCP Sandbox in parallel) → Inference API → App API → Frontend. App API reads `runtime-workload-identity-name` from SSM, published by Inference API.
- **Single CDK stack.** All AWS resources live in one `PlatformStack` (`infrastructure/lib/platform-stack.ts`). No cross-stack SSM references between CDK stacks; values that flow between constructs go through the typed `PlatformComputeRefs` interface.
- **Deploy order:** `platform.yml` (CDK, only when infra changes) → `backend.yml` (per-image build + AWS-API code deploy: app-api, inference-api, rag-ingestion, artifact-render in parallel) → `frontend-deploy.yml` (S3 sync + CloudFront invalidation). Day-to-day code changes only re-run `backend.yml`. Compute image URIs are read from SSM at CFN deploy time (`/{prefix}/app-api/image-tag`, `/{prefix}/inference-api/image-tag`) so any task-def or Runtime property change picks up whatever image the build pipeline most recently pushed.
- **Errors stream as assistant messages over SSE**, not HTTP error codes. See SSE event table in `CLAUDE.MD` (`message_start`, `content_block_*`, `tool_use`/`tool_result`, `ui_resource`, `stream_error`, `oauth_required`, `compaction`, `done`).
- **Multi-protocol tools:** direct/AWS-SDK tools live in `agents/main_agent/tools/`; remote tools come via MCP+SigV4 (Gateway Lambda) or A2A (Runtime). A2A is currently **client-only**; if exposing an A2A server, `capabilities` must include `streaming=True` or clients hang.
- **Frontend is signal-based** throughout (`signal()`, `computed()`). API shapes are defined by backend routes; matching TS interfaces must be updated in the same PR as breaking backend changes.
@@ -63,7 +64,7 @@ npm test -- test/stack-dependencies.test.ts # verifies new stacks are register
| Shared backend code | `backend/src/apis/shared/<domain>/` |
| Lambda for an infra stack | `backend/src/lambdas/<lambda-name>/` (not part of `apis/` boundary) |
| Angular page | `frontend/ai.client/src/app/<feature>/` |
| New CDK stack | `infrastructure/lib/<stack-name>-stack.ts` — also register in `test/stack-dependencies.test.ts` with a tier, add `scripts/stack-<name>/`, add a workflow under `.github/workflows/`, update `.github/docs/deploy/step-04-deploy.md` |
| New CDK construct | `infrastructure/lib/constructs/<area>/<name>-construct.ts` — compose into `PlatformStack` (`lib/platform-stack.ts`); if it exposes values to compute constructs, thread them through `PlatformComputeRefs` rather than SSM. There are no separate CDK *stacks* anymore. |

## Debugging cheatsheet
