
ci: add test workflow for PR gate #22

Closed
Mingholy wants to merge 589 commits into main from ci/add-test-workflow



@Mingholy Mingholy commented Jun 5, 2026

Owner

Summary

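  • Add .github/workflows/ci.yml, which automatically runs the server unit tests when a PR targets main
  • Use a matrix strategy to cover both ubuntu-latest and macos-latest
  • fail-fast: false ensures that a failure on one platform does not cancel the other platform's run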

Permission gate (requires manual configuration)

Once the workflow file is in place, configure the following in the repo Settings:

Settings → Actions → General → Fork pull request workflows from outside collaborators
→ select "Require approval for all outside collaborators"

Effect:

  • A collaborator opens a PR → CI runs automatically
  • An external fork PR → waits for a collaborator to click "Approve and run" in the Actions tab

Test plan

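  • After the PR is created, check whether Actions automatically triggers the tests on both platforms
  • Confirm whether the server unit tests pass in the CI environment
  • Later, open a PR from an external fork to verify that the approval gate blocks it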

🤖 Generated with Claude Code

panlilu and others added 30 commits May 25, 2026 08:23
…, and clear goals; enhance UI to display active goals
…r; implement server management and logging features
…rsive and listFilesTree methods for improved file management
MCP tokens were stored under a derived `MCP_<NAME>_TOKEN` convention,
surfaced via a dedicated Settings page panel, and discovered through
plugin `.mcp.json` walking. After this change, MCP tokens are
indistinguishable from any other vault env: server config goes in
tier `settings.json` mcpServers (single source = loop's merged
settings.json), and the env name is parsed directly from the server's
`Authorization: Bearer ${VAR}` header.

- Drop `mcpServerEnvVarName` / `.mcp.json` reads; add strict
  `parseBearerEnvName` that requires `^Bearer ${VAR}$` (case-insensitive
  Bearer, uppercase env ref, single ref only)
- `/api/mcp-servers` returns a flat list from merged settings.json with
  `authTokenEnv` + `authed` (existence-only check, no validity probe)
- Replace `GET /api/mcp-auth` + `DELETE /api/mcp-auth/:server` with the
  generic `DELETE /api/envs/:name`
- Compose fills `mcpServers` defaults from each enabled plugin's own
  `settings.json` (plugins are lowest priority — team/profile/personal
  always win on same key)
- Drop Settings → MCP tab; popover (/mcp) becomes the only entry point,
  with auth / re-auth / forget actions
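
For illustration, the strict header rule described above could be sketched as follows. This is a hypothetical re-implementation, not the repository's actual `parseBearerEnvName`; the exact whitespace handling (a single space after the scheme) and the allowed env-name characters are assumptions.

```typescript
// Hypothetical sketch of a strict "Bearer ${VAR}" parser.
// Assumptions: exactly one space after the scheme, and env names
// limited to uppercase letters, digits and underscores.
// The scheme is matched case-insensitively by spelling out both cases,
// because the /i flag would also make the [A-Z] env-name class case-insensitive.
const BEARER_ENV_REF = /^[Bb][Ee][Aa][Rr][Ee][Rr] \$\{([A-Z_][A-Z0-9_]*)\}$/;

// Returns the referenced env name, or null when the header is not
// exactly one uppercase ${VAR} reference behind a Bearer scheme.
function parseBearerEnvName(header: string): string | null {
  const match = BEARER_ENV_REF.exec(header);
  return match?.[1] ?? null;
}
```

Anchoring both ends of the pattern is what enforces "single ref only": a header with a literal token, a lowercase reference, or two references falls through to `null` instead of being partially matched.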
Pairs with production_start.sh / start.sh. Targets the dev/prod server
tree (bun run dev → --hot index.ts + vite, or production_start.sh), plus
sandboxed bwrap children. Conservative matching: cmd-line patterns
anchored to the repo / LOOPAT_HOME, /proc/*/cwd inside the repo
filtered to {bun,node,vite}. Excludes `claude` so the user's
interactive CLI session in this dir survives.

SIGTERM first, escalates to SIGKILL after 7.5s. --dry-run previews.
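
The SIGTERM-then-SIGKILL escalation can be sketched in TypeScript as below. The script in this commit pairs with shell scripts and its real implementation is not shown here, so every name (`ProcessControl`, `terminateWithEscalation`) is hypothetical; only the ordering and the 7.5s grace period come from the commit message.

```typescript
type Signal = "SIGTERM" | "SIGKILL";

// Hypothetical seam around the OS so the escalation logic stays testable.
// In a real script these could wrap process.kill(pid, signal),
// process.kill(pid, 0) inside a try/catch, and setTimeout.
interface ProcessControl {
  kill(pid: number, signal: Signal): void;
  isAlive(pid: number): boolean;
  schedule(fn: () => void, delayMs: number): void;
}

// Ask politely first; force-kill only if the process is still alive
// once the grace period (7.5s by default, per the commit message) has elapsed.
function terminateWithEscalation(
  pid: number,
  control: ProcessControl,
  graceMs = 7500,
): void {
  control.kill(pid, "SIGTERM");
  control.schedule(() => {
    if (control.isAlive(pid)) control.kill(pid, "SIGKILL");
  }, graceMs);
}
```

A `--dry-run` mode would fit the same shape by passing a `kill` that only logs the pid and signal.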
Co-Authored-By: xubai2537 <xubai2537@gmail.com>
Co-Authored-By: xubai2537 <xubai2537@gmail.com>
- Add per-user gateway tokens (gateway-tokens.ts) replacing shared env-var auth
- Rework external-gateway.ts: remove recentTurns, persist metadata/traceId to
  loop meta, add mutex lock for thread-loop mapping, use session internal queue
  instead of 409 rejection, authenticate via per-user tokens
- Add gateway token management API (GET/POST/DELETE /api/gateway-tokens)
- Add Gateway Tokens tab in Settings UI for token CRUD
- Add lastExternalMeta to LoopMeta type
- Remove all LOOPAT_GATEWAY_* env-var references from loopat codebase
- Rename mock env to LOOPAT_RUNTIME_MOCK

Co-Authored-By: xubai2537 <xubai2537@gmail.com>
Adds /api/v1/loops/* (CRUD + SSE messages + watch events + choices +
interrupt) and /api/v1/me/tokens for hashed-at-rest API tokens with
stable tokenId. Same endpoints accept either session cookie (web) or
Bearer token (bots) — see docs/api-v1.md.
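
A minimal sketch of what "hashed-at-rest with stable tokenId" can look like is given below. The token format (`tokenId.secret`), the choice of SHA-256, and all function names are assumptions for illustration; docs/api-v1.md remains the reference for the actual scheme.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

interface StoredToken {
  tokenId: string; // stable identifier, safe to list and to use for revocation
  hash: string;    // SHA-256 hex digest; the plaintext is never persisted
}

function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

// Hypothetical issuance: the plaintext is returned to the caller once,
// and only the tokenId plus the digest are kept.
function issueToken(): { plaintext: string; record: StoredToken } {
  const tokenId = randomBytes(8).toString("hex");
  const secret = randomBytes(32).toString("hex");
  const plaintext = `${tokenId}.${secret}`; // assumed format
  return { plaintext, record: { tokenId, hash: sha256Hex(plaintext) } };
}

// Compare digests in constant time to avoid leaking how many bytes matched.
function verifyToken(presented: string, record: StoredToken): boolean {
  const expected = Buffer.from(record.hash, "hex");
  const actual = Buffer.from(sha256Hex(presented), "hex");
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

Because the `tokenId` is stored separately from the digest, a token can be listed or revoked by id without the server ever needing the plaintext again.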

Web NewLoopDialog now POSTs /api/v1/loops; Settings → API tokens uses
v1 too. Chat streaming (useLoopRuntime / WS) is deliberately deferred.

Removes the older /api/runtime/v1/turn/stream + /api/gateway-tokens
endpoints (cherry-picked from internal CR) — superseded by v1.

Tests:
- 31 new bun integration tests (auth, CRUD, isolation, validation)
- 1 new Playwright e2e: create-loop dialog hits /api/v1/loops
- All existing 246 server tests + 9 e2e remain green
…r features)

- useLoopRuntime opens v1 SSE GET /events as a parallel live channel and
  dedupes vs WS by SDK uuid. Send paths (user message, interrupt, choice
  answer) now hit /api/v1/messages / interrupt / choices/{id} instead of
  ws.send. Operator features (queue, goal, provider, thinking, clear)
  stay on WS — they're not chat semantics, per spec.
- v1 POST /messages accepts permission_mode for parity with the web's
  PlanMode toggle.
- v1 SSE emits sdk_message pass-through of raw SDK messages, marked in
  spec as web-internal / unstable shape; stable bot-facing events
  (assistant_delta, requires_choice, etc.) unchanged.
- e2e: new test exercises full v1 chain — opens loop, confirms
  /events SSE subscription fires, sends a message, confirms POST hits
  /api/v1/loops/{id}/messages with content + permission_mode.
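
The uuid-based dedupe between the WS and SSE read channels could look roughly like this. It is a sketch under the assumption that every SDK message carries a string `uuid`; the helper name is hypothetical and the real `useLoopRuntime` logic is not reproduced here.

```typescript
interface HasUuid {
  uuid: string;
}

// Returns an "accept" function shared by both channels: the first
// delivery of a uuid wins, any later delivery of the same uuid is dropped.
function createUuidDeduper(): (message: HasUuid) => boolean {
  const seen = new Set<string>();
  return (message) => {
    if (seen.has(message.uuid)) return false;
    seen.add(message.uuid);
    return true;
  };
}

// Usage: both the WS handler and the SSE handler call the same accept().
const accept = createUuidDeduper();
const fromWs = accept({ uuid: "msg-1" });  // first delivery is rendered
const fromSse = accept({ uuid: "msg-1" }); // duplicate from the other channel is skipped
```

The set grows for the lifetime of the loop view in this sketch; a long-lived session would probably want to cap or reset it.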
The spec body now documents the cookie-only /me/tokens endpoints
(token create/list/revoke) that actually exist in the implementation.

New "现状 (Implementation Status, 2026-05-26)" section captures:
  - Endpoint × (spec / impl / test / web-wired) status table
  - Web hybrid transport diagram (WS read + SSE read with uuid dedupe;
    v1 POST for chat writes; ws.send for operator features)
  - List of non-v1 endpoints still active and who uses them
  - Known gaps with prioritised TODO (decouple WS read, REST-ify
    operator features, add files attachment to v1, idempotency
    persistence, etc.)
  - Key files + line counts
  - Conventions for future edits (spec-first, snake_case at boundary,
    metadata invisible to sandbox)
…eway Tokens → API Tokens

- New `server/src/api-v1-openapi.ts` — hand-written OpenAPI 3.1 schema
  mirroring docs/api-v1.md (loops CRUD, messages SSE, events SSE,
  choices, interrupt, /me/tokens). Source of truth stays the markdown;
  this is the machine-readable mirror.
- GET /api/v1/openapi.json — serves the spec
- GET /api/v1/docs            — Scalar interactive reference
- Both public (no auth) so doc URLs are shareable

Settings UI:
- Tab id "gateway-tokens" → "api-tokens"
- Label "Gateway Tokens" → "API Tokens"
- Description updated to point at the Loop API
- New "Loop API documentation" card above the token list with an
  "Open docs →" link to /api/v1/docs

Dep: @scalar/hono-api-reference@0.10.19
…estion, resolved)

Settles 5 rounds of "what is an agent" discussion into a single model:
one human can own multiple accounts. Schema delta is two fields
(ownerId, chatBotMode). Personal accounts and owned accounts are
structurally identical — same vault / .claude / memory / loops machinery.
"Agent" stays as a UI-facing label only; the data model has no separate
agent type.

Captures: BUC analogue framing, behavioural differences (login / token
issuance / loop ownership), UI grouping rules, chat bot mode hook,
v1 API surface deltas, implementation sequence (7 ordered steps),
historical iteration log (why each prior iteration didn't work), FAQ.
…rminology

Replace internal "持有账号" ("held account") naming with "公共账号" (public account), the
standard term in OA systems (DingTalk, WeCom, Lark, BUC, AWS IAM, GCP
service accounts, GitHub bot accounts). All these systems solve the same
problem — one human manages multiple accounts with different purposes —
and none of them invent a new "non-human entity" type. Loopat aligns.

- Intro table compares 7 OA systems' multi-account patterns
- 公共账号 = "non-personal purpose account with a designated owner"
  (clarifies it's not multi-owner; nuance vs OA convention captured in FAQ)
- "Agent" framed as a per-context alias of 公共账号 (like DingTalk calls
  them 机器人 ("bots"), WeCom calls them 应用账号 ("application accounts"), AWS calls them Service Accounts)
- UI guidance maps to OA admin-panel conventions (employees see only their own
  accounts plus the ones they manage; HR/admin see everyone)
- New FAQ entry explaining why no "sub-" prefix — resources are
  structurally identical, "sub" suggests downgrade which is wrong
Per docs/account-model.md (OA-system "multi-account / 公共账号" model):

Schema:
- User adds optional ownerId field. null = personal account (existing
  behavior, password login). Non-null = public account (token-only,
  cannot login, cannot nest, cannot self-issue tokens).
- createUser accepts ownerId. Owned accounts: no password, role=member,
  status=active immediately, no first-admin bootstrap.
- POST /api/auth/login explicitly rejects owned accounts (same generic
  401 to avoid leaking existence).
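
The schema rules above reduce to a few predicates on `ownerId`. The sketch below is illustrative only: the `Account` shape is trimmed to two fields and the helper names are hypothetical, not taken from the codebase.

```typescript
interface Account {
  id: string;
  ownerId: string | null; // null = personal account; non-null = public account
}

function isPublicAccount(account: Account): boolean {
  return account.ownerId !== null;
}

// Public accounts are token-only: password login is always refused.
function canPasswordLogin(account: Account): boolean {
  return !isPublicAccount(account);
}

// No nesting: only personal accounts may own other accounts.
function canOwnAccounts(account: Account): boolean {
  return !isPublicAccount(account);
}

// A personal account may issue tokens for itself or for an account it owns;
// a public account can never issue tokens, not even for itself.
function canIssueTokenFor(caller: Account, target: Account): boolean {
  if (isPublicAccount(caller)) return false;
  return target.id === caller.id || target.ownerId === caller.id;
}
```

Keeping personal and public accounts in one type, distinguished only by `ownerId`, matches the commit's point that the two are structurally identical.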

API surface (v1):
- POST /api/v1/me/accounts { id }         — human creates public account
- GET  /api/v1/me/accounts                — list accounts caller owns
- DELETE /api/v1/me/accounts/{id}         — hard delete + cascade tokens
- POST /api/v1/me/tokens body adds optional forAccount: issue token for
  a public account caller owns. token resolves to that account.
- GET  /api/v1/me/tokens ?forAccount=...  — list a public account's tokens
- DELETE /api/v1/me/tokens/{tokenId}      — searches across owned accounts

OpenAPI spec + spec doc updated with the new endpoints.

Tests: 15 new in api-v1.test.ts covering create/list/delete + nesting
guard + forAccount happy + 403/404 paths + login-rejects-owned. Full
suite 292 pass / 1 skip.

Web Settings UI (step 6) + e2e (step 7 chain) deferred to next commit.
UI for the public-account model. Lives in Settings as a new "Agents" tab
(marketing label) backed by the account data model:
- Lists ownerId=me accounts via GET /api/v1/me/accounts
- Create form (single-field id input) POSTs /api/v1/me/accounts
- Per-row delete confirm → DELETE /api/v1/me/accounts/{id}
- Pointer to API docs for the next step (issue tokens)

web/src/api.ts: listMyAccounts / createMyAccount / deleteMyAccount + the
PublicAccount type. listApiTokens / createApiToken now accept optional
forAccount (wired for future cross-tab token UI).

e2e: new test verifies the create flow POSTs /api/v1/me/accounts and
the agent shows up in the table.

Terminology lock-in (per discussion): "account" is the canonical system
concept. "user" only refers to the human in conversational text. "Agent"
is the UI/marketing label for ownerId-non-null accounts. Code's `User`
type is historical, kept as-is.
Per the agreed terminology: "account" is the only system concept;
"agent" doesn't appear in the system, UI, code, or docs. Was sloppy
to leak "Agents" into the Settings tab label.

- Tab id: agents → accounts; URL /settings/agents → /settings/accounts
- Tab label: "Agents" → "Accounts"
- Description: drops "agent" phrasing, says "additional accounts you own"
- Section heading: "Your agents" → "Your accounts"
- Empty state, create form heading, just-created banner, delete tooltip
  all switch to "account" language
- e2e test renamed and selectors updated
simpx and others added 26 commits June 3, 2026 01:42
…ner recreate

Regresses the bug where ensureContainer's 'config hash drift — recreating'
path tears down + recreates a loop's sandbox container (SIGKILL 137 on
in-flight exec'd processes) when hashCreateArgs drifts between attach/detach.

Flow (no chat → zero AI tokens): create loop from roster1, open terminal to
trigger ensureContainer, record the running container's Id/StartedAt/CreatedAt
via podman inspect, then detach (back to /loop list, closing the term socket)
and re-attach (reopen terminal → ensureContainer again), twice. Integration
truth: SAME container Id and UNCHANGED StartedAt/CreatedAt across every cycle
prove no drift, no teardown+recreate.

No product bug found — hashCreateArgs is already stable across attach/detach
(extraEnv excluded from the hash; existence-conditional binds stable in the
no-chat flow). Test is GREEN.
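
The stability property being tested (attach-time env must not change the config hash) can be sketched as follows. The field names and the sorting of binds are assumptions for illustration; the real `hashCreateArgs` covers whatever the container create call actually takes.

```typescript
import { createHash } from "node:crypto";

// Hypothetical subset of the container create arguments.
interface CreateArgs {
  image: string;
  binds: string[];
  extraEnv: Record<string, string>; // varies per attach; deliberately not hashed
}

// Hash only the inputs that should force a recreate when they change.
// Binds are sorted (an assumption) so ordering differences do not read as drift.
function hashCreateArgs(args: CreateArgs): string {
  const stable = { image: args.image, binds: [...args.binds].sort() };
  return createHash("sha256").update(JSON.stringify(stable)).digest("hex");
}

// Two attaches that differ only in extraEnv and bind order hash the same.
const firstAttach = hashCreateArgs({
  image: "sandbox:latest",
  binds: ["/a:/a", "/b:/b"],
  extraEnv: { TERM_SESSION: "1" },
});
const secondAttach = hashCreateArgs({
  image: "sandbox:latest",
  binds: ["/b:/b", "/a:/a"],
  extraEnv: { TERM_SESSION: "2" },
});
```

If `extraEnv` were included in `stable`, every re-attach with a new session value would look like config drift and trigger the teardown this test guards against.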
…se cleanup; full bun run dogfood green

Each failing spec derived loopId from the browser URL, which with loops
accumulating in the shared LOOPAT_HOME read a STALE loop's id and made the
container assertions check the wrong loop. Switch context-notes-sync,
first-5-minutes, and multi-turn-task to read the id from the POST
/api/v1/loops create RESPONSE (strip loop_ prefix), exactly like
second-loop-warm/attach-detach. Add a best-effort test.afterEach in each
loop-creating spec that podman rm -f's the loop's sandbox container by
loopat.loop-id label so loops/containers don't accumulate between cases.

Product fix (genuine bug surfaced by the shared run): ensureContextWorktree
ran 'git worktree add -B <branch> <path>' with no start point for an empty
context remote (notes.git has no commits). The per-user clone's HEAD is an
unborn ref (main); once the repo gains an unrelated branch (master, pushed by
an earlier loop) git stops inferring --orphan and dies with 'fatal: invalid
reference: HEAD', failing every subsequent loop create. Pass --orphan
explicitly when there is no start point so a startless worktree never depends
on a resolvable HEAD.

Full 'bun run dogfood' now 6 passed.
…er + v1 loop-create onboarding gate fix

Rebuilds dogfood's first-run as the true first-time-user cold start (per
docs/superpowers/specs/2026-06-03-first-run-journey-redesign.md): empty
LOOPAT_HOME, fixture git-host provider (mirrors code.ts against the fixture
sshd, all endpoints via env), real browser register/login, onboarding gate,
personal-repo setup with real git-crypt, ssh-pubkey seed, context clone, loop,
real AI turn, terminal git.

Product fix: POST /api/v1/loops (the UI's loop-create path) bypassed the
onboarding gate that legacy POST /api/loops enforced — added the same gate.

seed.sh: give the notes bare repo an initial commit so the onboarding
ssh-access probe's `git ls-remote --exit-code HEAD` succeeds on access.
…first ssh attempt now reliable; drop retry mask

The intermittent 'Permission denied (publickey)' on a sandbox's FIRST git
push/clone/ls-remote was masked by a retry loop in first-5-minutes and was
conflated with scary boot-time log noise.

Root cause (perms timing): the vault .ssh is git-crypt-decrypted /
git-checked-out, so the private key lands at the umask default (0664) and the
dir at 0775; git can't carry 0600. loops.ts chmods the key 0600 only 'at point
of use' when a HOST git op goes through sshCommandForUser — but a container
created/recreated on a path that didn't (server restart -> first attach,
config/image-drift recreate) bind-mounts a stale-perms key, so the first
in-sandbox ssh can be ignored/rejected; a later host op re-chmods it and the
retry succeeds.

Fix: ensureSandboxSshPerms() chmods the vault .ssh dir 0700, private keys +
config 0600 in ensureContainer() — the one chokepoint every container start
passes through — so perms are correct the instant the bind goes live,
regardless of checkout path. Cheap, idempotent, best-effort.
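
A rough sketch of such a permission fix-up is shown below. It is not the actual `ensureSandboxSshPerms`; in particular, the rule for recognising private keys (files named `id_*` without a `.pub` suffix) is an assumption made for this example, and the demo at the end uses a throwaway temp directory rather than a real vault.

```typescript
import { chmodSync, existsSync, mkdtempSync, readdirSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumption: private keys are named id_* and public keys end in .pub.
function isSensitive(name: string): boolean {
  return name === "config" || (name.startsWith("id_") && !name.endsWith(".pub"));
}

// Idempotent and best-effort: tighten the dir to 0700 and private keys +
// config to 0600, swallowing errors so a container start is never blocked.
function ensureSandboxSshPerms(sshDir: string): void {
  try {
    if (!existsSync(sshDir)) return;
    chmodSync(sshDir, 0o700);
    for (const name of readdirSync(sshDir)) {
      if (isSensitive(name)) chmodSync(join(sshDir, name), 0o600);
    }
  } catch {
    // Best-effort: a failure here should surface later as an ssh error, not a crash.
  }
}

// Demo: simulate a checkout that left loose, umask-default permissions behind.
const demoDir = mkdtempSync(join(tmpdir(), "ssh-perms-"));
chmodSync(demoDir, 0o775);
writeFileSync(join(demoDir, "id_ed25519"), "not a real key");
chmodSync(join(demoDir, "id_ed25519"), 0o664);
ensureSandboxSshPerms(demoDir);
const dirMode = statSync(demoDir).mode & 0o777;
const keyMode = statSync(join(demoDir, "id_ed25519")).mode & 0o777;
```

Calling this from the single place every container start passes through, as the commit does with `ensureContainer()`, is what removes the dependency on a host git op having run first.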

Also: ensureContextRepo's WORKSPACE-DEFAULT clone uses the host's bare ssh
(no vault key), so a private ssh:// URL always failed 'Permission denied
(publickey)' at boot and fell back to a local origin (harmless, but the raw
error read as a loop-auth bug). Demoted to an info log that names it as the
benign bootstrap mirror.

Verified: dropped the retry mask from first-5-minutes; full 'bun run dogfood'
(6 cases) and 'bun run dogfood:first-run' (REAL git-crypt) both green, with
zero 'Permission denied (publickey)' in the logs (was 4/boot).
S6: held-back recovery — product exposes only save/behind/refresh, no discard/force. take-remote converges B's view; keep-mine impossible (sticky commit, always 409); SoT stays clean. S7: deletion converges A->origin->B like an edit.
Run server unit tests on both ubuntu and macOS via matrix strategy
when PRs target main. Fork PRs from outside collaborators require
manual approval through GitHub Actions settings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mingholy commented Jun 5, 2026

Owner Author

Final Verdict

APPROVED

Clean, minimal CI workflow with no security concerns — no secrets are referenced and no untrusted PR event data is interpolated into run steps. The matrix strategy with fail-fast: false is good practice for cross-platform coverage. Actions are pinned to major versions (v4/v2) from trusted publishers, which is acceptable. One minor optional hardening for the future: adding an explicit permissions: contents: read block to follow least-privilege principles, but this is not blocking.


Mingholy commented Jun 7, 2026

Owner Author

Closing: this PR has merge conflicts with main and cannot be merged as-is.

@Mingholy Mingholy closed this Jun 7, 2026

3 participants