ci: add test workflow for PR gate by Mingholy · Pull Request #22 · Mingholy/loopat

Mingholy · 2026-06-05T05:12:18Z

Summary

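Adds .github/workflows/ci.yml, which automatically runs the server unit tests when a PR is opened against main
Uses a matrix strategy to cover both ubuntu-latest and macos-latest
fail-fast: false ensures that a failure on one platform does not cancel the other platform's run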

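Permission gate (requires manual configuration)

Once the workflow file is in place, the following must be configured in the repo Settings:

Settings → Actions → General → Fork pull request workflows from outside collaborators
→ select "Require approval for all outside collaborators"

Effect:

PR from a collaborator → CI runs automatically
PR from an external fork → waits for a collaborator to click "Approve and run" in the Actions tab

Test plan

After the PR is created, check that Actions automatically triggers the tests on both platforms
Confirm whether the server unit tests pass in the CI environment
Later, open a PR from an external fork to verify that the approval gate holds it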

🤖 Generated with Claude Code

MCP tokens were stored under a derived `MCP_<NAME>_TOKEN` convention, surfaced via a dedicated Settings page panel, and discovered through plugin `.mcp.json` walking. After this change, MCP tokens are indistinguishable from any other vault env: server config goes in tier `settings.json` mcpServers (single source = loop's merged settings.json), and the env name is parsed directly from the server's `Authorization: Bearer ${VAR}` header.

- Drop `mcpServerEnvVarName` / `.mcp.json` reads; add strict `parseBearerEnvName` that requires `^Bearer ${VAR}$` (case-insensitive Bearer, uppercase env ref, single ref only)
- `/api/mcp-servers` returns a flat list from merged settings.json with `authTokenEnv` + `authed` (existence-only check, no validity probe)
- Replace `GET /api/mcp-auth` + `DELETE /api/mcp-auth/:server` with the generic `DELETE /api/envs/:name`
- Compose fills `mcpServers` defaults from each enabled plugin's own `settings.json` (plugins are lowest priority: team/profile/personal always win on same key)
- Drop Settings → MCP tab; popover (/mcp) becomes the only entry point, with auth / re-auth / forget actions
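The strict header rule described above (`^Bearer ${VAR}$`, case-insensitive scheme, uppercase env ref, single ref only) could look roughly like the sketch below. This is not the project's actual `parseBearerEnvName`; the exact set of characters allowed in the env name and the single-space separator are assumptions made for illustration.

```typescript
// Sketch only: the project's real parseBearerEnvName may differ in details.
// Assumption: exactly one space separates the scheme from the ${VAR} reference.
// Assumption: env names match [A-Z_][A-Z0-9_]* (uppercase, digits, underscore).
function parseBearerEnvName(headerValue: string): string | null {
  // Group 1 is the auth scheme, group 2 the env name inside ${...}.
  // The anchors reject any extra text, so a second ${...} reference cannot match.
  const match = /^(\S+) \$\{([A-Z_][A-Z0-9_]*)\}$/.exec(headerValue);
  if (match === null) return null;
  const scheme = match[1] ?? "";
  const envName = match[2] ?? "";
  // Only the scheme is compared case-insensitively; the env name stays strict.
  if (scheme.toLowerCase() !== "bearer" || envName === "") return null;
  return envName;
}
```

Under these assumptions, `parseBearerEnvName("Bearer ${GITHUB_TOKEN}")` returns `"GITHUB_TOKEN"`, while a literal token (`"Bearer abc123"`), a lowercase reference, or two references in one header all return `null`. Returning `null` rather than guessing keeps a malformed header from silently mapping to the wrong vault env.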

Pairs with production_start.sh / start.sh. Targets the dev/prod server tree (bun run dev → --hot index.ts + vite, or production_start.sh), plus sandboxed bwrap children. Conservative matching: cmd-line patterns anchored to the repo / LOOPAT_HOME, /proc/*/cwd inside the repo filtered to {bun,node,vite}. Excludes `claude` so the user's interactive CLI session in this dir survives. SIGTERM first, escalates to SIGKILL after 7.5s. --dry-run previews.

- Add per-user gateway tokens (gateway-tokens.ts) replacing shared env-var auth
- Rework external-gateway.ts: remove recentTurns, persist metadata/traceId to loop meta, add mutex lock for thread-loop mapping, use session internal queue instead of 409 rejection, authenticate via per-user tokens
- Add gateway token management API (GET/POST/DELETE /api/gateway-tokens)
- Add Gateway Tokens tab in Settings UI for token CRUD
- Add lastExternalMeta to LoopMeta type
- Remove all LOOPAT_GATEWAY_* env-var references from loopat codebase
- Rename mock env to LOOPAT_RUNTIME_MOCK

Co-Authored-By: xubai2537 <xubai2537@gmail.com>

Adds /api/v1/loops/* (CRUD + SSE messages + watch events + choices + interrupt) and /api/v1/me/tokens for hashed-at-rest API tokens with stable tokenId. Same endpoints accept either session cookie (web) or Bearer token (bots); see docs/api-v1.md.

Web NewLoopDialog now POSTs /api/v1/loops; Settings → API tokens uses v1 too. Chat streaming (useLoopRuntime / WS) is deliberately deferred.

Removes the older /api/runtime/v1/turn/stream + /api/gateway-tokens endpoints (cherry-picked from internal CR), superseded by v1.

Tests:
- 31 new bun integration tests (auth, CRUD, isolation, validation)
- 1 new Playwright e2e: create-loop dialog hits /api/v1/loops
- All existing 246 server tests + 9 e2e remain green
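To illustrate what "hashed-at-rest API tokens with stable tokenId" can mean in practice, here is a minimal sketch. The hash algorithm (SHA-256), the `lp_` prefix, the token layout, and all function and type names are assumptions for illustration; the commit message does not specify them.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical stored shape: the plaintext token is never persisted.
interface StoredToken {
  tokenId: string;   // stable public identifier, safe to list and to use for revocation
  hash: string;      // SHA-256 hex digest of the full plaintext token (assumed algorithm)
  accountId: string; // account the token resolves to
}

function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

// Issue a token: the caller sees the plaintext once, the server keeps only the record.
function issueToken(accountId: string): { plaintext: string; record: StoredToken } {
  const tokenId = randomBytes(8).toString("hex");
  const secret = randomBytes(32).toString("hex");
  const plaintext = `lp_${tokenId}_${secret}`; // "lp_" prefix is made up for this sketch
  return { plaintext, record: { tokenId, hash: sha256Hex(plaintext), accountId } };
}

// Resolve a presented Bearer token by hashing it and looking the digest up.
function resolveToken(presented: string, byHash: Map<string, StoredToken>): StoredToken | null {
  return byHash.get(sha256Hex(presented)) ?? null;
}
```

Because the lookup key is the digest, a leaked database row does not reveal a usable token, and the separate `tokenId` lets list and revoke endpoints refer to a token without ever handling its secret.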

…r features)

- useLoopRuntime opens v1 SSE GET /events as a parallel live channel and dedupes vs WS by SDK uuid. Send paths (user message, interrupt, choice answer) now hit /api/v1/messages / interrupt / choices/{id} instead of ws.send. Operator features (queue, goal, provider, thinking, clear) stay on WS; they're not chat semantics, per spec.
- v1 POST /messages accepts permission_mode for parity with the web's PlanMode toggle.
- v1 SSE emits sdk_message pass-through of raw SDK messages, marked in spec as web-internal / unstable shape; stable bot-facing events (assistant_delta, requires_choice, etc.) unchanged.
- e2e: new test exercises full v1 chain: opens loop, confirms /events SSE subscription fires, sends a message, confirms POST hits /api/v1/loops/{id}/messages with content + permission_mode.
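The "dedupes vs WS by SDK uuid" behaviour can be pictured with a small sketch: both channels feed the same filter, and whichever copy of a message arrives first wins. The `SdkMessage` shape and the function name are hypothetical; only the idea of keying on the SDK uuid comes from the commit message.

```typescript
// Hypothetical minimal message shape; real SDK messages carry more fields.
interface SdkMessage {
  uuid: string;
  type: string;
}

// Returns a predicate that accepts each uuid once, regardless of which
// transport (WS or SSE) delivered the message first.
function createUuidDeduper(): (message: SdkMessage) => boolean {
  const seen = new Set<string>();
  return (message) => {
    if (seen.has(message.uuid)) return false; // duplicate from the other channel
    seen.add(message.uuid);
    return true;
  };
}
```

A client would call the same predicate from its WS handler and its SSE handler and render a message only when it returns `true`. In a long-lived session the set grows without bound, so a real implementation would probably cap or reset it; that policy is left out here.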

The spec body now documents the cookie-only /me/tokens endpoints (token create/list/revoke) that actually exist in the implementation. New "现状 (Implementation Status, 2026-05-26)" section captures:

- Endpoint × (spec / impl / test / web-wired) status table
- Web hybrid transport diagram (WS read + SSE read with uuid dedupe; v1 POST for chat writes; ws.send for operator features)
- List of non-v1 endpoints still active and who uses them
- Known gaps with prioritised TODO (decouple WS read, REST-ify operator features, add files attachment to v1, idempotency persistence, etc.)
- Key files + line counts
- Conventions for future edits (spec-first, snake_case at boundary, metadata invisible to sandbox)

…eway Tokens → API Tokens

- New `server/src/api-v1-openapi.ts`: hand-written OpenAPI 3.1 schema mirroring docs/api-v1.md (loops CRUD, messages SSE, events SSE, choices, interrupt, /me/tokens). Source of truth stays the markdown; this is the machine-readable mirror.
- GET /api/v1/openapi.json: serves the spec
- GET /api/v1/docs: Scalar interactive reference
- Both public (no auth) so doc URLs are shareable

Settings UI:
- Tab id "gateway-tokens" → "api-tokens"
- Label "Gateway Tokens" → "API Tokens"
- Description updated to point at the Loop API
- New "Loop API documentation" card above the token list with an "Open docs →" link to /api/v1/docs

Dep: @scalar/hono-api-reference@0.10.19

…estion, resolved)

Settles 5 rounds of "what is an agent" discussion into a single model: one human can own multiple accounts. Schema delta is two fields (ownerId, chatBotMode). Personal accounts and owned accounts are structurally identical: same vault / .claude / memory / loops machinery. "Agent" stays as a UI-facing label only; the data model has no separate agent type.

Captures: BUC analogue framing, behavioural differences (login / token issuance / loop ownership), UI grouping rules, chat bot mode hook, v1 API surface deltas, implementation sequence (7 ordered steps), historical iteration log (why each prior iteration didn't work), FAQ.

…rminology

Replace internal "持有账号" (owned account) naming with "公共账号" (public account), the standard term in OA systems (DingTalk, WeCom, Lark, BUC, AWS IAM, GCP service accounts, GitHub bot accounts). All these systems solve the same problem, one human managing multiple accounts with different purposes, and none of them invent a new "non-human entity" type. Loopat aligns.

- Intro table compares 7 OA systems' multi-account patterns
- Public account = "non-personal purpose account with a designated owner" (clarifies it's not multi-owner; nuance vs OA convention captured in FAQ)
- "Agent" framed as a per-context alias of public account (like DingTalk calls them 机器人 (bots), WeCom calls them 应用账号 (app accounts), AWS calls them Service Accounts)
- UI guidance maps to OA admin-panel conventions (employees see only themselves plus the accounts they manage; HR/admin see everyone)
- New FAQ entry explaining why no "sub-" prefix: resources are structurally identical, "sub" suggests downgrade which is wrong

Per docs/account-model.md (OA-system "multi-account / 公共账号 (public account)" model):

Schema:
- User adds optional ownerId field. null = personal account (existing behavior, password login). Non-null = public account (token-only, cannot login, cannot nest, cannot self-issue tokens).
- createUser accepts ownerId. Owned accounts: no password, role=member, status=active immediately, no first-admin bootstrap.
- POST /api/auth/login explicitly rejects owned accounts (same generic 401 to avoid leaking existence).

API surface (v1):
- POST /api/v1/me/accounts { id }: human creates public account
- GET /api/v1/me/accounts: list accounts caller owns
- DELETE /api/v1/me/accounts/{id}: hard delete + cascade tokens
- POST /api/v1/me/tokens body adds optional forAccount: issue token for a public account caller owns. The token resolves to that account.
- GET /api/v1/me/tokens ?forAccount=...: list a public account's tokens
- DELETE /api/v1/me/tokens/{tokenId}: searches across owned accounts

OpenAPI spec + spec doc updated with the new endpoints.

Tests: 15 new in api-v1.test.ts covering create/list/delete + nesting guard + forAccount happy + 403/404 paths + login-rejects-owned. Full suite 292 pass / 1 skip.

Web Settings UI (step 6) + e2e (step 7 chain) deferred to next commit.
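The ownerId rules listed above can be condensed into a few predicates. The sketch below is an illustration under assumptions: the `Account` shape is reduced to two fields, and every function name is hypothetical rather than taken from the codebase.

```typescript
// Reduced, hypothetical account shape: only the fields the rules depend on.
interface Account {
  id: string;
  ownerId: string | null; // null = personal account, non-null = public (owned) account
}

function isPublicAccount(account: Account): boolean {
  return account.ownerId !== null;
}

// Login: unknown ids, public accounts, and wrong passwords all collapse into
// the same generic 401, so the response does not reveal whether an id exists.
function loginStatus(account: Account | undefined, passwordMatches: boolean): 200 | 401 {
  if (account === undefined || isPublicAccount(account) || !passwordMatches) return 401;
  return 200;
}

// Nesting guard: only personal accounts may own other accounts.
function mayOwnAccounts(caller: Account): boolean {
  return !isPublicAccount(caller);
}

// forAccount check: a personal account may issue tokens for itself or for an
// account it owns; a public account cannot issue tokens at all.
function mayIssueTokenFor(caller: Account, target: Account): boolean {
  if (isPublicAccount(caller)) return false;
  return target.id === caller.id || target.ownerId === caller.id;
}
```

Keeping the personal/public distinction in a single nullable field means the rest of the machinery (vault, loops, memory) can treat both kinds of account identically, which matches the "structurally identical" goal stated in the account-model commits.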

UI for the public-account model. Lives in Settings as a new "Agents" tab (marketing label) backed by the account data model:

- Lists ownerId=me accounts via GET /api/v1/me/accounts
- Create form (single-field id input) POSTs /api/v1/me/accounts
- Per-row delete confirm → DELETE /api/v1/me/accounts/{id}
- Pointer to API docs for the next step (issue tokens)

web/src/api.ts: listMyAccounts / createMyAccount / deleteMyAccount + the PublicAccount type. listApiTokens / createApiToken now accept optional forAccount (wired for future cross-tab token UI).

e2e: new test verifies the create flow POSTs /api/v1/me/accounts and the agent shows up in the table.

Terminology lock-in (per discussion): "account" is the canonical system concept. "user" only refers to the human in conversational text. "Agent" is the UI/marketing label for ownerId-non-null accounts. Code's `User` type is historical, kept as-is.

Per the agreed terminology: "account" is the only system concept; "agent" doesn't appear in the system, UI, code, or docs. Was sloppy to leak "Agents" into the Settings tab label.

- Tab id: agents → accounts; URL /settings/agents → /settings/accounts
- Tab label: "Agents" → "Accounts"
- Description: drops "agent" phrasing, says "additional accounts you own"
- Section heading: "Your agents" → "Your accounts"
- Empty state, create form heading, just-created banner, delete tooltip all switch to "account" language
- e2e test renamed and selectors updated

…ner recreate

Regression test for the bug where ensureContainer's 'config hash drift — recreating' path tears down + recreates a loop's sandbox container (SIGKILL 137 on in-flight exec'd processes) when hashCreateArgs drifts between attach/detach.

Flow (no chat → zero AI tokens): create loop from roster1, open terminal to trigger ensureContainer, record the running container's Id/StartedAt/CreatedAt via podman inspect, then detach (back to /loop list, closing the term socket) and re-attach (reopen terminal → ensureContainer again), twice.

Integration truth: SAME container Id and UNCHANGED StartedAt/CreatedAt across every cycle prove no drift, no teardown+recreate.

No product bug found: hashCreateArgs is already stable across attach/detach (extraEnv excluded from the hash; existence-conditional binds stable in the no-chat flow). Test is GREEN.

…se cleanup; full bun run dogfood green

Each failing spec derived loopId from the browser URL, which with loops accumulating in the shared LOOPAT_HOME read a STALE loop's id and made the container assertions check the wrong loop. Switch context-notes-sync, first-5-minutes, and multi-turn-task to read the id from the POST /api/v1/loops create RESPONSE (strip loop_ prefix), exactly like second-loop-warm/attach-detach.

Add a best-effort test.afterEach in each loop-creating spec that podman rm -f's the loop's sandbox container by loopat.loop-id label so loops/containers don't accumulate between cases.

Product fix (genuine bug surfaced by the shared run): ensureContextWorktree ran 'git worktree add -B <branch> <path>' with no start point for an empty context remote (notes.git has no commits). The per-user clone's HEAD is an unborn ref (main); once the repo gains an unrelated branch (master, pushed by an earlier loop) git stops inferring --orphan and dies with 'fatal: invalid reference: HEAD', failing every subsequent loop create. Pass --orphan explicitly when there is no start point so a startless worktree never depends on a resolvable HEAD.

Full 'bun run dogfood' now 6 passed.

…er + v1 loop-create onboarding gate fix

Rebuilds dogfood's first-run as the true first-time-user cold start (per docs/superpowers/specs/2026-06-03-first-run-journey-redesign.md): empty LOOPAT_HOME, fixture git-host provider (mirrors code.ts against the fixture sshd, all endpoints via env), real browser register/login, onboarding gate, personal-repo setup with real git-crypt, ssh-pubkey seed, context clone, loop, real AI turn, terminal git.

Product fix: POST /api/v1/loops (the UI's loop-create path) bypassed the onboarding gate that legacy POST /api/loops enforced; added the same gate.

seed.sh: give the notes bare repo an initial commit so the onboarding ssh-access probe's `git ls-remote --exit-code HEAD` succeeds on access.

…first ssh attempt now reliable; drop retry mask

The intermittent 'Permission denied (publickey)' on a sandbox's FIRST git push/clone/ls-remote was masked by a retry loop in first-5-minutes and was conflated with scary boot-time log noise.

Root cause (perms timing): the vault .ssh is git-crypt-decrypted / git-checked-out, so the private key lands at the umask default (0664) and the dir at 0775; git can't carry 0600. loops.ts chmods the key 0600 only 'at point of use' when a HOST git op goes through sshCommandForUser, but a container created/recreated on a path that didn't (server restart -> first attach, config/image-drift recreate) bind-mounts a stale-perms key, so the first in-sandbox ssh can be ignored/rejected; a later host op re-chmods it and the retry succeeds.

Fix: ensureSandboxSshPerms() chmods the vault .ssh dir 0700, private keys + config 0600 in ensureContainer() (the one chokepoint every container start passes through), so perms are correct the instant the bind goes live, regardless of checkout path. Cheap, idempotent, best-effort.

Also: ensureContextRepo's WORKSPACE-DEFAULT clone uses the host's bare ssh (no vault key), so a private ssh:// URL always failed 'Permission denied (publickey)' at boot and fell back to a local origin (harmless, but the raw error read as a loop-auth bug). Demoted to an info log that names it as the benign bootstrap mirror.

Verified: dropped the retry mask from first-5-minutes; full 'bun run dogfood' (6 cases) and 'bun run dogfood:first-run' (REAL git-crypt) both green, with zero 'Permission denied (publickey)' in the logs (was 4/boot).
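A permission-normalising helper of the kind described in the fix might look like the following sketch. It is not the project's `ensureSandboxSshPerms`: the rule used here to pick private keys (files named `id_*` that do not end in `.pub`, plus `config`) is an assumption, as is the `demoPermsFix` helper, which exists only to show the effect on a throwaway directory. POSIX file modes are assumed.

```typescript
import { chmodSync, existsSync, mkdtempSync, readdirSync, rmSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Sketch: tighten an .ssh directory to the modes ssh expects.
// Assumption: private keys are files named id_* without a .pub suffix.
function ensureSandboxSshPerms(sshDir: string): void {
  try {
    if (!existsSync(sshDir)) return;
    chmodSync(sshDir, 0o700);
    for (const name of readdirSync(sshDir)) {
      const path = join(sshDir, name);
      if (!statSync(path).isFile()) continue;
      const isPrivateKey = name.startsWith("id_") && !name.endsWith(".pub");
      if (isPrivateKey || name === "config") chmodSync(path, 0o600);
    }
  } catch {
    // Best-effort: a failure here should not block container start.
  }
}

// Hypothetical demo: recreate the loose post-checkout modes, run the fix,
// and report the resulting permission bits.
function demoPermsFix(): { dir: number; key: number; pub: number } {
  const sshDir = mkdtempSync(join(tmpdir(), "ssh-perms-"));
  const key = join(sshDir, "id_ed25519");
  const pub = join(sshDir, "id_ed25519.pub");
  writeFileSync(key, "fake private key\n");
  writeFileSync(pub, "fake public key\n");
  chmodSync(sshDir, 0o775);
  chmodSync(key, 0o664);
  chmodSync(pub, 0o644);
  ensureSandboxSshPerms(sshDir);
  const mode = (p: string): number => statSync(p).mode & 0o777;
  const result = { dir: mode(sshDir), key: mode(key), pub: mode(pub) };
  rmSync(sshDir, { recursive: true, force: true });
  return result;
}
```

Because every `chmodSync` call sets an absolute mode, running the helper repeatedly has the same effect as running it once, which is what makes it safe to call on every container start rather than only on first checkout.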

S6: held-back recovery — product exposes only save/behind/refresh, no discard/force. take-remote converges B's view; keep-mine impossible (sticky commit, always 409); SoT stays clean. S7: deletion converges A->origin->B like an edit.

Mingholy · 2026-06-05T09:33:40Z

Final Verdict

APPROVED

Clean, minimal CI workflow with no security concerns — no secrets are referenced and no untrusted PR event data is interpolated into run steps. The matrix strategy with fail-fast: false is good practice for cross-platform coverage. Actions are pinned to major versions (v4/v2) from trusted publishers, which is acceptable. One minor optional hardening for the future: adding an explicit permissions: contents: read block to follow least-privilege principles, but this is not blocking.

Mingholy · 2026-06-07T09:49:20Z

Closing: this PR has merge conflicts with main and cannot be merged as-is.

panlilu and others added 30 commits May 25, 2026 08:23

feat: simplify file picker UI and enhance file search functionality; …

49531d2

…improve file selection experience

feat: implement goal management system; add commands to set, complete…

1f1864d

…, and clear goals; enhance UI to display active goals

feat: add production_start.sh script for infinite-loop server launche…

8441e08

…r; implement server management and logging features

fix: update default host addresses in production_start.sh to 127.0.0.1

c158803

feat: add recursive file listing functionality; implement listDirRecu…

943f827

…rsive and listFilesTree methods for improved file management

feat: add chimp gateway sse

3181ea1

Co-Authored-By: xubai2537 <xubai2537@gmail.com>

refactor: generalize runtime gateway

125454e

Co-Authored-By: xubai2537 <xubai2537@gmail.com>

feat(api): add mise-config endpoints for tier management; implement M…

987161b

…iseConfigPanel for UI

Refactor code structure for improved readability and maintainability

f76a7bd

feat(ui): add fullscreen toggle functionality to editor components

bf588ff

feat(presets): implement admin presets management for providers and m…

9642eaf

…ise tools

fix: prevent submit actions during composition in input fields

77bc9f5

feat(kanban): add refresh functionality to KanbanBoard component

dafae9f

feat(kanban): add loading indicators to action buttons in CardDetailD…

c58aa7f

…ialog

feat(dialog): add success message for save actions in WorkspacePanel …

33048ae

…and ProvidersSection

fix(session): auto-complete goal when stopping model without backgrou…

52ca9cf

…nd work

feat(pull): add force option to discard local changes during pull

8245373

fix(pull): update error handling for merge failures and improve user …

b93b731

…feedback on stash behavior

simpx and others added 26 commits June 3, 2026 01:42

test(dogfood): context-notes-sync journey — loop pushes notes worktre…

83eb935

…e to origin

test(dogfood): multi-turn-task journey — real AI tool-use turn verifi…

db44ade

…ed by integration truth

test(dogfood): run cases serially (workers=1) — all share one backend…

6d03d59

…+fixture

test(dogfood): de-flake first-5-minutes — chat is reply-only (no git …

0eb23d5

…race), retry first ssh push

0.1.48

2dba74d

docs: spec for first-run journey redesign (real onboarding via fixtur…

fffc83a

…e provider)

docs: first-run spec — fixture provider mirrors code.ts (token→list→c…

4323c99

…reate→keys), own empty setup

0.1.49

f315176

test(dogfood): first-run drives UI terminal + reads pubkey from UI

db0be2c

test(dogfood): unify fixture notes url (absolute ssh) + first-run ass…

b4dce1f

…erts notes clones

0.1.50

fb69f99

test(dogfood): concurrent-push — non-ff rejection + fetch/rebase/re-p…

0c8defb

…ush resolves

fix(sandbox): standard origin/<def> tracking ref — ordinary worktree …

07381b1

…git; doctrine: push to origin = done

0.1.51

4ac57e3

test(dogfood): first-run ends with real AI-push + human-push to origi…

fd319e1

…n; tier run scripts + README

test(dogfood): sync tier — context flow across two independent server…

bf2f4a4

…s (S1-S5)

test(dogfood): S2 — assert only UI-exposed truth (kn read needs a loo…

98af199

…p, no UI pull)

test(dogfood): sync S2/S3 — B sees kn+notes at loop level (sandbox cl…

29766fd

…one from shared origin)

feat(notes): take-remote discard endpoint — escape hatch from held-ba…

5702dfc

…ck conflict; sync S6/S7 use it

0.1.52

fc71f9f

ci: add test workflow for PR gate

917918c

Run server unit tests on both ubuntu and macOS via matrix strategy when PRs target main. Fork PRs from outside collaborators require manual approval through GitHub Actions settings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mingholy force-pushed the main branch from 96347e8 to f145090 Compare June 7, 2026 09:14

Mingholy closed this Jun 7, 2026

Mingholy wants to merge 589 commits into main from ci/add-test-workflow

3 participants
