ci: add test workflow for PR gate #22
Closed
Mingholy wants to merge 589 commits into
…improve file selection experience
…, and clear goals; enhance UI to display active goals
…r; implement server management and logging features
…rsive and listFilesTree methods for improved file management
MCP tokens were stored under a derived `MCP_<NAME>_TOKEN` convention,
surfaced via a dedicated Settings page panel, and discovered through
plugin `.mcp.json` walking. After this change, MCP tokens are
indistinguishable from any other vault env: server config goes in
tier `settings.json` mcpServers (single source = loop's merged
settings.json), and the env name is parsed directly from the server's
`Authorization: Bearer ${VAR}` header.
- Drop `mcpServerEnvVarName` / `.mcp.json` reads; add strict
`parseBearerEnvName` that requires `^Bearer ${VAR}$` (case-insensitive
Bearer, uppercase env ref, single ref only)
- `/api/mcp-servers` returns a flat list from merged settings.json with
`authTokenEnv` + `authed` (existence-only check, no validity probe)
- Replace `GET /api/mcp-auth` + `DELETE /api/mcp-auth/:server` with the
generic `DELETE /api/envs/:name`
- Compose fills `mcpServers` defaults from each enabled plugin's own
`settings.json` (plugins are lowest priority — team/profile/personal
always win on same key)
- Drop Settings → MCP tab; popover (/mcp) becomes the only entry point,
with auth / re-auth / forget actions
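The strict header rule described above (case-insensitive `Bearer`, one uppercase env reference, nothing else) could be sketched roughly as follows. This is an illustration, not the project's actual `parseBearerEnvName`: the regex, the single-space requirement, and the definition of an uppercase name as `[A-Z_][A-Z0-9_]*` are all assumptions.

```typescript
// Hypothetical sketch of a strict parser; not the project's actual implementation.
// Assumptions: exactly one space after "Bearer", and an "uppercase env ref"
// means a name matching [A-Z_][A-Z0-9_]*.
// The scheme is spelled out per letter instead of using the /i flag, because
// /i would also make the env-name character class case-insensitive.
const BEARER_ENV_REF = /^[Bb][Ee][Aa][Rr][Ee][Rr] \$\{([A-Z_][A-Z0-9_]*)\}$/;

function parseBearerEnvName(header: string | undefined): string | null {
  if (header === undefined) return null;
  const match = BEARER_ENV_REF.exec(header);
  // Capture group 1 is the env name; anything that does not match yields null.
  return match?.[1] ?? null;
}

// Usage: only the single-reference form yields an env name.
const examples = [
  "Bearer ${GITHUB_TOKEN}", // accepted
  "bearer ${GITHUB_TOKEN}", // accepted: scheme is case-insensitive
  "Bearer ${github_token}", // rejected: lowercase env ref
  "Bearer ${A}${B}", // rejected: more than one reference
  "Bearer literal-secret", // rejected: no env reference at all
];
for (const header of examples) {
  console.log(header, "=>", parseBearerEnvName(header));
}
```

Returning `null` rather than throwing keeps the "existence-only" check simple: a server whose header does not parse simply has no `authTokenEnv`.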
Pairs with production_start.sh / start.sh. Targets the dev/prod server
tree (bun run dev → --hot index.ts + vite, or production_start.sh), plus
sandboxed bwrap children. Conservative matching: cmd-line patterns
anchored to the repo / LOOPAT_HOME, /proc/*/cwd inside the repo
filtered to {bun,node,vite}. Excludes `claude` so the user's
interactive CLI session in this dir survives.
SIGTERM first, escalates to SIGKILL after 7.5s. --dry-run previews.
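The SIGTERM-then-SIGKILL escalation described above can be sketched as below. The real stop script is not shown in this PR, so everything here is hypothetical: the `StopDeps` interface, the function name, and the 250 ms poll interval are illustrative assumptions. Only the 7.5 s grace period and the dry-run idea come from the commit message.

```typescript
// Hypothetical sketch of graceful-stop escalation; names are illustrative only.
type Signal = "SIGTERM" | "SIGKILL";

// Side effects are injected so the escalation logic can be exercised without
// touching real processes.
interface StopDeps {
  sendSignal(pid: number, signal: Signal): void;
  isAlive(pid: number): boolean;
  sleep(ms: number): Promise<void>;
}

async function stopProcesses(
  pids: number[],
  deps: StopDeps,
  opts: { graceMs?: number; pollMs?: number; dryRun?: boolean } = {},
): Promise<number[]> {
  // graceMs mirrors the 7.5s from the commit message; pollMs is an assumption.
  const { graceMs = 7500, pollMs = 250, dryRun = false } = opts;
  if (dryRun) return []; // preview mode: no signals are sent

  // Ask every target to exit cleanly first.
  for (const pid of pids) deps.sendSignal(pid, "SIGTERM");

  // Poll until everything has exited or the grace period has elapsed.
  for (let waited = 0; waited < graceMs; waited += pollMs) {
    if (!pids.some((pid) => deps.isAlive(pid))) break;
    await deps.sleep(pollMs);
  }

  // Force-kill whatever ignored SIGTERM and report which pids needed it.
  const survivors = pids.filter((pid) => deps.isAlive(pid));
  for (const pid of survivors) deps.sendSignal(pid, "SIGKILL");
  return survivors;
}
```

In Node or Bun, `process.kill(pid, signal)` and a `try`/`catch` around `process.kill(pid, 0)` could back `sendSignal` and `isAlive` respectively.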
Co-Authored-By: xubai2537 <xubai2537@gmail.com>
Co-Authored-By: xubai2537 <xubai2537@gmail.com>
- Add per-user gateway tokens (gateway-tokens.ts) replacing shared env-var auth
- Rework external-gateway.ts: remove recentTurns, persist metadata/traceId to loop meta, add mutex lock for thread-loop mapping, use session internal queue instead of 409 rejection, authenticate via per-user tokens
- Add gateway token management API (GET/POST/DELETE /api/gateway-tokens)
- Add Gateway Tokens tab in Settings UI for token CRUD
- Add lastExternalMeta to LoopMeta type
- Remove all LOOPAT_GATEWAY_* env-var references from loopat codebase
- Rename mock env to LOOPAT_RUNTIME_MOCK
Co-Authored-By: xubai2537 <xubai2537@gmail.com>
Adds /api/v1/loops/* (CRUD + SSE messages + watch events + choices + interrupt) and /api/v1/me/tokens for hashed-at-rest API tokens with stable tokenId. Same endpoints accept either session cookie (web) or Bearer token (bots); see docs/api-v1.md.
Web NewLoopDialog now POSTs /api/v1/loops; Settings → API tokens uses v1 too. Chat streaming (useLoopRuntime / WS) is deliberately deferred.
Removes the older /api/runtime/v1/turn/stream + /api/gateway-tokens endpoints (cherry-picked from internal CR), superseded by v1.
Tests:
- 31 new bun integration tests (auth, CRUD, isolation, validation)
- 1 new Playwright e2e: create-loop dialog hits /api/v1/loops
- All existing 246 server tests + 9 e2e remain green
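One way to get "hashed-at-rest API tokens with stable tokenId" is to split the token into a public lookup id and a secret, and store only a hash of the secret. The sketch below assumes a `tokenId.secret` wire format, SHA-256 hashing, and an in-memory `Map` as the store. None of these details are confirmed by the PR, and all names are hypothetical.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical record shape: the plaintext secret is never stored.
interface StoredToken {
  tokenId: string; // stable, safe to show in list/revoke UIs
  accountId: string;
  secretHash: string; // hex SHA-256 of the secret part
}

function hashSecret(secret: string): Buffer {
  return createHash("sha256").update(secret).digest();
}

// Issue a token; the plaintext is shown to the caller once and then discarded.
function issueToken(accountId: string): { plaintext: string; record: StoredToken } {
  const tokenId = randomBytes(8).toString("hex");
  const secret = randomBytes(32).toString("hex");
  return {
    plaintext: `${tokenId}.${secret}`, // assumed format, for illustration only
    record: { tokenId, accountId, secretHash: hashSecret(secret).toString("hex") },
  };
}

// Resolve a presented Bearer token to an account id, or null if invalid.
function resolveToken(presented: string, store: Map<string, StoredToken>): string | null {
  const parts = presented.split(".");
  if (parts.length !== 2) return null;
  const [tokenId, secret] = parts;
  if (!tokenId || !secret) return null;
  const record = store.get(tokenId);
  if (!record) return null;
  const expected = Buffer.from(record.secretHash, "hex");
  const actual = hashSecret(secret);
  // Constant-time comparison; lengths are checked first because
  // timingSafeEqual throws on buffers of different length.
  const ok = expected.length === actual.length && timingSafeEqual(expected, actual);
  return ok ? record.accountId : null;
}
```

Because the `tokenId` half is not secret, it can serve as the stable identifier in `DELETE /api/v1/me/tokens/{tokenId}` without exposing anything that authenticates.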
…r features)
- useLoopRuntime opens v1 SSE GET /events as a parallel live channel and
dedupes vs WS by SDK uuid. Send paths (user message, interrupt, choice
answer) now hit /api/v1/messages / interrupt / choices/{id} instead of
ws.send. Operator features (queue, goal, provider, thinking, clear)
stay on WS — they're not chat semantics, per spec.
- v1 POST /messages accepts permission_mode for parity with the web's
PlanMode toggle.
- v1 SSE emits sdk_message pass-through of raw SDK messages, marked in
spec as web-internal / unstable shape; stable bot-facing events
(assistant_delta, requires_choice, etc.) unchanged.
- e2e: new test exercises full v1 chain — opens loop, confirms
/events SSE subscription fires, sends a message, confirms POST hits
/api/v1/loops/{id}/messages with content + permission_mode.
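While WS and the v1 SSE stream run in parallel, the same SDK message can arrive on both channels. A minimal sketch of deduping by uuid, assuming each message carries a string `uuid` field (the `SdkMessage` shape and class name here are hypothetical, not the code in useLoopRuntime):

```typescript
// Hypothetical message shape; only the uuid field matters for deduping.
interface SdkMessage {
  uuid: string;
  payload: unknown;
}

// Tracks which uuids have already been rendered, regardless of channel.
class MessageDeduper {
  private readonly seen = new Set<string>();

  // Returns true the first time a uuid is observed, false for repeats.
  accept(message: SdkMessage): boolean {
    if (this.seen.has(message.uuid)) return false;
    this.seen.add(message.uuid);
    return true;
  }
}

// Usage: both transports feed the same deduper before updating the UI.
const deduper = new MessageDeduper();
const rendered: string[] = [];

function onIncoming(channel: "ws" | "sse", message: SdkMessage): void {
  if (deduper.accept(message)) rendered.push(`${channel}:${message.uuid}`);
}

onIncoming("ws", { uuid: "m1", payload: "hello" });
onIncoming("sse", { uuid: "m1", payload: "hello" }); // dropped as a duplicate
onIncoming("sse", { uuid: "m2", payload: "world" });
console.log(rendered);
```

Whichever channel delivers a message first wins, so the UI does not depend on the relative latency of the two transports.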
The spec body now documents the cookie-only /me/tokens endpoints
(token create/list/revoke) that actually exist in the implementation.
New "现状 (Implementation Status, 2026-05-26)" section captures:
- Endpoint × (spec / impl / test / web-wired) status table
- Web hybrid transport diagram (WS read + SSE read with uuid dedupe;
v1 POST for chat writes; ws.send for operator features)
- List of non-v1 endpoints still active and who uses them
- Known gaps with prioritised TODO (decouple WS read, REST-ify
operator features, add files attachment to v1, idempotency
persistence, etc.)
- Key files + line counts
- Conventions for future edits (spec-first, snake_case at boundary,
metadata invisible to sandbox)
…eway Tokens → API Tokens
- New `server/src/api-v1-openapi.ts`: hand-written OpenAPI 3.1 schema mirroring docs/api-v1.md (loops CRUD, messages SSE, events SSE, choices, interrupt, /me/tokens). Source of truth stays the markdown; this is the machine-readable mirror.
- GET /api/v1/openapi.json: serves the spec
- GET /api/v1/docs: Scalar interactive reference
- Both public (no auth) so doc URLs are shareable
Settings UI:
- Tab id "gateway-tokens" → "api-tokens"
- Label "Gateway Tokens" → "API Tokens"
- Description updated to point at the Loop API
- New "Loop API documentation" card above the token list with an "Open docs →" link to /api/v1/docs
Dep: @scalar/hono-api-reference@0.10.19
…estion, resolved)
Settles 5 rounds of "what is an agent" discussion into a single model: one human can own multiple accounts. Schema delta is two fields (ownerId, chatBotMode). Personal accounts and owned accounts are structurally identical: same vault / .claude / memory / loops machinery. "Agent" stays as a UI-facing label only; the data model has no separate agent type.
Captures: BUC analogue framing, behavioural differences (login / token issuance / loop ownership), UI grouping rules, chat bot mode hook, v1 API surface deltas, implementation sequence (7 ordered steps), historical iteration log (why each prior iteration didn't work), FAQ.
…iseConfigPanel for UI
…rminology
Replace internal "持有账号" (owned account) naming with "公共账号" (public account), the standard term in OA systems (DingTalk, WeCom, Lark, BUC, AWS IAM, GCP service accounts, GitHub bot accounts). All these systems solve the same problem, one human managing multiple accounts with different purposes, and none of them invent a new "non-human entity" type. Loopat aligns.
- Intro table compares 7 OA systems' multi-account patterns
- 公共账号 = "non-personal purpose account with a designated owner" (clarifies it's not multi-owner; nuance vs OA convention captured in FAQ)
- "Agent" framed as a per-context alias of 公共账号 (like DingTalk calls them 机器人 (bots), WeCom calls them 应用账号 (application accounts), AWS calls them Service Accounts)
- UI guidance maps to OA admin-panel conventions (employees see only their own accounts plus the ones they manage; HR/admin see everyone)
- New FAQ entry explaining why no "sub-" prefix: resources are structurally identical, and "sub" suggests a downgrade, which is wrong
Per docs/account-model.md (OA-system "multi-account / 公共账号" model):
Schema:
- User adds optional ownerId field. null = personal account (existing
behavior, password login). Non-null = public account (token-only,
cannot login, cannot nest, cannot self-issue tokens).
- createUser accepts ownerId. Owned accounts: no password, role=member,
status=active immediately, no first-admin bootstrap.
- POST /api/auth/login explicitly rejects owned accounts (same generic
401 to avoid leaking existence).
API surface (v1):
- POST /api/v1/me/accounts { id } — human creates public account
- GET /api/v1/me/accounts — list accounts caller owns
- DELETE /api/v1/me/accounts/{id} — hard delete + cascade tokens
- POST /api/v1/me/tokens body adds optional forAccount: issue token for
a public account caller owns. token resolves to that account.
- GET /api/v1/me/tokens ?forAccount=... — list a public account's tokens
- DELETE /api/v1/me/tokens/{tokenId} — searches across owned accounts
OpenAPI spec + spec doc updated with the new endpoints.
Tests: 15 new in api-v1.test.ts covering create/list/delete + nesting
guard + forAccount happy + 403/404 paths + login-rejects-owned. Full
suite 292 pass / 1 skip.
Web Settings UI (step 6) + e2e (step 7 chain) deferred to next commit.
UI for the public-account model. Lives in Settings as a new "Agents" tab
(marketing label) backed by the account data model:
- Lists ownerId=me accounts via GET /api/v1/me/accounts
- Create form (single-field id input) POSTs /api/v1/me/accounts
- Per-row delete confirm → DELETE /api/v1/me/accounts/{id}
- Pointer to API docs for the next step (issue tokens)
web/src/api.ts: listMyAccounts / createMyAccount / deleteMyAccount + the
PublicAccount type. listApiTokens / createApiToken now accept optional
forAccount (wired for future cross-tab token UI).
e2e: new test verifies the create flow POSTs /api/v1/me/accounts and
the agent shows up in the table.
Terminology lock-in (per discussion): "account" is the canonical system
concept. "user" only refers to the human in conversational text. "Agent"
is the UI/marketing label for ownerId-non-null accounts. Code's `User`
type is historical, kept as-is.
Per the agreed terminology: "account" is the only system concept; "agent" doesn't appear in the system, UI, code, or docs. Was sloppy to leak "Agents" into the Settings tab label.
- Tab id: agents → accounts; URL /settings/agents → /settings/accounts
- Tab label: "Agents" → "Accounts"
- Description: drops "agent" phrasing, says "additional accounts you own"
- Section heading: "Your agents" → "Your accounts"
- Empty state, create form heading, just-created banner, delete tooltip all switch to "account" language
- e2e test renamed and selectors updated
…and ProvidersSection
…feedback on stash behavior
…ner recreate
Adds a regression test for the bug where ensureContainer's 'config hash drift — recreating' path tears down and recreates a loop's sandbox container (SIGKILL 137 on in-flight exec'd processes) when hashCreateArgs drifts between attach and detach.
Flow (no chat → zero AI tokens): create loop from roster1, open terminal to trigger ensureContainer, record the running container's Id/StartedAt/CreatedAt via podman inspect, then detach (back to /loop list, closing the term socket) and re-attach (reopen terminal → ensureContainer again), twice.
Integration truth: SAME container Id and UNCHANGED StartedAt/CreatedAt across every cycle prove no drift, no teardown+recreate.
No product bug found: hashCreateArgs is already stable across attach/detach (extraEnv excluded from the hash; existence-conditional binds stable in the no-chat flow). Test is GREEN.
…ed by integration truth
…se cleanup; full bun run dogfood green
Each failing spec derived loopId from the browser URL, which, with loops accumulating in the shared LOOPAT_HOME, read a STALE loop's id and made the container assertions check the wrong loop. Switch context-notes-sync, first-5-minutes, and multi-turn-task to read the id from the POST /api/v1/loops create RESPONSE (strip loop_ prefix), exactly like second-loop-warm/attach-detach. Add a best-effort test.afterEach in each loop-creating spec that podman rm -f's the loop's sandbox container by loopat.loop-id label so loops/containers don't accumulate between cases.
Product fix (genuine bug surfaced by the shared run): ensureContextWorktree ran 'git worktree add -B <branch> <path>' with no start point for an empty context remote (notes.git has no commits). The per-user clone's HEAD is an unborn ref (main); once the repo gains an unrelated branch (master, pushed by an earlier loop) git stops inferring --orphan and dies with 'fatal: invalid reference: HEAD', failing every subsequent loop create. Pass --orphan explicitly when there is no start point so a startless worktree never depends on a resolvable HEAD.
Full 'bun run dogfood' now 6 passed.
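The worktree fix above comes down to choosing the argument list based on whether a start point exists. A minimal sketch, assuming a Git version that supports `git worktree add --orphan`; the helper name `worktreeAddArgs` is hypothetical and this is not the project's ensureContextWorktree:

```typescript
// Hypothetical helper: builds the argv for `git worktree add`.
// With no start point, --orphan is passed explicitly so git never has to
// resolve HEAD (which is unborn when the context remote has no commits).
function worktreeAddArgs(branch: string, path: string, startPoint?: string): string[] {
  if (startPoint === undefined) {
    return ["worktree", "add", "--orphan", "-B", branch, path];
  }
  // --orphan cannot be combined with a commit-ish, so the two cases stay separate.
  return ["worktree", "add", "-B", branch, path, startPoint];
}

// Usage: print the command for an empty remote and for a populated one.
console.log("git", worktreeAddArgs("main", "/tmp/ctx").join(" "));
console.log("git", worktreeAddArgs("main", "/tmp/ctx", "origin/main").join(" "));
```

Making the orphan case explicit removes the dependence on git's inference, which the commit message reports stops working once the repo gains an unrelated branch.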
…race), retry first ssh push
…reate→keys), own empty setup
…er + v1 loop-create onboarding gate fix
Rebuilds dogfood's first-run as the true first-time-user cold start (per docs/superpowers/specs/2026-06-03-first-run-journey-redesign.md): empty LOOPAT_HOME, fixture git-host provider (mirrors code.ts against the fixture sshd, all endpoints via env), real browser register/login, onboarding gate, personal-repo setup with real git-crypt, ssh-pubkey seed, context clone, loop, real AI turn, terminal git.
Product fix: POST /api/v1/loops (the UI's loop-create path) bypassed the onboarding gate that legacy POST /api/loops enforced; added the same gate.
seed.sh: give the notes bare repo an initial commit so the onboarding ssh-access probe's `git ls-remote --exit-code HEAD` succeeds on access.
…erts notes clones
…first ssh attempt now reliable; drop retry mask
The intermittent 'Permission denied (publickey)' on a sandbox's FIRST git push/clone/ls-remote was masked by a retry loop in first-5-minutes and was conflated with scary boot-time log noise.
Root cause (perms timing): the vault .ssh is git-crypt-decrypted / git-checked-out, so the private key lands at the umask default (0664) and the dir at 0775; git can't carry 0600. loops.ts chmods the key 0600 only 'at point of use' when a HOST git op goes through sshCommandForUser, but a container created/recreated on a path that didn't (server restart -> first attach, config/image-drift recreate) bind-mounts a stale-perms key, so the first in-sandbox ssh can be ignored/rejected; a later host op re-chmods it and the retry succeeds.
Fix: ensureSandboxSshPerms() chmods the vault .ssh dir 0700, private keys + config 0600 in ensureContainer(), the one chokepoint every container start passes through, so perms are correct the instant the bind goes live, regardless of checkout path. Cheap, idempotent, best-effort.
Also: ensureContextRepo's WORKSPACE-DEFAULT clone uses the host's bare ssh (no vault key), so a private ssh:// URL always failed 'Permission denied (publickey)' at boot and fell back to a local origin (harmless, but the raw error read as a loop-auth bug). Demoted to an info log that names it as the benign bootstrap mirror.
Verified: dropped the retry mask from first-5-minutes; full 'bun run dogfood' (6 cases) and 'bun run dogfood:first-run' (REAL git-crypt) both green, with zero 'Permission denied (publickey)' in the logs (was 4/boot).
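A best-effort permission fix of the kind described above might look like the sketch below. It is not the project's ensureSandboxSshPerms: the `planSshPerms` helper is hypothetical, and so is the rule for which files count as private keys or config (here, every regular file except `*.pub` and `known_hosts*`).

```typescript
import { chmodSync, readdirSync } from "node:fs";
import { join } from "node:path";

interface PermFix {
  path: string;
  mode: number;
}

// Pure planning step: decide which modes to apply, without touching disk.
// Assumption: every regular file except *.pub and known_hosts* is either a
// private key or the ssh config and should therefore be 0600.
function planSshPerms(sshDir: string, fileNames: string[]): PermFix[] {
  const fixes: PermFix[] = [{ path: sshDir, mode: 0o700 }];
  for (const name of fileNames) {
    if (name.endsWith(".pub") || name.startsWith("known_hosts")) continue;
    fixes.push({ path: join(sshDir, name), mode: 0o600 });
  }
  return fixes;
}

// Best-effort and idempotent: running it again re-applies the same modes,
// and any filesystem error (e.g. a missing directory) is swallowed.
function ensureSandboxSshPerms(sshDir: string): void {
  try {
    const files = readdirSync(sshDir, { withFileTypes: true })
      .filter((entry) => entry.isFile())
      .map((entry) => entry.name);
    for (const fix of planSshPerms(sshDir, files)) chmodSync(fix.path, fix.mode);
  } catch {
    // Intentionally ignored: container start should not fail on a perms fix.
  }
}
```

Calling such a function from the single place every container start passes through means the modes are already correct when the bind mount goes live, whichever code path created the checkout.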
…git; doctrine: push to origin = done
…n; tier run scripts + README
…one from shared origin)
S6: held-back recovery. Product exposes only save/behind/refresh, no discard/force. take-remote converges B's view; keep-mine impossible (sticky commit, always 409); SoT stays clean.
S7: deletion converges A->origin->B like an edit.
…ck conflict; sync S6/S7 use it
Run server unit tests on both ubuntu and macOS via matrix strategy when PRs target main. Fork PRs from outside collaborators require manual approval through GitHub Actions settings.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Final Verdict: APPROVED
Clean, minimal CI workflow with no security concerns: no secrets are referenced and no untrusted PR event data is interpolated into …
Closing: this PR has merge conflicts with main and cannot be merged as-is.
Summary
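- .github/workflows/ci.yml: automatically runs the server unit tests when a PR targets main
- Runs on both ubuntu-latest and macos-latest
- fail-fast: false ensures that a failure on one platform does not block the other platform's results
Permission gate (requires manual configuration)
Once the workflow file is in place, configure the following in the repo Settings:
Settings → Actions → General → Fork pull request workflows from outside collaborators
→ select "Require approval for all outside collaborators"
Effect: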
Test plan
🤖 Generated with Claude Code