Support the OAP admin-server REST API (swctl admin ...) and adapt to OAP 11.0.0 #228
Merged
Add a first-class admin-host (REST) client and a `swctl admin ...` command tree, alongside the existing GraphQL surface, and adapt to the OAP 11.0.0 breaking changes.

Admin REST surface (default port 17128):
- New global `--admin-url` flag (env SW_ADMIN_URL / config key `admin-url`), derived from `--base-url`'s host with port 17128 when unset.
- New pkg/transport (shared TLS/basic-auth) and pkg/admin/client REST client with a typed error envelope and admin-module preflight detection.
- `swctl admin ...` covering every feature module: preflight; cluster nodes, config dump/ttl, alarm rules/rule (status); inspect metrics/entities; ui-template list/get/create/update/disable; runtime-rule list/bundled/get/add/inactivate/delete/dump; dsl-debug status/sessions/session start|get|stop and oal files/file/rules/rule.

OAP 11.0.0 adaptations:
- alarm list: migrate getAlarm -> queryAlarms, add --layer/--rules filters.
- menu get: detect the retired getMenuItems and report a clear message instead of a raw GraphQL error.

E2E:
- Bump OAP to 11.0.0+, switch storage from Elasticsearch to BanyanDB.
- basic: layer list normalized via `yq sort`; trace cases migrated to trace-v2 (BanyanDB rejects the v1 trace API).
- New `admin` case (static admin REST) and `live-debugging` case (OAL live capture, asserting the captured pipeline is exactly the bound metric).
- New admin-command-tests and live-debugging-tests CI jobs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- live-debugging: drop the per-record `.rule.ruleName` gate-isolation check; the server does not reliably populate the OAL `.rule` envelope. Gate isolation is already verified from the `.dsl` source and the output samples (no output sample for any other metric).
- runtime-rule: send the rule file bytes verbatim instead of `TrimSpace(...) + "\n"`. The API hashes the raw body for contentHash / no-change detection, so normalizing whitespace could make a byte-identical rule look changed.
- admin client: use net.JoinHostPort when deriving the admin URL so IPv6 base URLs are bracketed correctly (http://[::1]:12800 -> http://[::1]:17128); add an IPv6 unit-test case.
- CI: hoist OAP_TAG to a single workflow-level env and drop the single-value matrix from the e2e jobs, so the job names no longer carry the commit SHA.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
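The runtime-rule point above can be illustrated with a small sketch of why whitespace normalization defeats a hash over the raw body. The hash algorithm is not named in this PR; SHA-256 is used here purely as an assumption, and `contentHash` and `normalize` are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentHash stands in for the server-side hash of the request body.
// SHA-256 is an assumption for illustration only.
func contentHash(body []byte) string {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:])
}

// normalize mimics the old client behaviour: trim surrounding whitespace,
// then append exactly one newline. It copies first so the input slice is
// never modified.
func normalize(body []byte) []byte {
	out := append([]byte{}, bytes.TrimSpace(body)...)
	return append(out, '\n')
}

func main() {
	// A rule file that happens to end with a blank line.
	onDisk := []byte("rules:\n  - name: demo\n\n")

	verbatim := contentHash(onDisk)
	normalized := contentHash(normalize(onDisk))

	// Sending the same bytes twice always yields the same hash.
	fmt.Println("verbatim resend unchanged:", verbatim == contentHash(onDisk))
	// A normalized copy hashes differently from the bytes on disk.
	fmt.Println("normalized matches disk:  ", verbatim == normalized)
}
```

Under these assumptions the first line reports `true` and the second `false`: a body that was rewritten before upload no longer hashes like the file it came from, which is the mismatch the verbatim upload avoids.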
kezhenxu94
approved these changes
Jun 3, 2026
This was referenced Jun 3, 2026
Motivation
SkyWalking OAP 11.0.0 introduced an admin-server, a second HTTP surface (default port 17128) separate from the public GraphQL/MQE surface (12800), that bundles the operator-facing feature modules (`status`, `inspect`, `ui-management`, `dsl-debugging`, `receiver-runtime-rule`), all enabled by default. swctl previously spoke only GraphQL and had no concept of the admin host, so operators had to `curl` these endpoints by hand. Several long-standing endpoints (cluster status, effective config, TTL, alarm runtime status) also relocated here in 11.0.0 and were never wrapped by swctl.

This PR gives swctl a first-class admin-host REST client and a `swctl admin ...` command tree covering every admin feature module, and adapts the existing commands to the OAP 11.0.0 breaking changes.

Admin REST surface (default port 17128)
- `--admin-url` flag (env `SW_ADMIN_URL`, config key `admin-url`), derived from `--base-url`'s host with port `17128` when unset. `--username` / `--password` / `--authorization` / `--insecure` apply to it the same way.
- `pkg/transport` (shared TLS / basic-auth, factored out of the GraphQL client) and `pkg/admin/client` (REST client with a typed error envelope) + `pkg/admin/preflight` (admin-module feature detection via `/debugging/config/dump`, with friendly "module not enabled / admin unreachable" messages).
- `swctl admin ...` commands, one group per module:
  - `admin preflight`
  - `admin cluster nodes`, `admin config dump|ttl`, `admin alarm rules|rule`
  - `admin inspect metrics|entities`
  - `admin ui-template list|get|create|update|disable`
  - `admin runtime-rule list|bundled|get|add|inactivate|delete|dump` (raw-YAML upload, `X-Sw-*` / ETag / 304, tar.gz)
  - `admin dsl-debug status|sessions|session start|get|stop` and `admin oal files|file|rules|rule`

OAP 11.0.0 adaptations
- `alarm list`: migrate the deprecated `getAlarm` → `queryAlarms`, adding `--layer` and `--rules` filters.
- `menu get`: detect the retired `getMenuItems` query and report a clear message ("OAP 11.0.0+ no longer serves the UI menu …") instead of a raw GraphQL error.

E2E
- `basic`: `layer list` made order-insensitive via `yq sort`; the trace cases migrated to `trace-v2` (BanyanDB rejects the v1 trace API: "BanyanDB Trace Model changed, please use queryTraces").
- `admin` case (static admin REST) and `live-debugging` case (OAL live capture driving `admin dsl-debug session`; asserts the captured pipeline is exactly the bound metric: correct source → `cpm()` → output, with per-metric gate isolation).
- `admin-command-tests` and `live-debugging-tests` CI jobs.

Verification
All three e2e suites were run locally against the bumped OAP + BanyanDB and pass: `basic` (27 checks), `admin` (11 checks), `live-debugging` (OAL capture with exact-metric assertions), plus manual write round-trips against the live backend (ui-template CRUD, runtime-rule add/inactivate/delete lifecycle). `go build` / `go vet` / `go test` / `golangci-lint` (0 issues) / `license-eye` all green.

Compatibility
Adds a `>= 0.15.0 → >= 11.0.0` row to the compatibility table: the admin commands and `queryAlarms` require OAP 11.0.0+.

🤖 Generated with Claude Code