infra: agent-friendly docs spec v0.5.0 compliance gaps #181

@marc0olo

Background

Ran npx afdocs check https://docs.internetcomputer.org against spec v0.5.0 (2026-04-25).
Result: 14 passed, 1 warning, 6 failed, 2 skipped or not applicable, out of 23 checks.

Full spec: https://agentdocsspec.com/spec/

CLI output

content-discoverability
  ✓ llms-txt-exists
  ✓ llms-txt-valid
  ✓ llms-txt-size: 26,425 chars
  ⚠ llms-txt-links-resolve: 48/50 resolve (2 broken)
  ✓ llms-txt-links-markdown: 50/50
  ✓ llms-txt-directive-html: found in all 49 sampled pages
  ✗ llms-txt-directive-md: not found in any of 50 sampled pages

markdown-availability
  ✓ markdown-url-support: 50/50
  ✗ content-negotiation: server ignores Accept: text/markdown (0/50)

page-size
  ✓ rendering-strategy
  ✗ page-size-markdown: 1 page exceeds 100K chars (max 480K)
  ✗ page-size-html: 1 page converts to 524K markdown (79% boilerplate)
  ✗ content-start-position: 34/50 pages have content starting past 50% (worst 91%)

content-structure
  ✓ tabbed-content-serialization
  ○ section-header-quality: skipped
  ✓ markdown-code-fence-validity: 691 fences ok

url-stability
  ✓ http-status-codes
  ✓ redirect-behavior

observability
  ✓ llms-txt-coverage: 100% of 152 sitemap pages
  ✗ markdown-content-parity: 1 page has substantive differences
  ✓ cache-header-hygiene

authentication
  ✓ auth-gate-detection
  ○ auth-alternative-access: n/a

Action items

P1 — Easy wins (code changes in plugins/astro-agent-docs.mjs and a new public/_headers)

  • llms-txt-directive-md — inject > For the complete documentation index, see [llms.txt](/llms.txt) near the top of every generated .md file (inside cleanMarkdown()). New in spec v0.5.0.
  • content-negotiation — add a public/_headers file that sets Content-Type: text/markdown; charset=utf-8 for /*.md responses. Fixing this may also resolve content-start-position as a side effect: once the checker gets text/markdown responses it will use our clean .md files (which start with content immediately) instead of converting Starlight's nav-heavy HTML.
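
A minimal sketch of the llms-txt-directive-md fix, assuming the directive can be added as a plain string transform. The helper name injectLlmsDirective and its placement rule (after a leading H1, otherwise at the very top) are assumptions; the real cleanMarkdown() in plugins/astro-agent-docs.mjs may be structured differently.

```javascript
// Directive text taken from the action item above.
const LLMS_DIRECTIVE =
  "> For the complete documentation index, see [llms.txt](/llms.txt)";

// Hypothetical helper, intended to be called from cleanMarkdown().
// Inserts the directive right after a leading H1, or at the very top
// when the file does not start with one. Safe to call more than once.
function injectLlmsDirective(markdown) {
  if (markdown.includes(LLMS_DIRECTIVE)) return markdown;

  const lines = markdown.split("\n");
  const first = lines.findIndex((line) => line.trim() !== "");

  // Only treat the first non-empty line as a title, so that "# comment"
  // lines inside code fences further down are never mistaken for an H1.
  if (first !== -1 && /^#\s+\S/.test(lines[first])) {
    lines.splice(first + 1, 0, "", LLMS_DIRECTIVE);
    return lines.join("\n");
  }
  return `${LLMS_DIRECTIVE}\n\n${markdown}`;
}
```

Keeping the helper idempotent means a page that already carries the directive is left unchanged, which should help avoid new markdown-content-parity drift on rebuilds.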

P2 — Identify and fix the oversized page

  • page-size-markdown / page-size-html — find the page producing 480K chars and either split it or trim content. Likely candidates: IC interface spec or management canister reference.
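
One way to locate the oversized page is to measure every generated .md file against the 100K limit reported by the checker. A rough sketch, assuming the generated markdown is already available as an object mapping path to content; findOversizedPages and the example paths are hypothetical, and String.length counts UTF-16 code units, which may differ slightly from how the checker counts characters.

```javascript
// Limit taken from the page-size-markdown message in the CLI output.
const MAX_MARKDOWN_CHARS = 100_000;

// Hypothetical helper: returns pages above the limit, largest first.
// `pages` maps a page path to its generated markdown string.
function findOversizedPages(pages, limit = MAX_MARKDOWN_CHARS) {
  return Object.entries(pages)
    .map(([path, markdown]) => ({ path, chars: markdown.length }))
    .filter((page) => page.chars > limit)
    .sort((a, b) => b.chars - a.chars);
}

// Example with made-up paths and sizes.
const report = findOversizedPages({
  "/small.md": "x".repeat(10),
  "/medium.md": "x".repeat(100_001),
  "/huge.md": "x".repeat(480_000),
});
console.log(report.map((page) => `${page.path}: ${page.chars}`));
```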

P3 — Investigate after P1 is shipped

  • content-start-position — re-run checker after fixing content-negotiation. If the checker then uses .md files for this check, the 34 failures should drop to 0. If not, needs further investigation into Starlight's HTML structure.
  • markdown-content-parity — identify which page has content drift and fix the stripMdx() output.
  • llms-txt-links-resolve (warning) — find and fix the 2 broken .md links in llms.txt.
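
For the llms-txt-links-resolve warning, the two broken links could be found by extracting every link from llms.txt and requesting each one. A sketch under assumptions: both function names are hypothetical, the regex only handles simple inline [title](url) links, the global fetch requires Node 18 or later, and some hosts may answer HEAD differently from GET.

```javascript
// Extract inline markdown links of the form [title](url).
function extractMarkdownLinks(llmsTxt) {
  const links = [];
  const pattern = /\[([^\]]*)\]\(([^)\s]+)\)/g;
  for (const match of llmsTxt.matchAll(pattern)) {
    links.push({ title: match[1], url: match[2] });
  }
  return links;
}

// Request each link and collect the ones that do not return a 2xx status.
// `baseUrl` is used to resolve relative links such as /guides/foo.md.
async function findBrokenLinks(llmsTxt, baseUrl) {
  const broken = [];
  for (const { url } of extractMarkdownLinks(llmsTxt)) {
    const absolute = new URL(url, baseUrl).href;
    try {
      const response = await fetch(absolute, { method: "HEAD" });
      if (!response.ok) broken.push({ url: absolute, status: response.status });
    } catch (error) {
      // Network-level failure (DNS, TLS, timeout): no status available.
      broken.push({ url: absolute, status: null });
    }
  }
  return broken;
}
```

Usage would be along the lines of findBrokenLinks(text, "https://docs.internetcomputer.org"), where text is the downloaded content of /llms.txt.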

Notes

  • content-start-position failing for 34/50 pages is most likely because the checker is converting Starlight's HTML (which has full sidebar nav before <main>) rather than using our clean .md files. Fixing content-negotiation is the right lever to pull first.
  • Branch convention for this work: infra/agent-docs-spec-compliance
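
On the content-negotiation point above: if setting Content-Type on /*.md responses turns out not to satisfy the check, the server may need to choose a representation per request based on the Accept header. The sketch below shows only that decision logic. prefersMarkdown and parseAccept are hypothetical helpers, and wiring them into the hosting platform (for example an edge function or middleware) is an assumption outside the scope of this issue.

```javascript
// Parse an Accept header into { type, q } entries. Missing or invalid
// q values default to 1, the default quality defined for HTTP.
function parseAccept(header) {
  return (header || "")
    .split(",")
    .map((part) => {
      const [type, ...params] = part.trim().split(";");
      let q = 1;
      for (const param of params) {
        const [key, value] = param.trim().split("=");
        if (key === "q") q = Number.parseFloat(value);
      }
      return { type: type.trim().toLowerCase(), q: Number.isNaN(q) ? 1 : q };
    })
    .filter((entry) => entry.type !== "");
}

// True when the client lists text/markdown with a quality that is
// non-zero and at least as high as text/html.
function prefersMarkdown(acceptHeader) {
  const entries = parseAccept(acceptHeader);
  const qualityOf = (type) => {
    const entry = entries.find((candidate) => candidate.type === type);
    return entry ? entry.q : 0;
  };
  const markdownQuality = qualityOf("text/markdown");
  return markdownQuality > 0 && markdownQuality >= qualityOf("text/html");
}
```

A request handler could then serve the page's .md file when prefersMarkdown(request.headers.get("accept")) is true and fall back to the HTML page otherwise.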

Labels: enhancement
