⚡ Copilot Token Optimization 2026-04-15 — build-test.md #2002

@github-actions

Target Workflow: build-test.md

Source report: #2000
Estimated cost per run: N/A (token counts from AWF api-proxy; cost not populated)
Total tokens per run (avg): ~621K (4,344K / 7 data runs)
Cache hit rate: 88.5% — already excellent
LLM turns (avg): 13.3 requests/run (range: 10–18)
Model: claude-sonnet-4.6 via Copilot
Total daily token spend: 4,344K (most expensive workflow today)


Current Configuration

| Setting | Value |
|---|---|
| Tools loaded | bash: ["*"] + github: MCP (all tools, ~22) |
| GitHub toolsets | None specified (loads full default toolset) |
| Network groups | 12 groups: defaults, github, node, go, rust, crates.io, java, dotnet, bun.sh, deno.land, jsr.io, dl.deno.land |
| Runtimes | node 20, go 1.22, rust stable, java 21, dotnet 8.0 |
| Pre-agent steps | ❌ None |
| Prompt size | 8,592 bytes / 203 body lines |
| Avg tokens/request | ~46.7K (large growing context) |
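
The per-run and per-request averages quoted in this report can be reproduced from the reported totals. The sketch below is only a sanity check: the three input figures are copied from the report, and the function names are illustrative rather than part of any workflow tooling.

```shell
#!/usr/bin/env bash
# Figures copied from the report above (token counts in thousands).
total_tokens_k=4344   # total daily tokens across all data runs
data_runs=7           # number of runs with token data
avg_requests=13.3     # average LLM requests per run

# Average tokens per run, in thousands (awk handles the float division).
avg_tokens_per_run() {
  awk -v t="$total_tokens_k" -v r="$data_runs" \
    'BEGIN { printf "%.1f\n", t / r }'
}

# Average tokens per request, in thousands.
avg_tokens_per_request() {
  awk -v t="$total_tokens_k" -v r="$data_runs" -v q="$avg_requests" \
    'BEGIN { printf "%.1f\n", t / r / q }'
}

echo "avg tokens/run:     $(avg_tokens_per_run)K"      # 620.6K
echo "avg tokens/request: $(avg_tokens_per_request)K"  # 46.7K
```

Both values agree with the rounded figures in the report (~621K per run, ~46.7K per request).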

Root cause of high token count: All setup operations (2 runtime installs, 8 repo clones, 1 Maven config) are invoked by the LLM as bash tool calls, adding accumulated context to every subsequent turn. With no pre-agent steps, each of these ~11 deterministic operations consumes LLM context budget unnecessarily.


Recommendations

1. Move Deterministic Setup to steps: (Pre-Agent)

Estimated savings: ~190–250K tokens/run (~30–40% reduction)

The agent currently invokes bash for all of these operations inside the conversation:

  • Install Bun via curl (1+ LLM turns)
  • Install Deno via curl (1+ LLM turns)
  • gh repo clone for 8 repositories (up to 8 LLM turns)
  • Create ~/.m2/settings.xml for Maven (1 LLM turn)

These 11 operations are fully deterministic — they never need LLM reasoning. Moving them to steps: means:

  1. They execute before the agent starts (no LLM turns consumed)
  2. Their output does not accumulate in the conversation history
  3. Subsequent turns have a smaller context, reducing tokens on every later call

Token savings mechanics: At ~47K tokens/request and ~13 turns/run, eliminating 4–6 early turns saves tokens in two ways: the removed requests themselves (~200K) and a smaller context on every remaining turn (~30–50K). Together that is ~230–250K tokens/run, the upper end of the estimate above.
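
To make the compounding effect concrete, here is a deliberately simplified model. It is an assumption for illustration, not something measured in the report: each request sends a fixed context of B tokens plus a history that grows by d tokens per completed turn.

```latex
% Simplified model (assumption, not measured data):
%   B = fixed context per request (system prompt, tool definitions, task prompt)
%   d = tokens each completed turn adds to the conversation history
%   n = turns per run, k = setup turns moved out of the agent
\begin{align*}
T(n) &= \sum_{i=1}^{n} \bigl(B + (i-1)\,d\bigr) = nB + d\,\frac{n(n-1)}{2} \\[4pt]
T(n) - T(n-k) &= kB + d\,\frac{k\,(2n-k-1)}{2}
\end{align*}
```

Under this model the saving has a part that is linear in k (the fixed context of the removed requests) and a part that grows with both k and n (history that is never accumulated), which is why removing setup turns pays off more on longer runs.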

Implementation — add a steps: block to the YAML frontmatter:

steps:
  - name: Install Bun
    run: |
      curl -fsSL (bun.sh/redacted) | bash
      echo "BUN_INSTALL=$HOME/.bun" >> "$GITHUB_ENV"
      echo "$HOME/.bun/bin" >> "$GITHUB_PATH"

  - name: Install Deno
    run: |
      curl -fsSL (deno.land/redacted) | sh
      echo "DENO_INSTALL=$HOME/.deno" >> "$GITHUB_ENV"
      echo "$HOME/.deno/bin" >> "$GITHUB_PATH"

  - name: Clone test repositories
    env:
      GH_TOKEN: ${{ secrets.GH_AW_GITHUB_MCP_SERVER_TOKEN }}
    run: |
      RESULTS=""
      for spec in \
        "Mossaka/gh-aw-firewall-test-bun:/tmp/test-bun:Bun" \
        "Mossaka/gh-aw-firewall-test-cpp:/tmp/test-cpp:C++" \
        "Mossaka/gh-aw-firewall-test-deno:/tmp/test-deno:Deno" \
        "Mossaka/gh-aw-firewall-test-dotnet:/tmp/test-dotnet:.NET" \
        "Mossaka/gh-aw-firewall-test-go:/tmp/test-go:Go" \
        "Mossaka/gh-aw-firewall-test-java:/tmp/test-java:Java" \
        "Mossaka/gh-aw-firewall-test-node:/tmp/test-node:Node.js" \
        "Mossaka/gh-aw-firewall-test-rust:/tmp/test-rust:Rust"; do
        repo=$(echo "$spec" | cut -d: -f1)
        path=$(echo "$spec" | cut -d: -f2)
        name=$(echo "$spec" | cut -d: -f3)
        if gh repo clone "$repo" "$path" 2>/dev/null; then
          RESULTS+="$name: cloned ✅\n"
        else
          RESULTS+="$name: CLONE_FAILED ❌\n"
        fi
      done
      printf "%b" "$RESULTS"
      echo "CLONE_RESULTS<<EOF" >> "$GITHUB_ENV"
      printf "%b" "$RESULTS" >> "$GITHUB_ENV"
      echo "EOF" >> "$GITHUB_ENV"

  - name: Configure Maven proxy
    run: |
      mkdir -p ~/.m2
      cat > ~/.m2/settings.xml << 'SETTINGS'
      <settings>
        <proxies>
          <proxy>
            <id>awf-http</id><active>true</active><protocol>http</protocol>
            <host>squid-proxy</host><port>3128</port>
          </proxy>
          <proxy>
            <id>awf-https</id><active>true</active><protocol>https</protocol>
            <host>squid-proxy</host><port>3128</port>
          </proxy>
        </proxies>
      </settings>
      SETTINGS

Update the prompt to reference the pre-cloned paths and to report the clone outcomes to the agent. Replace the per-task clone instructions with:

**SETUP ALREADY DONE (pre-agent steps):**
- Bun is installed at `$BUN_INSTALL/bin/bun`
- Deno is installed at `$DENO_INSTALL/bin/deno`
- Maven `~/.m2/settings.xml` is configured
- Clone results: ${{ env.CLONE_RESULTS }}

Test repos are at `/tmp/test-{bun,cpp,deno,dotnet,go,java,node,rust}`.
For any ecosystem marked CLONE_FAILED, record that status and skip its tests.

This also lets the agent skip CLONE_FAILED ecosystems without attempting the clone itself, which currently costs one bash turn per failed ecosystem.
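
As an illustration of how the clone status could be consumed, the sketch below extracts the failed ecosystems from a CLONE_RESULTS-shaped string. The helper name `failed_ecosystems` and the sample input are hypothetical; only the "name: status" line format comes from the clone step above.

```shell
#!/usr/bin/env bash
# Print the name of every ecosystem whose clone failed, one per line.
# $1: multi-line CLONE_RESULTS value, one "<name>: <status>" entry per line.
failed_ecosystems() {
  printf '%s\n' "$1" | awk -F': ' '/CLONE_FAILED/ { print $1 }'
}

# Hypothetical sample in the same shape the clone step writes.
sample=$'Bun: cloned ✅\nJava: CLONE_FAILED ❌\nRust: cloned ✅'

failed_ecosystems "$sample"   # prints: Java
```

Splitting on the literal ": " separator keeps names such as `.NET`, `C++`, and `Node.js` intact.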


2. Remove GitHub MCP Toolset (or restrict with toolsets:)

Estimated savings: ~15–20K tokens/run (~2–3%)

The github: tool loads the full GitHub MCP server (~22 tools). Looking at what the workflow actually needs:

  • Repo cloning → gh repo clone in bash (no MCP needed)
  • Posting PR comment → safe-outputs: add-comment (no MCP needed)
  • Adding label → safe-outputs: add-labels (no MCP needed)
  • Detecting PR vs workflow_dispatch trigger → available as ${{ github.event_name }} in pre-steps

Option A (preferred): Remove github: tools entirely.

# Before:
tools:
  bash:
    - "*"
  github:
    github-token: "${{ secrets.GH_AW_GITHUB_MCP_SERVER_TOKEN }}"

# After:
tools:
  bash:
    - "*"

And pass the trigger context to the agent via a pre-step:

steps:
  - name: Set trigger context
    run: |
      echo "TRIGGER_EVENT=${{ github.event_name }}" >> "$GITHUB_ENV"
      echo "PR_NUMBER=${{ github.event.pull_request.number || '' }}" >> "$GITHUB_ENV"

Then update the prompt's conditional label logic to reference $TRIGGER_EVENT and $PR_NUMBER.
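
A minimal sketch of that conditional, assuming the PR trigger reports the event name `pull_request` and that labeling should happen only when a PR number is present. The function name `should_label_pr` is illustrative and does not exist in the workflow.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) only when the run was triggered by a pull request
# and a PR number is available; workflow_dispatch runs return non-zero.
should_label_pr() {
  [ "${TRIGGER_EVENT:-}" = "pull_request" ] && [ -n "${PR_NUMBER:-}" ]
}

if should_label_pr; then
  echo "label PR #${PR_NUMBER}"
else
  echo "skip labeling"
fi
```

With neither variable set, as on a manual dispatch, the script prints `skip labeling`.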

Option B (minimal change): Restrict with toolsets:

If GitHub MCP is needed for any PR reads, restrict scope:

tools:
  bash:
    - "*"
  github:
    github-token: "${{ secrets.GH_AW_GITHUB_MCP_SERVER_TOKEN }}"
    toolsets: [pull_requests]

This reduces from ~22 tools to ~5, saving ~10K tokens/turn in system prompt.


3. Trim Network Groups to Actually-Needed Ones

Estimated savings: ~0 tokens (runtime cost, not token cost) — but reduces attack surface

Currently loads 12 network groups. The builds that run:

  • Bun → bun.sh ✅ needed
  • C++ → no package manager network needed (cmake + local sources) — may not need node group
  • Deno → deno.land, jsr.io, dl.deno.land ✅ needed
  • .NET → dotnet ✅ needed
  • Go → go ✅ needed
  • Java (Maven) → java ✅ needed
  • Node.js → node ✅ needed
  • Rust (Cargo) → rust, crates.io ✅ needed

Network groups don't directly affect token counts, but removing unused ones (bun.sh for deno-only tasks, etc.) reduces the potential exfiltration surface.


Expected Impact

| Metric | Current | Projected | Savings |
|---|---|---|---|
| Total tokens/run | ~621K | ~380–430K | −31–39% |
| LLM turns/run | ~13.3 | ~7–9 | −4–6 turns |
| Effective tokens/run | ~694K | ~430–480K | −31–38% |
| Daily total (7 runs) | 4,344K | ~2,700–3,000K | ~1,300–1,600K saved/day |
| Input:Output ratio | 99:1 | ~99:1 | (unchanged) |
| Cache hit rate | 88.5% | ~88–90% | (stable) |

The high cache rate (88.5%) means further cache tuning offers diminishing returns. The primary lever is reducing the number of LLM turns by moving deterministic work out of the agent.
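
The percentage and daily projections follow directly from the per-run figures. The sketch below recomputes them for the two ends of the projected range; the baseline and run count are taken from this report, while the helper names are illustrative.

```shell
#!/usr/bin/env bash
baseline_k=621    # current avg tokens/run, in thousands (from the report)
runs_per_day=7    # data runs per day (from the report)

# Percentage reduction from the baseline for a projected tokens/run value.
pct_reduction() {
  awk -v b="$baseline_k" -v p="$1" \
    'BEGIN { printf "%.0f\n", (1 - p / b) * 100 }'
}

# Projected daily total (thousands) for a projected tokens/run value.
daily_total() {
  awk -v p="$1" -v n="$runs_per_day" 'BEGIN { printf "%d\n", p * n }'
}

for projected in 380 430; do
  echo "${projected}K/run: -$(pct_reduction "$projected")%, $(daily_total "$projected")K/day"
done
```

This yields reductions of 39% and 31% and daily totals of 2,660K and 3,010K, matching the rounded ranges in the summary.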


Implementation Checklist

  • Add steps: block with: Install Bun, Install Deno, Clone all 8 repos, Configure Maven
  • Update each task's instructions to remove the clone/install commands (repos are pre-cloned)
  • Add clone result status to agent prompt via ${{ env.CLONE_RESULTS }}
  • Remove github: from tools: (or restrict with toolsets: [pull_requests])
  • Add TRIGGER_EVENT and PR_NUMBER env vars to pre-steps for conditional label logic
  • Recompile: gh aw compile .github/workflows/build-test.md
  • Post-process: npx tsx scripts/ci/postprocess-smoke-workflows.ts
  • Trigger a test PR and compare turn count and token usage against baseline (target: <9 turns/run)
  • Monitor for 3 runs; update baseline in next token usage report
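
For the monitoring step, a small check like the following could confirm whether the observed runs meet the turn target. It is a sketch only: the function name is hypothetical and the turn counts in the usage line are made-up sample values, not measurements.

```shell
#!/usr/bin/env bash
# Average the per-run turn counts passed as arguments and succeed only
# if the mean is below the checklist target of 9 turns/run.
meets_turn_target() {
  [ "$#" -gt 0 ] || return 2   # no runs supplied
  awk -v target=9 'BEGIN {
    for (i = 1; i < ARGC; i++) sum += ARGV[i]
    exit !((sum / (ARGC - 1)) < target)
  }' "$@"
}

# Hypothetical turn counts from three post-change runs.
if meets_turn_target 8 7 9; then
  echo "target met"
else
  echo "target missed"
fi
```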

Generated by Daily Copilot Token Optimization Advisor · Source: #2000