Add dynamic browser runtime capture proof annotations #114

Open
PatrickSys wants to merge 1 commit into main from feat/dynamic-workflows-runtime-capture
@PatrickSys

Owner

Summary

  • Adds optional UI proof runtime_capture_requirements and runtime_capture metadata for dynamic browser evidence workflows.
  • Validates provider-neutral capture modes, availability, budgets, provider fallback, artifact refs, and slot-linked captures without adding browser tooling.
  • Updates workflows, agent roles, docs, plan evidence, and synthetic fixtures while keeping raw browser artifacts local-only.

Public-safety notes

  • No raw screenshot, DOM, trace, video, or browser output artifact is tracked.
  • Private goal.md remains local and untracked; committed docs do not reference it.
  • agent-browser remains the default live UI proof path; direct-CDP and MCP paths remain explicit escalation metadata only.

Verification

  • rtk npm test
  • rtk node bin/gsdd.mjs ui-proof compare fixtures/ui-proof/browser-runtime-capture-slots.json fixtures/ui-proof/browser-runtime-capture-bundle.json
  • rtk git diff --check
  • rtk git diff --cached --check

@gemini-code-assist (bot) left a comment

Code Review

This pull request adds benchmarkable browser-runtime capture annotations to Workspine's UI proof contract, allowing execution and verification to record browser evidence, provider paths, costs, and budgets without requiring browser infrastructure. It introduces optional runtime_capture_requirements for planned slots and runtime_capture metadata for observed bundles, alongside comprehensive validation, comparison, and verification logic. The documentation, agent roles, and tests have been updated to support these changes. One medium-severity logic issue was found in bin/lib/ui-proof.mjs where the aggregation of screenshot_count can produce incorrect totals when mixing explicit and implicit counts or when explicitly set to zero.

Comment thread bin/lib/ui-proof.mjs
Comment on lines +721 to +724
if (metric === 'screenshot_count') {
  const explicitCount = captures.reduce((sum, capture) => sum + (typeof capture.screenshot_count === 'number' ? capture.screenshot_count : 0), 0);
  return explicitCount > 0 ? explicitCount : captures.filter((capture) => capture.mode === 'screenshot').length;
}

Severity: medium

There is a logic issue in how screenshot_count is aggregated when there is a mix of explicit and implicit screenshot counts, or when screenshot_count is explicitly set to 0.

Specifically:

  1. If a capture has mode: 'screenshot' and screenshot_count: 0 explicitly, explicitCount will be 0. Since explicitCount > 0 is false, the function falls back to counting the number of screenshot captures, returning 1 instead of 0.
  2. If one capture has no explicit count (implicit 1) and another has screenshot_count: 2, explicitCount is 2. Since explicitCount > 0 is true, the function returns 2, completely ignoring the implicit screenshot from the first capture.

We can resolve this by cleanly accumulating the count for each capture: using screenshot_count if it is a number, otherwise defaulting to 1 if the mode is 'screenshot'.

Suggested change
- if (metric === 'screenshot_count') {
-   const explicitCount = captures.reduce((sum, capture) => sum + (typeof capture.screenshot_count === 'number' ? capture.screenshot_count : 0), 0);
-   return explicitCount > 0 ? explicitCount : captures.filter((capture) => capture.mode === 'screenshot').length;
- }
+ if (metric === 'screenshot_count') {
+   return captures.reduce((sum, capture) => {
+     if (typeof capture.screenshot_count === 'number') {
+       return sum + capture.screenshot_count;
+     }
+     return sum + (capture.mode === 'screenshot' ? 1 : 0);
+   }, 0);
+ }
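
The two cases listed above can be reproduced with a small standalone sketch. The wrapper names `countScreenshotsOriginal` and `countScreenshotsSuggested` are hypothetical (the real code lives inside a larger function in bin/lib/ui-proof.mjs), and the capture objects are made-up inputs rather than fixtures from the repository.

```javascript
// Hypothetical wrapper around the aggregation quoted from the diff.
function countScreenshotsOriginal(captures) {
  const explicitCount = captures.reduce(
    (sum, capture) => sum + (typeof capture.screenshot_count === 'number' ? capture.screenshot_count : 0),
    0
  );
  // Falls back to counting screenshot-mode captures only when the explicit sum is 0.
  return explicitCount > 0 ? explicitCount : captures.filter((capture) => capture.mode === 'screenshot').length;
}

// Hypothetical wrapper around the suggested per-capture accumulation.
function countScreenshotsSuggested(captures) {
  return captures.reduce((sum, capture) => {
    if (typeof capture.screenshot_count === 'number') {
      return sum + capture.screenshot_count;
    }
    // No explicit count: a screenshot-mode capture counts as one screenshot.
    return sum + (capture.mode === 'screenshot' ? 1 : 0);
  }, 0);
}

// Case 1: a screenshot capture with an explicit count of 0.
const explicitZero = [{ mode: 'screenshot', screenshot_count: 0 }];
// Case 2: one implicit screenshot plus one capture with screenshot_count: 2.
const mixed = [{ mode: 'screenshot' }, { mode: 'screenshot', screenshot_count: 2 }];

console.log(countScreenshotsOriginal(explicitZero), countScreenshotsSuggested(explicitZero)); // 1 0
console.log(countScreenshotsOriginal(mixed), countScreenshotsSuggested(mixed)); // 2 3
```

With per-capture accumulation, an explicit 0 stays 0 and the implicit screenshot in the mixed case is no longer dropped.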

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 417bee7c96

Comment thread bin/lib/ui-proof.mjs
Comment on lines +722 to +723
const explicitCount = captures.reduce((sum, capture) => sum + (typeof capture.screenshot_count === 'number' ? capture.screenshot_count : 0), 0);
return explicitCount > 0 ? explicitCount : captures.filter((capture) => capture.mode === 'screenshot').length;

P2: Count implicit screenshots when enforcing screenshot budgets

When a slot has screenshot_count_max and the observed captures mix explicit and implicit screenshot counts, this returns only the sum of explicit screenshot_count values and ignores additional mode: "screenshot" captures without that field. For example, one screenshot capture with screenshot_count: 1 plus two screenshot captures that omit the field is counted as 1 instead of 3, so an over-budget proof can still compare as satisfied.
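
A minimal sketch of how the undercount lets an over-budget proof pass, assuming the budget check is a simple "observed count <= screenshot_count_max" comparison. The helper `withinScreenshotBudget` and both counter functions are hypothetical names for illustration; only the aggregation logic in `sumExplicitWithFallback` is taken from the reviewed lines.

```javascript
// Aggregation as quoted in the reviewed lines (wrapped in a hypothetical function).
function sumExplicitWithFallback(captures) {
  const explicitCount = captures.reduce(
    (sum, capture) => sum + (typeof capture.screenshot_count === 'number' ? capture.screenshot_count : 0),
    0
  );
  return explicitCount > 0 ? explicitCount : captures.filter((capture) => capture.mode === 'screenshot').length;
}

// Per-capture counting: explicit count if present, otherwise 1 per screenshot-mode capture.
function sumPerCapture(captures) {
  return captures.reduce((sum, capture) => {
    if (typeof capture.screenshot_count === 'number') {
      return sum + capture.screenshot_count;
    }
    return sum + (capture.mode === 'screenshot' ? 1 : 0);
  }, 0);
}

// Hypothetical budget check; the real comparison logic may differ.
function withinScreenshotBudget(captures, screenshotCountMax, countFn) {
  return countFn(captures) <= screenshotCountMax;
}

// The example from the comment: one explicit screenshot_count: 1 plus two implicit screenshots.
const observed = [
  { mode: 'screenshot', screenshot_count: 1 },
  { mode: 'screenshot' },
  { mode: 'screenshot' },
];

// Assumed budget of 2 screenshots for the slot.
console.log(withinScreenshotBudget(observed, 2, sumExplicitWithFallback)); // true (counted as 1)
console.log(withinScreenshotBudget(observed, 2, sumPerCapture)); // false (counted as 3)
```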
