Summary
Add an optional self-review phase to the agent pipeline that runs between agent execution and post-hooks. The LLM critiques its own cumulative diff before the PR is created, improving first-pass PR quality by catching bugs, style issues, edge cases, and test gaps that would otherwise go straight to human review.
Motivation
Post-hooks in pipeline.py run deterministic checks (build, lint) after the agent finishes, but the LLM never sees its own cumulative diff before the PR is created. A self-review phase closes that gap: the model critiques its own diff and iterates on fixes before the post-hooks run.
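Acceptance Criteria
- agent/src/self_review.py module with run_self_review() orchestration function
- agent/src/prompts/self_review.py with focused review prompt template
- TaskConfig extended with self_review_enabled (default False) and self_review_max_turns (default 5) fields
- build_config(), get_config(), server.py, and pipeline.py
- pr_review task type, no diff, no remaining turns/budget
- max_turns allocation (capped at self_review_max_turns)
- self_review_started, self_review_complete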
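A minimal sketch of how the config fields, the skip conditions, and the turn cap from the acceptance criteria might fit together. Only self_review_enabled, self_review_max_turns, max_turns, and the pr_review task type come from this issue; the TaskConfig shape, the task_type field, the default of 30 turns, and both helper functions are assumptions made for illustration. The cost budget is left out because the issue does not say how it is tracked.

```python
from dataclasses import dataclass


@dataclass
class TaskConfig:
    """Hypothetical stand-in; only the two self_review_* fields are from the issue."""

    task_type: str = "new_task"
    max_turns: int = 30  # assumed overall turn allocation
    self_review_enabled: bool = False
    self_review_max_turns: int = 5


def review_turn_budget(config: TaskConfig, turns_used: int) -> int:
    """Turns left over from max_turns, capped at self_review_max_turns."""
    remaining = max(config.max_turns - turns_used, 0)
    return min(remaining, config.self_review_max_turns)


def should_run_self_review(config: TaskConfig, diff: str, turns_used: int) -> bool:
    """Skip when disabled, for pr_review tasks, with no diff, or with no turns left."""
    if not config.self_review_enabled:
        return False
    if config.task_type == "pr_review":
        return False
    if not diff.strip():
        return False
    return review_turn_budget(config, turns_used) > 0
```

With these assumed defaults, an agent that has used 28 of 30 turns would get 2 review turns, and one that has used 10 would get the full cap of 5.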
Design
See the approved plan in the implementation PR for full design details including:
- Second run_agent() call with fresh review context
- Diff truncation at hunk boundary (60k char cap)
- Budget/turn computation from remaining allocation
- Fail-open error handling
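Two of the design points above, truncation at a hunk boundary and fail-open handling, can be sketched as follows. This is one possible reading, not the approved plan: the function names, the use of "@@" and "diff --git" line prefixes to detect boundaries, and the callable passed to the wrapper are all assumptions. Only the 60k character cap is taken from the list.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

MAX_DIFF_CHARS = 60_000  # cap named in the design list


def truncate_diff(diff: str, limit: int = MAX_DIFF_CHARS) -> str:
    """Cut a unified diff to at most `limit` chars, ending on a hunk boundary."""
    if len(diff) <= limit:
        return diff
    kept: list[str] = []
    size = 0
    boundary = 0  # count of leading lines in `kept` that end on a complete hunk
    for line in diff.splitlines(keepends=True):
        if line.startswith(("@@", "diff --git")):
            boundary = len(kept)  # everything before this line is complete
        if size + len(line) > limit:
            break
        kept.append(line)
        size += len(line)
    # Drop the partial hunk that was being read when the cap was hit.
    return "".join(kept[:boundary])


def run_review_fail_open(review: Callable[[str], None], diff: str) -> bool:
    """Run `review` on the truncated diff; on any error, log and carry on."""
    try:
        review(truncate_diff(diff))
    except Exception:
        logger.exception("self-review failed; continuing to post-hooks")
        return False
    return True
```

In this sketch a truncated diff can end with a file header that has no hunks after it, and a first hunk larger than the cap leaves only the header. A real implementation may want to handle both cases explicitly.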
Task Type
new_task