jira-agent: refresh token before Phase 3 and increase max-turns across all phases #80447
Phase 3 (address review findings) pushes code to the fork, but its GitHub App token was generated at job start and can expire after 1 hour. When Phases 1-2 run long, Phase 3's push fails with "Invalid username or token". Add a fork token refresh before Phase 3 starts (matching the existing refresh before Phase 4). Also increase Phase 4 max-turns from 15 to 30. When the push encounters pre-push hook failures, 15 turns is not enough to diagnose, fix, and retry before creating the PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
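The expiry problem described in this commit message can be made explicit by tracking when a token was minted. The following is a minimal sketch, assuming the script records the mint time itself. `TOKEN_MAX_AGE`, `TOKEN_MINTED_AT`, and `token_needs_refresh` are hypothetical names that do not appear in the actual script, and the 50-minute threshold is an arbitrary safety margin under the 1-hour lifetime.

```shell
# Hypothetical helper: decide whether a GitHub App token is close to expiry.
# Assumes the caller stores the mint time (epoch seconds) when generating it.

TOKEN_MAX_AGE=$((50 * 60))   # assumed margin: refresh after 50 of the 60 minutes

# token_needs_refresh MINTED_AT [NOW]
# Returns 0 (true) when the token is at least TOKEN_MAX_AGE seconds old.
token_needs_refresh() {
  local minted_at="$1"
  local now="${2:-$(date +%s)}"
  [ $((now - minted_at)) -ge "$TOKEN_MAX_AGE" ]
}

# Example: a token minted 55 minutes ago should be refreshed before pushing.
TOKEN_MINTED_AT=$(( $(date +%s) - 55 * 60 ))
if token_needs_refresh "$TOKEN_MINTED_AT"; then
  echo "token is stale, refresh before Phase 3"
fi
```

The PR itself takes the simpler route of refreshing unconditionally before each pushing phase, which avoids tracking any state at the cost of one extra token request.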
@enxebre: GitHub didn't allow me to request PR reviews from the following users: openshift/hypershift-team. Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.
Walkthrough
This PR updates a bash orchestration script that manages GitHub App authentication and AI-assisted pull request creation for the HyperShift Jira agent. The script now refreshes the fork token before Phase 3 to avoid expiry during code-addressing work, clarifies token regeneration timing in Phase 4 documentation, and substantially increases Claude's conversational turns across all phases: Phase 1 to 300, Phase 2 to 225, Phase 3 to 225, and Phase 4 to 90.
Changes
HyperShift Jira Agent Token Refresh and Claude Configuration
🎯 2 (Simple) | ⏱️ ~10 minutes
Important
Pre-merge checks failed
Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error)
✅ Passed checks (14 passed)
/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh (1)
743-797: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Only mark the Jira issue processed after PR creation actually succeeds.
From Line 743 onward, the script adds `agent-processed`, transitions the issue, increments `PROCESSED_COUNT`, and records `SUCCESS` even when Lines 703-705 left `PR_URL` empty because Phase 4 failed. Since the search JQL excludes `agent-processed`, those failures will not be retried on the next run and the issue can get stuck without a PR. Gate the Jira mutations and success accounting on `PR_URL` being present, with a separate success path for the intentional “no code changes” case.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh` around lines 743 - 797, The script currently always adds the 'agent-processed' label, transitions the issue, sets assignee, increments PROCESSED_COUNT and writes "SUCCESS" to STATE_FILE even when PR creation failed (PR_URL is empty); wrap the entire Jira mutation + success accounting block (the code using LABEL_RESPONSE / transition_issue / set_assignee, PROCESSED_COUNT increment and the echo to STATE_FILE) in a guard if [ -n "$PR_URL" ]; then ... fi so those actions only run when PR_URL is present; add an explicit else branch that handles the intentional "no code changes" case by writing a distinct state (e.g. "NO_CHANGES") to STATE_FILE or skipping Jira mutations but still incrementing PROCESSED_COUNT if appropriate, and ensure you reference the existing symbols LABEL_RESPONSE, transition_issue, set_assignee, PROCESSED_COUNT, PR_URL and STATE_FILE when making the change.
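The guard suggested in this comment could look roughly like the sketch below. It is an illustration under assumptions, not the script's actual code: the real Jira calls are replaced by a hypothetical stub (`mark_issue_processed`), `finalize_issue` and its `no_changes` argument are invented for the example, and the state-file line format is assumed. Only `PR_URL`, `PROCESSED_COUNT`, and `STATE_FILE` come from the review comment.

```shell
# Sketch of gating Jira mutations on a non-empty PR URL.
PROCESSED_COUNT=0
STATE_FILE=$(mktemp)

# Hypothetical stand-in for the label / transition / assignee calls.
mark_issue_processed() {
  echo "would label and transition $1"
}

# finalize_issue ISSUE_KEY PR_URL NO_CHANGES
finalize_issue() {
  local issue_key="$1" pr_url="$2" no_changes="$3"
  if [ -n "$pr_url" ]; then
    # PR exists: safe to mark the issue and count it as a success.
    mark_issue_processed "$issue_key"
    PROCESSED_COUNT=$((PROCESSED_COUNT + 1))
    echo "$issue_key SUCCESS $pr_url" >> "$STATE_FILE"
  elif [ "$no_changes" = "true" ]; then
    # Intentional outcome: nothing to push, recorded as its own state.
    echo "$issue_key NO_CHANGES" >> "$STATE_FILE"
  else
    # Leave the issue unlabeled so the next run's JQL picks it up again.
    echo "$issue_key FAILED" >> "$STATE_FILE"
  fi
}

# Placeholder issue keys and URL, for illustration only.
finalize_issue "EXAMPLE-1" "https://example.invalid/pr/1" false
finalize_issue "EXAMPLE-2" "" false
finalize_issue "EXAMPLE-3" "" true
```

Keeping the failed case free of Jira mutations is what lets the existing JQL filter on `agent-processed` act as the retry mechanism.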
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 28595b10-a32b-49b6-ad45-cb1c24d7657f
📒 Files selected for processing (1)
ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh
# Refresh tokens before Phase 3 since it pushes code.
# Phases 1-2 can exceed the 1-hour GitHub App token lifetime.
echo "Refreshing GitHub App tokens before Phase 3..."
GITHUB_TOKEN_FORK=$(generate_github_token "$INSTALLATION_ID_FORK")
if [ -z "$GITHUB_TOKEN_FORK" ] || [ "$GITHUB_TOKEN_FORK" = "null" ]; then
    echo "ERROR: Failed to refresh GitHub App token for fork"
else
    git config --global credential.helper "!f() { echo username=x-access-token; echo password=${GITHUB_TOKEN_FORK}; }; f"
    echo "Fork token refreshed"
fi
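One detail of this block is worth spelling out: because the helper string is double-quoted, the shell expands `${GITHUB_TOKEN_FORK}` at the moment `git config` runs, so the token value is baked into the stored helper. Updating the variable alone does not change what git will send, which is why the credential helper has to be reconfigured after every refresh. A small sketch with placeholder values shows the effect; no git call is made, and `HELPER` is a hypothetical variable used only for illustration.

```shell
# Placeholder values, not real tokens.
GITHUB_TOKEN_FORK="old-token"

# Same quoting as in the diff: the variable is expanded right here.
HELPER="!f() { echo username=x-access-token; echo password=${GITHUB_TOKEN_FORK}; }; f"

# Refreshing the variable afterwards does not touch the stored string.
GITHUB_TOKEN_FORK="new-token"

echo "$HELPER"
```

The printed helper still carries `password=old-token`, which mirrors what happens when the refresh fails and the `else` branch is skipped: the previously configured credentials stay in effect.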
Don't continue into Phase 3/4 with a failed token refresh.
Lines 544-549 and Lines 621-633 only log refresh failures, then fall through into the push/PR phases with the previous credentials still configured. In the same long-running case this change is addressing, that leaves Phase 3 pushing with an expired fork token and Phase 4 calling gh with an expired upstream token. Retry the refresh or fail the current issue before entering the dependent phase.
Also applies to: 616-633
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh`
around lines 540 - 549, The refresh block for the GitHub App token using
generate_github_token (GITHUB_TOKEN_FORK) only logs failures and continues,
which allows Phase 3/4 to run with expired credentials; update the logic in the
token refresh sections (the GITHUB_TOKEN_FORK and GITHUB_TOKEN_UPSTREAM refresh
blocks around generate_github_token) to either retry token generation a few
times with backoff or immediately fail the script when refresh returns
empty/"null" before entering the push/PR phases—i.e., on failure do a bounded
retry of generate_github_token and if still unsuccessful call exit 1 (or
otherwise abort the run) instead of merely echoing an error so Phase 3/4 never
proceed with stale credentials.
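The bounded retry with backoff that this comment asks for could be sketched as follows. This is an assumption-laden illustration rather than the script's code: `refresh_token_with_retry` is a hypothetical wrapper, `generate_github_token` is replaced by a stub that fails twice and then prints a fake token, and the installation ID `12345` is a placeholder.

```shell
# Hypothetical stub standing in for the script's generate_github_token.
# It prints "null" twice, then a fake token, to exercise the retry path.
# The counter lives in a file because $(...) runs the function in a subshell.
ATTEMPT_FILE=$(mktemp)
echo 0 > "$ATTEMPT_FILE"
generate_github_token() {
  local n
  n=$(( $(cat "$ATTEMPT_FILE") + 1 ))
  echo "$n" > "$ATTEMPT_FILE"
  if [ "$n" -lt 3 ]; then echo "null"; else echo "fake-token-for-$1"; fi
}

# refresh_token_with_retry INSTALLATION_ID [MAX_ATTEMPTS] [BASE_DELAY_SECONDS]
# Prints the token on success; returns 1 after exhausting all attempts.
refresh_token_with_retry() {
  local installation_id="$1" max_attempts="${2:-3}" delay="${3:-5}"
  local attempt=1 token
  while [ "$attempt" -le "$max_attempts" ]; do
    token=$(generate_github_token "$installation_id")
    if [ -n "$token" ] && [ "$token" != "null" ]; then
      echo "$token"
      return 0
    fi
    echo "token refresh attempt $attempt/$max_attempts failed" >&2
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
      delay=$((delay * 2))   # exponential backoff between attempts
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Delay of 0 keeps the demonstration fast.
if GITHUB_TOKEN_FORK=$(refresh_token_with_retry "12345" 3 0); then
  echo "Fork token refreshed"
else
  # The real script would abort or skip the issue here instead of continuing.
  echo "ERROR: Failed to refresh GitHub App token for fork" >&2
fi
```

Returning a non-zero status from the wrapper leaves the decision to the caller, which can either exit the job or fail only the current issue before the dependent phase starts.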
/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent
@enxebre: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent
@enxebre: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent
@enxebre: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent
@enxebre: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
Phase 1 (solve): 100 → 300
Phase 2 (review): 75 → 225
Phase 3 (address review): 75 → 225
Phase 4 (PR creation): 30 → 90
Both rehearsal runs of CNTRLPLANE-3612 hit the 100-turn limit: once while pushing after a successful implementation, once while debugging test failures. The agent completed the implementation in both cases but ran out of turns during the verification/push tail. Tripling gives enough headroom for complex feature tickets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
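The tripling described in this commit message is plain arithmetic on the previous limits. A minimal sketch, assuming the limits were held in variables: `MULTIPLIER`, `scaled_turns`, and the `PHASE*_MAX_TURNS` names are hypothetical and not taken from the script, while the base values are the ones listed in the commit message.

```shell
# Multiplier from the commit message ("3x for all jira-agent phases").
MULTIPLIER=3

# scaled_turns BASE: print BASE multiplied by MULTIPLIER.
scaled_turns() {
  echo $(( $1 * MULTIPLIER ))
}

PHASE1_MAX_TURNS=$(scaled_turns 100)   # solve
PHASE2_MAX_TURNS=$(scaled_turns 75)    # review
PHASE3_MAX_TURNS=$(scaled_turns 75)    # address review
PHASE4_MAX_TURNS=$(scaled_turns 30)    # PR creation

echo "max-turns: $PHASE1_MAX_TURNS $PHASE2_MAX_TURNS $PHASE3_MAX_TURNS $PHASE4_MAX_TURNS"
```

Deriving the limits from one multiplier would make a later adjustment a one-line change, although the PR as described simply edits the four values in place.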
/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent
@enxebre: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
[REHEARSALNOTIFIER]
Interacting with pj-rehearse
Comment: Once you are satisfied with the results of the rehearsals, comment:
♻️ Duplicate comments (1)
ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh (1)
540-549: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
Token refresh failure allows Phase 3 to proceed with expired credentials.
This is the same issue flagged in the previous review. If `generate_github_token` returns empty or "null" at line 544, the script logs an error but continues into Phase 3 without updating the credential helper. Phase 3's push operations (line 568+) will then fail with the expired fork token, exactly the scenario this refresh was meant to prevent.
The same pattern exists at lines 616-633 before Phase 4.
Also applies to: 616-633
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh` around lines 540 - 549, When token generation fails for either the fork refresh (GITHUB_TOKEN_FORK) or the subsequent Phase 4 token refresh, the script logs an error message but continues execution without halting. This allows Phase 3 and Phase 4 to proceed with expired or missing credentials, causing push operations to fail. Fix this by adding an exit statement with a non-zero status code (such as exit 1 or return 1) in the error branch of both token refresh blocks—specifically after the error is logged for the failed token generation. This will halt the script immediately when token refresh fails, preventing Phase 3's push operations and Phase 4's operations from attempting to run with invalid credentials.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 7273110e-67e7-480f-b3fc-32ce8fc7d1dc
📒 Files selected for processing (1)
ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh
@enxebre: all tests passed!
/pj-rehearse ack
@enxebre: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bryan-cox, enxebre
…s all phases (openshift#80447)
* jira-agent: refresh token before Phase 3 and increase Phase 4 max-turns
Phase 3 (address review findings) pushes code to the fork, but its GitHub App token was generated at job start and can expire after 1 hour. When Phases 1-2 run long, Phase 3's push fails with "Invalid username or token". Add a fork token refresh before Phase 3 starts (matching the existing refresh before Phase 4).
Also increase Phase 4 max-turns from 15 to 30. When the push encounters pre-push hook failures, 15 turns is not enough to diagnose, fix, and retry before creating the PR.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* ci: increase max-turns 3x for all jira-agent phases
Phase 1 (solve): 100 → 300
Phase 2 (review): 75 → 225
Phase 3 (address review): 75 → 225
Phase 4 (PR creation): 30 → 90
Both rehearsal runs of CNTRLPLANE-3612 hit the 100-turn limit: once while pushing after a successful implementation, once while debugging test failures. The agent completed the implementation in both cases but ran out of turns during the verification/push tail. Tripling gives enough headroom for complex feature tickets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Refresh the fork token before Phase 3 so that its push no longer fails with `Invalid username or token`.
- Increase `--max-turns` across all phases to give enough headroom for complex tickets and push retries.
Root cause
Observed in periodic-jira-agent run 2065130139884195840:
- `fatal: Authentication failed` (token expired)
Changes
Token refresh before Phase 3
Added a `generate_github_token` call and credential helper update immediately before Phase 3 starts. The existing refresh before Phase 4 is kept, with an updated comment clarifying that it guards against Phase 3 duration.
Max-turns increase
Previous limits were too conservative for complex tickets. Phases would exhaust turns before completing work, especially when pre-push hooks required multiple retry cycles. The new limits (3x for phases 1-3, 6x for phase 4) provide sufficient headroom based on observed runs.
/cc @openshift/hypershift-team
🤖 Generated with Claude Code