
Add CUDN pod churn and CUDN churn periodic tests for 5.0 #80485

Open
mohit-sheth wants to merge 1 commit into openshift:main from mohit-sheth:cudn-churn-tests

Conversation

@mohit-sheth

@mohit-sheth mohit-sheth commented Jun 12, 2026

Member

Summary

Test plan

  • pj-rehearse both new jobs
  • Revert e2e-benchmarking fork URL after upstream kube-burner-ocp PR merges

Summary by CodeRabbit

This PR extends the OpenShift CI perfscale AWS 5.0 nightly periodic suite by adding two new 24-node periodic jobs to validate kube-burner-ocp CUDN churn behavior via the existing openshift-qe-cudn-density test chain.

  • cudn-pod-churn-250-24nodes (cron 0 6 * * *) runs on aws-perfscale-qe with 21 additional worker nodes and sets pod churn parameters: 50% churn, 30m duration, 1m delay (with OVERRIDE_ITERATIONS=250).
  • cudn-churn-250-24nodes (cron 0 8 * * *) runs on the same cluster profile and sets CUDN group churn parameters: 10% churn, 3 cycles, 10m per cycle (also with OVERRIDE_ITERATIONS=250).
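
To make the churn parameters above concrete, here is a minimal arithmetic sketch. It assumes the churn percentage applies to the iteration count set by OVERRIDE_ITERATIONS, which the summary does not state explicitly; `churn_count` and `churn_minutes` are hypothetical helper names, not part of kube-burner-ocp.

```shell
# Hypothetical helper: number of iterations touched by one churn pass.
# Integer division truncates; kube-burner-ocp's own rounding may differ.
churn_count() {
  local iterations="$1" percent="$2"
  echo $(( iterations * percent / 100 ))
}

# Hypothetical helper: total churn time in minutes for cycle-based churn.
churn_minutes() {
  local cycles="$1" minutes_per_cycle="$2"
  echo $(( cycles * minutes_per_cycle ))
}

# cudn-pod-churn-250-24nodes: 50% churn over 250 iterations.
echo "pod churn:  $(churn_count 250 50) of 250 iterations per pass"
# cudn-churn-250-24nodes: 10% churn, 3 cycles of 10 minutes each.
echo "CUDN churn: $(churn_count 250 10) of 250 iterations per cycle, $(churn_minutes 3 10) minutes of churn"
```

Under that assumption, the pod churn job would touch about 125 iterations per pass, and the CUDN churn job about 25 per cycle over roughly 30 minutes of churn.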

Both jobs use the same compute/infra instance type configuration (m6a.2xlarge for compute/control-plane and r5.2xlarge for infra) and execute the CUDN density chain with the usual pre/post step wiring (openshift-qe-workers-scale and ipi-aws-pre plus ingress/monitoring registry creation, then ipi-aws-post).

To enable validation while dependent upstream changes are in-flight, the CUDN density step script temporarily points e2e-benchmarking to a maintained fork: it now clones https://github.com/mohit-sheth/e2e-benchmarking and checks out the use-fork-kube-burner-ocp branch (instead of the prior upstream repo/tag logic). The fork reference is intended to be reverted back to the upstream repository after the corresponding kube-burner-ocp PR is merged.
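
As a rough illustration of how the two values in the step script could feed a clone, the sketch below prints the command rather than running it, so it needs no network access. `REPO_URL` and `TAG_OPTION` come from the PR; the `build_clone_cmd` helper and the `--depth 1` shallow clone are assumptions, not confirmed details of the actual script.

```shell
# Values taken from the PR's change to openshift-qe-cudn-density-commands.sh.
REPO_URL="https://github.com/mohit-sheth/e2e-benchmarking"
TAG_OPTION="--branch use-fork-kube-burner-ocp"

# Hypothetical helper: prints the clone command instead of executing it.
build_clone_cmd() {
  local repo_url="$1" tag_option="$2"
  # --depth 1 is an assumption (shallow clone); drop it if full history is needed.
  printf 'git clone %s %s --depth 1\n' "$repo_url" "$tag_option"
}

build_clone_cmd "$REPO_URL" "$TAG_OPTION"
```

Keeping the repository and the branch in two separate variables means the later revert to upstream only has to touch those two assignments.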

The PR includes (and the author has been exercising) pj-rehearse runs for both new periodic jobs to confirm the periodic job definitions execute correctly in the rehearse environment.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 41465e18-63b3-4cfc-a478-82f126fe0e20

📥 Commits

Reviewing files that changed from the base of the PR and between f3a4720 and db182a0.

📒 Files selected for processing (1)
  • ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__aws-5.0-nightly-x86.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__aws-5.0-nightly-x86.yaml

Walkthrough

This PR adds two new CUDN churn performance test jobs to the OpenShift perfscale CI suite, updating the benchmarking script to use a forked e2e-benchmarking repository pinned to a specific branch, and configuring the new test workloads to run pod and CUDN churn scenarios on 24-node AWS clusters with 250-iteration overrides.

Changes

CUDN Churn Testing Jobs

| Layer | File(s) | Summary |
|---|---|---|
| Script repository fork and branch pinning | ci-operator/step-registry/openshift-qe/cudn-density/openshift-qe-cudn-density-commands.sh | The script now clones e2e-benchmarking from https://github.com/mohit-sheth/e2e-benchmarking and pins the build to the use-fork-kube-burner-ocp branch, replacing the previous dynamic tag-selection logic. |
| Pod and CUDN churn test job definitions | ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__aws-5.0-nightly-x86.yaml | Adds cudn-pod-churn-250-24nodes job scheduled at 06:00 UTC with pod churn parameters (churn-duration=30m, churn-percent=50, churn-delay=1m); adds cudn-churn-250-24nodes job scheduled at 08:00 UTC with CUDN churn parameters (churn-target=cudns, churn-percent=10, churn-cycles=3, churn-duration=10m). Both run 250 iterations on 24-node aws-perfscale-qe clusters with m6a.2xlarge compute and r5.2xlarge infra nodes. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Suggested labels

rehearsals-ack

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically describes the main change: adding two new CUDN churn periodic tests for the 5.0 release, which aligns with the core objective of the pull request. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR adds CI job configs and shell wrapper scripts only; no Ginkgo test definitions present. Check is not applicable to CI configuration files. |
| Test Structure And Quality | ✅ Passed | PR adds CI/CD configurations and shell scripts, not Ginkgo tests. Check requiring Ginkgo test code review is not applicable. |
| Microshift Test Compatibility | ✅ Passed | This PR does not add any new Ginkgo e2e tests. It only adds CI configuration YAML files and shell script orchestration for existing benchmarking tests, making the check inapplicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR does not add new Ginkgo e2e tests; only adds CI job configurations and modifies shell scripts. The SNO compatibility check is not applicable to this change. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds CI test job configurations and test scripts only. Contains no deployment manifests, operator code, or controllers with scheduling constraints that could impact non-standard topologies. |
| Ote Binary Stdout Contract | ✅ Passed | PR only modifies CI job configuration YAML and a shell benchmarking script. No OTE binaries, Go test code, or stdout JSON contracts are present or modified. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | This PR adds CI job configurations and modifies a shell script, not Ginkgo e2e tests. No new It(), Describe(), Context(), or When() test blocks are introduced. The check only applies to new Ginkgo... |
| No-Weak-Crypto | ✅ Passed | PR adds CI/CD test configuration for CUDN churn tests. Examined both modified files (YAML config and bash script) - no weak cryptography (MD5/SHA1/DES/RC4/3DES/Blowfish/ECB), custom crypto, or unsa... |
| Container-Privileges | ✅ Passed | The PR adds test job configurations and shell script changes without any privileged container settings, hostPID, hostNetwork, hostIPC, SYS_ADMIN capabilities, allowPrivilegeEscalation, or running a... |
| No-Sensitive-Data-In-Logs | ✅ Passed | PR does not introduce new sensitive data logging. The existing credential exposure pattern (set -x with ES_PASSWORD/ES_USERNAME) predates this change and is already in the codebase. |


@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mohit-sheth
Once this PR has been reviewed and has the lgtm label, please assign jtaleric for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from krishvoor and rpattath June 12, 2026 22:01
@mohit-sheth

Member Author

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-pod-churn-250-24nodes
/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes

@openshift-merge-bot

Contributor

@mohit-sheth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot

Contributor

@mohit-sheth: requesting more than one rehearsal in one comment is not supported. If you would like to rehearse multiple specific jobs, please separate the job names by a space in a single command.
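
The bot's message implies that several rehearsals have to go into one `/pj-rehearse` line, with job names separated by spaces and no repeated command word. A small sketch of building such a comment follows; `rehearse_comment` is a hypothetical helper used only for illustration.

```shell
# Hypothetical helper: joins job names into a single /pj-rehearse comment line.
rehearse_comment() {
  # With IFS set to a space, "$*" joins all arguments with single spaces.
  local IFS=' '
  printf '/pj-rehearse %s\n' "$*"
}

rehearse_comment \
  periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-pod-churn-250-24nodes \
  periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes
```

The printed line can be pasted as a single PR comment; only the leading `/pj-rehearse` token is a command, and everything after it is treated as a job name.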

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/openshift-qe/cudn-density/openshift-qe-cudn-density-commands.sh`:
- Around line 38-39: Hardcoded temporary fork settings (REPO_URL and TAG_OPTION)
lack a revert tracker; update the file to add a TODO comment adjacent to the
REPO_URL and TAG_OPTION declarations that includes the upstream PR/issue URL (or
CI ticket) and a short "revert when merged" note, or reference a scheduled
task/milestone to perform the revert—mention the exact upstream PR number/link
and the intended revert action so future maintainers can find and remove the
fork override.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 532e70f5-63f4-4e29-97ac-bb105a1eb576

📥 Commits

Reviewing files that changed from the base of the PR and between 85bbd08 and cc8a5ad.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main-periodics.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (2)
  • ci-operator/config/openshift-eng/ocp-qe-perfscale-ci/openshift-eng-ocp-qe-perfscale-ci-main__aws-5.0-nightly-x86.yaml
  • ci-operator/step-registry/openshift-qe/cudn-density/openshift-qe-cudn-density-commands.sh

Comment on lines +38 to +39
REPO_URL="https://github.com/mohit-sheth/e2e-benchmarking";
TAG_OPTION="--branch use-fork-kube-burner-ocp";

Contributor


📐 Maintainability & Code Quality | 🟠 Major | ⚖️ Poor tradeoff

Temporary fork URL lacks tracking mechanism for revert.

Lines 38-39 hardcode the fork URL and branch as explicitly temporary (per PR description: "Revert e2e-benchmarking fork URL after upstream kube-burner-ocp PR merges"). Without a TODO comment, issue reference, or other tracking, this revert could be forgotten, leaving the repository diverged from upstream.

Recommendation: Add a TODO comment with the upstream PR reference or GitHub issue link to track the revert, or ensure the revert is tied to a scheduled task/milestone.

📝 Suggested fix to add tracking
+# TODO: Revert to cloud-bulldozer after upstream kube-burner-ocp PR merges
+# See: https://github.com/cloud-bulldozer/kube-burner-ocp/pull/458
 REPO_URL="https://github.com/mohit-sheth/e2e-benchmarking";
 TAG_OPTION="--branch use-fork-kube-burner-ocp";
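
One way to keep such a temporary override from being forgotten is a small lint that fails when a step script points `REPO_URL` outside the upstream organization without any TODO comment. The sketch below is illustrative only: `check_fork_tracked` is a hypothetical function, and treating `cloud-bulldozer` as the upstream organization is an assumption taken from the wording of the suggested TODO.

```shell
# Assumption: upstream organization name, taken from the suggested TODO text.
UPSTREAM_ORG="cloud-bulldozer"

# Hypothetical lint: flags a REPO_URL outside the upstream org
# unless the script also carries a TODO comment for the revert.
check_fork_tracked() {
  local script="$1"
  # grep -v -q succeeds only if some REPO_URL line does NOT match upstream.
  if ! grep -E '^[[:space:]]*REPO_URL=' "$script" \
      | grep -qv "github.com/${UPSTREAM_ORG}/"; then
    echo "ok: upstream repo"
    return 0
  fi
  if grep -qE '^[[:space:]]*#[[:space:]]*TODO' "$script"; then
    echo "ok: fork override has a TODO"
    return 0
  fi
  echo "error: fork override without a TODO"
  return 1
}
```

Calling `check_fork_tracked path/to/commands.sh` returns non-zero only for an untracked fork override, so it could run as a pre-merge check. It looks for any TODO in the file rather than one adjacent to the assignment, which is a deliberate simplification.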

@mohit-sheth

Member Author

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes

@openshift-merge-bot

Contributor

@mohit-sheth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@mohit-sheth

Member Author

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-pod-churn-250-24nodes pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes

@openshift-merge-bot

Contributor

@mohit-sheth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot

Contributor

@mohit-sheth: job(s): pj-rehearse either don't exist or were not found to be affected, and cannot be rehearsed

@mohit-sheth

Member Author

/pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-pod-churn-250-24nodes pj-rehearse periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes

@openshift-merge-bot

Contributor

@mohit-sheth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot

Contributor

@mohit-sheth, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

couldn't prepare candidate: couldn't rebase candidate onto 3736154f97498eca8df47621e296678262d67b5f due to conflicts
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@openshift-merge-bot

Contributor

@mohit-sheth, pj-rehearse: unable prepare a candidate for rehearsal; rehearsals will not be run. This could be due to a branch that needs to be rebased. ERROR:

couldn't rebase candidate onto 3736154f97498eca8df47621e296678262d67b5f due to conflicts

Add two new 24-node periodic tests to validate kube-burner-ocp
CUDN churn support (kube-burner/kube-burner-ocp#458):
- cudn-pod-churn-250-24nodes: 50% pod churn with 30m duration
- cudn-churn-250-24nodes: 10% CUDN group churn with 3 cycles

Temporarily points e2e-benchmarking to mohit-sheth fork with
use-fork-kube-burner-ocp branch for pj-rehearse validation.

Signed-off-by: Mohit Sheth <msheth@redhat.com>
@openshift-merge-bot

Contributor

[REHEARSALNOTIFIER]
@mohit-sheth: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
|---|---|---|---|
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-4.21-nightly-node-density-heavy-ibmcloud-24nodes | N/A | periodic | Periodic changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-churn-250-24nodes | N/A | periodic | Periodic changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-pod-churn-250-24nodes | N/A | periodic | Periodic changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.22-nightly-x86-cudn-density-multi-ns-500-24nodes | N/A | periodic | Registry content changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-4.22-nightly-node-density-heavy-ibmcloud-24nodes | N/A | periodic | Periodic changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-density-multi-ns-500-24nodes | N/A | periodic | Registry content changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-5.0-nightly-x86-cudn-density-single-ns-250-24nodes | N/A | periodic | Registry content changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-4.21-nightly-control-plane-ibmcloud-24nodes | N/A | periodic | Periodic changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-4.22-nightly-control-plane-ibmcloud-24nodes | N/A | periodic | Periodic changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.23-nightly-x86-cudn-density-multi-ns-500-24nodes | N/A | periodic | Registry content changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.23-nightly-x86-cudn-density-single-ns-250-24nodes | N/A | periodic | Registry content changed |
| periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.22-nightly-x86-cudn-density-single-ns-250-24nodes | N/A | periodic | Registry content changed |

@openshift-ci

openshift-ci Bot commented Jun 22, 2026

Contributor

@mohit-sheth: all tests passed!

