RHINENG-27056: move db-migration to a separate job by TenSt · Pull Request #2236 · RedHatInsights/patchman-engine

TenSt · 2026-06-18T13:31:38Z

Context

The manager ran db-migration as an init container. With multiple replicas and rolling updates, several new pods could start the migration at once and compete for pg_advisory_lock(123). ClowdApp cannot set maxSurge on manager (public web service enabled), so a dedicated Job is used to avoid per-replica migration. Troubleshooting is also harder when two pods attempt the migration and it is unclear which one actually performs it.

This PR:

Moves DB migration from manager init container to a single ClowdApp Job to avoid concurrent migrators during rollout
Replaces manager db-migration init with check-for-db (same pattern as listener/evaluator)
Adds db-migration Job
Updates entrypoint.sh to retry migrate on failure

Summary of the new flow:

New deploy → migration Job and new pods start in parallel → new pods block in init until DB schema is upgraded → success = rollout completes, failure = new pods fail init and old pods keep serving.
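The "new pods block in init" step of this flow can be pictured as a polling loop with a deadline. The sketch below is an illustration only, not the contents of check-upgraded.sh (which this page does not show). `wait_for` and its arguments are hypothetical names, and the check command is whatever the caller supplies to decide that the schema has been upgraded.

```shell
#!/bin/bash

# wait_for <timeout_seconds> <interval_seconds> <check command...>
# Hypothetical helper: polls the check command until it succeeds
# or until the timeout has elapsed, whichever comes first.
wait_for() {
    local timeout=$1 interval=$2
    shift 2
    # SECONDS is a bash built-in counter of seconds since shell start.
    local deadline=$((SECONDS + timeout))
    until "$@"; do
        if ((SECONDS >= deadline)); then
            echo "timed out after ${timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep "$interval"
    done
}
```

A call such as `wait_for 600 5 ./some-schema-check` would poll every 5 seconds for up to 10 minutes, where `./some-schema-check` is a placeholder. A non-zero return is what would make the init container fail, leaving the old pods serving.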

Summary by Sourcery

Move database migration from the manager init container to a dedicated ClowdApp Job and make migrations more resilient to failures.

New Features:

Introduce a dedicated db-migration Job to run schema migrations independently of manager pods.

Enhancements:

Replace the manager db-migration init container with a lightweight check-for-db init container that only waits for the upgraded schema.
Add configurable migration timeout and retry logic for the migration entrypoint to improve robustness during rollouts.

Deployment:

Configure a MIGRATION_TIMEOUT parameter to control the db-migration Job execution time.

sourcery-ai · 2026-06-18T13:31:45Z

Reviewer's Guide

Moves database migrations from the manager pod’s init container into a dedicated ClowdApp Job with retry logic, and converts the manager init container into a lightweight DB-schema readiness check so that new pods block until migrations complete successfully.

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Replace manager init-container migration with a DB readiness check script so manager pods only start once the schema is upgraded. | Rename the manager init container from db-migration to check-for-db and point its command to ./database_admin/check-upgraded.sh instead of entrypoint.sh; remove database admin–specific environment variables from the manager init container, leaving only POD_CONFIG | `deploy/clowdapp.yaml` |
| Introduce a dedicated one-shot db-migration Job to run schema migrations centrally during deployments. | Add a ClowdApp Job named db-migration with completions=1 and parallelism=1 using the same database_admin image and entrypoint as before; configure the migration Job with MIGRATION_MAX_RETRIES, database-related environment variables, and resource requests/limits appropriate for database admin work; add a MIGRATION_TIMEOUT ClowdApp parameter and wire it to the Job’s activeDeadlineSeconds | `deploy/clowdapp.yaml` |
| Add retry behavior to the migration entrypoint to make migrations more robust and controllable from the Job. | Stop using `set -e` so the script can implement custom retry logic while still failing the Job on repeated errors; introduce MIGRATION_MAX_RETRIES (default 1) and loop over migration attempts with a short sleep between failures; exit successfully on the first successful migration run and exit with failure after the last failed attempt | `database_admin/entrypoint.sh` |
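The retry behavior described for `database_admin/entrypoint.sh` could look roughly like the sketch below. It is not the actual script from this PR. `migrate_with_retries` and `RETRY_SLEEP_SECONDS` are hypothetical names. Only `MIGRATION_MAX_RETRIES` (default 1) and the 5-second pause mentioned in the review come from this page, and skipping the pause after the final attempt is an assumption.

```shell
#!/bin/bash
set -o pipefail

# migrate_with_retries <migration command...>
# Hypothetical wrapper: runs the given command up to MIGRATION_MAX_RETRIES
# times (default 1), pausing RETRY_SLEEP_SECONDS (assumed name, default 5)
# between failed attempts.
migrate_with_retries() {
    local max_retries=${MIGRATION_MAX_RETRIES:-1}
    local sleep_seconds=${RETRY_SLEEP_SECONDS:-5}
    local attempt
    for ((attempt = 1; attempt <= max_retries; attempt++)); do
        if "$@"; then
            echo "migration succeeded on attempt ${attempt}/${max_retries}"
            return 0
        fi
        echo "migration attempt ${attempt}/${max_retries} failed" >&2
        # Assumption: no pause after the last attempt, fail right away.
        if ((attempt < max_retries)); then
            sleep "$sleep_seconds"
        fi
    done
    return 1
}
```

The script would then end with something like `migrate_with_retries <migrate command> || exit 1`, so the Job fails only after the last attempt has failed.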

sourcery-ai

Hey - I've found 2 issues, and left some high level feedback:

Consider making MIGRATION_MAX_RETRIES configurable via a ClowdApp parameter (similar to MIGRATION_TIMEOUT) instead of hardcoding '3' in the Job spec so the retry behavior can be tuned per environment without code changes.
It may be worth double-checking that MIGRATION_TIMEOUT is aligned with MIGRATION_MAX_RETRIES and the 5s sleep in entrypoint.sh so the Job’s activeDeadlineSeconds cannot expire mid-retry loop in typical failure scenarios.
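The alignment check in the second point comes down to simple arithmetic. The sketch below is a hypothetical helper, not part of the PR: `worst_case_seconds` and `fits_deadline` are made-up names, the per-attempt duration is an estimate the operator has to supply, and the formula assumes the script sleeps only between attempts (add one more pause if it also sleeps after the last failure).

```shell
#!/bin/bash

# worst_case_seconds <max_retries> <per_attempt_seconds> <sleep_seconds>
# Upper bound on total runtime when every attempt fails:
# retries * per_attempt + (retries - 1) * sleep.
worst_case_seconds() {
    local retries=$1 per_attempt=$2 pause=$3
    echo $((retries * per_attempt + (retries - 1) * pause))
}

# fits_deadline <timeout_seconds> <max_retries> <per_attempt_seconds> <sleep_seconds>
# Succeeds when the worst case fits within the Job deadline.
fits_deadline() {
    local timeout=$1
    shift
    (( $(worst_case_seconds "$@") <= timeout ))
}
```

With 3 retries, an assumed 120 seconds per attempt, and the 5-second sleep, the worst case is 3 × 120 + 2 × 5 = 370 seconds, so a `MIGRATION_TIMEOUT` of 300 would cut off the retry loop while 600 would not.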

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="deploy/clowdapp.yaml" line_range="313" />
<code_context>
+        command:
+          - ./database_admin/entrypoint.sh
+        env:
+        - {name: MIGRATION_MAX_RETRIES, value: '3'}
+        - {name: LOG_LEVEL, value: '${LOG_LEVEL_DATABASE_ADMIN}'}
+        - {name: DB_DEBUG, value: '${DB_DEBUG_DATABASE_ADMIN}'}
</code_context>
<issue_to_address>
**suggestion:** Consider making MIGRATION_MAX_RETRIES configurable via a template parameter instead of hardcoding '3'.

Align this with how `MIGRATION_TIMEOUT` and other resource values are configured, so the retry policy can be tuned per environment without editing the manifest directly.

Suggested implementation:

```
      podSpec:
        image: ${IMAGE}:${IMAGE_TAG}
        command:
          - ./database_admin/entrypoint.sh
        env:
        - name: MIGRATION_MAX_RETRIES
          value: ${{MIGRATION_MAX_RETRIES}}

```

To fully implement the suggestion, you will also need to:
1. Declare a `MIGRATION_MAX_RETRIES` template parameter in this ClowdApp (or the surrounding template/Helm chart), following the same pattern used for `MIGRATION_TIMEOUT` (e.g., add it to the `parameters`/`envTemplate` section with a default value of `3`).
2. Ensure any environment-specific overlays (e.g., dev/stage/prod) can override `MIGRATION_MAX_RETRIES` so it can be tuned per environment without editing this manifest directly.
</issue_to_address>

### Comment 2
<location path="database_admin/entrypoint.sh" line_range="3" />
<code_context>
 #!/bin/bash

-set -e -o pipefail # stop on error
+set -o pipefail

 MIGRATION_FILES=file://./database_admin/migrations
</code_context>
<issue_to_address>
**issue (bug_risk):** Dropping `set -e` may mask failures in future commands before the migrate loop.

The retry loop correctly handles `migrate` failures, but without `set -e` any other command before or between attempts can fail silently and the script will keep running. To preserve fail-fast behavior while still allowing retries, consider restoring `set -e` and isolating the retry logic around `migrate` (for example, by wrapping `migrate` in a function whose exit code you control).
</issue_to_address>

sourcery-ai · 2026-06-18T13:32:49Z

+        command:
+          - ./database_admin/entrypoint.sh
+        env:
+        - {name: MIGRATION_MAX_RETRIES, value: '3'}


suggestion: Consider making MIGRATION_MAX_RETRIES configurable via a template parameter instead of hardcoding '3'.

Align this with how MIGRATION_TIMEOUT and other resource values are configured, so the retry policy can be tuned per environment without editing the manifest directly.

Suggested implementation:

      podSpec:
        image: ${IMAGE}:${IMAGE_TAG}
        command:
          - ./database_admin/entrypoint.sh
        env:
        - name: MIGRATION_MAX_RETRIES
          value: ${{MIGRATION_MAX_RETRIES}}

To fully implement the suggestion, you will also need to:

1. Declare a MIGRATION_MAX_RETRIES template parameter in this ClowdApp (or the surrounding template/Helm chart), following the same pattern used for MIGRATION_TIMEOUT (e.g., add it to the parameters/envTemplate section with a default value of 3).

2. Ensure any environment-specific overlays (e.g., dev/stage/prod) can override MIGRATION_MAX_RETRIES so it can be tuned per environment without editing this manifest directly.

sourcery-ai · 2026-06-18T13:32:49Z

@@ -1,8 +1,20 @@
 #!/bin/bash

-set -e -o pipefail # stop on error


issue (bug_risk): Dropping set -e may mask failures in future commands before the migrate loop.

The retry loop correctly handles migrate failures, but without set -e any other command before or between attempts can fail silently and the script will keep running. To preserve fail-fast behavior while still allowing retries, consider restoring set -e and isolating the retry logic around migrate (for example, by wrapping migrate in a function whose exit code you control).
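One way to follow this suggestion is sketched below, as an illustration rather than the change actually made in the PR. It keeps `set -e` and relies on the bash rule that a command on the left of `||` does not trigger errexit. `run_capturing_status`, `retry`, and `LAST_STATUS` are hypothetical names, and the sleep between attempts is omitted for brevity.

```shell
#!/bin/bash
set -e -o pipefail

# run_capturing_status <command...>
# Runs the command and records its exit code in LAST_STATUS.
# The "||" keeps a failure from aborting the script under set -e.
run_capturing_status() {
    LAST_STATUS=0
    "$@" || LAST_STATUS=$?
}

# retry <max_attempts> <command...>
# Returns 0 on the first success, otherwise the last failing exit code.
retry() {
    local max=$1
    shift
    local attempt
    LAST_STATUS=1
    for ((attempt = 1; attempt <= max; attempt++)); do
        run_capturing_status "$@"
        if ((LAST_STATUS == 0)); then
            return 0
        fi
        echo "attempt ${attempt}/${max} failed with status ${LAST_STATUS}" >&2
    done
    return "$LAST_STATUS"
}
```

Any other command in the script still fails fast, and a bare `retry 3 <migrate command>` aborts the script once the last attempt fails. One caveat: if the wrapped command is itself a shell function, errexit is ignored inside it while it runs on the left of `||`, so it should return its own status explicitly.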

codecov-commenter · 2026-06-18T13:37:18Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.04%. Comparing base (704b877) to head (9c55f7c).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2236      +/-   ##
==========================================
- Coverage   59.06%   59.04%   -0.03%     
==========================================
  Files         138      138              
  Lines        8848     8848              
==========================================
- Hits         5226     5224       -2     
- Misses       3076     3078       +2     
  Partials      546      546

Flag	Coverage Δ
unittests	`59.04% <ø> (-0.03%)`	⬇️


RHINENG-27056: move db-migration to a separate job

9c55f7c

TenSt requested a review from a team as a code owner June 18, 2026 13:31

sourcery-ai Bot reviewed Jun 18, 2026

TenSt wants to merge 1 commit into RedHatInsights:master from TenSt:stepan/RHINENG-27056-move-db-migration-to-a-job
