Skip to content

CH-262 / CH-263 / CH-264 - Tilt and some other minor changes#847

Open
ddelpiano wants to merge 15 commits intodevelopfrom
feature/ifns-29
Open

CH-262 / CH-263 / CH-264 - Tilt and some other minor changes#847
ddelpiano wants to merge 15 commits intodevelopfrom
feature/ifns-29

Conversation

@ddelpiano
Copy link
Copy Markdown
Member

@ddelpiano ddelpiano commented Apr 16, 2026

Implemented solution

  • tilt implementation ported from Zoran's branch, the other branch was in a messy state due to a forced push
  • some changes required for mongo and kc due to the new version used
  • cluster ip detection logic updated for local development

How to test this PR

harness-deployment and check your application will deploy fine, you can now use 'tilt up' to bring all services up and monitor them from the tilt gui.

Sanity checks:

  • The pull request is explicitly linked to the relevant issue(s)
  • The issue is well described: clearly states the problem and the general proposed solution(s)
  • In this PR it is explicitly stated how to test the current change
  • The labels in the issue set the scope and the type of issue (bug, feature, etc.)
  • The relevant components are indicated in the issue (if any)
  • All the automated test checks are passing
  • All the linked issues are included in one Sprint
  • All the linked issues are in the Review state
  • All the linked issues are assigned

Breaking changes (select one):

  • The present changes do not change the preexisting api in any way
  • This PR and the issue are tagged as a breaking-change and the migration procedure is well described above

Possible deployment updates issues (select one):

  • There is no reason why deployments based on CloudHarness may break after the current update
  • This PR and the issue are tagged as alert:deployment

Test coverage (select one):

  • Tests for the relevant cases are included in this pr
  • The changes included in this pr are out of the current test coverage scope

Documentation (select one):

  • The documentation has been updated to match the current changes
  • The changes included in this PR are out of the current documentation scope

Nice to have (if relevant):

  • Screenshots of the changes
  • Explanatory video/animated gif

'before': [git_clone_hook(conf, context_path) for conf in helm_values.apps[app_key].harness.dependencies.git]
}

images = set()
tilt_template = env.get_template(template_name)
apps = helm_values.apps
artifacts = {}
overrides = {}
Comment thread tools/deployment-cli-tools/ch_cli_tools/tilt.py Fixed
@@ -0,0 +1,216 @@
import os
import logging
@@ -0,0 +1,216 @@
import os
import logging
import json
import os
import logging
import json
import time
import json
import time

from os.path import join, relpath, basename, exists, abspath

from os.path import join, relpath, basename, exists, abspath
from jinja2 import Environment, PackageLoader, select_autoescape
from cloudharness_model import ApplicationTestConfig, HarnessMainConfig, GitDependencyConfig
Comment on lines +10 to +11
from cloudharness_utils.constants import APPS_PATH, DEPLOYMENT_CONFIGURATION_PATH, \
BASE_IMAGES_PATH, STATIC_IMAGES_PATH, HELM_ENGINE, COMPOSE_ENGINE

from cloudharness_utils.constants import APPS_PATH, DEPLOYMENT_CONFIGURATION_PATH, \
BASE_IMAGES_PATH, STATIC_IMAGES_PATH, HELM_ENGINE, COMPOSE_ENGINE
from .helm import KEY_APPS, KEY_HARNESS, KEY_DEPLOYMENT, KEY_TASK_IMAGES
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Tilt-based local dev deployment support and updates supporting deployment/runtime logic for newer Keycloak/Mongo and improved local cluster IP detection.

Changes:

  • Generate a Tiltfile during harness-deployment and add Tilt deployment extension/template support.
  • Update local cluster IP detection logic and propagate local=True in Helm/config generator paths.
  • Adjust Keycloak/Django event sync behavior and Mongo readiness/liveness probes for newer versions.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
tools/deployment-cli-tools/requirements.txt Adds jinja2 dependency for Tiltfile generation.
tools/deployment-cli-tools/harness-deployment Invokes Tiltfile generation as part of deployment generation.
tools/deployment-cli-tools/ch_cli_tools/utils.py Changes cluster IP detection logic and adds local-dev heuristics.
tools/deployment-cli-tools/ch_cli_tools/tilt.py New Tiltfile generator using Jinja templates.
tools/deployment-cli-tools/ch_cli_tools/templates/tilt/tilt-template.tpl Tiltfile template for builds, deploy, and UI helpers.
tools/deployment-cli-tools/ch_cli_tools/helm.py Uses updated cluster IP logic for local deployments.
tools/deployment-cli-tools/ch_cli_tools/configurationgenerator.py Uses updated cluster IP logic for hosts info output.
libraries/cloudharness-common/cloudharness/utils/env.py Updates events service DNS name casing.
libraries/cloudharness-common/cloudharness/auth/keycloak.py Ensures first/last name are preserved when updating user attributes.
infrastructure/common-images/cloudharness-django/.../services/events.py Adjusts listener init behavior and event handling timing/group id.
infrastructure/common-images/cloudharness-django/.../apps.py Starts event listener from Django AppConfig ready().
deployment-configuration/tilt-deploy.ext New Tilt extension to helm-render and patch deployments for dev workflows.
deployment-configuration/helm/templates/auto-database-mongo.yaml Uses mongosh for Mongo probes.
applications/accounts/deploy/templates/_components.tpl Updates Keycloak user profile config (adds stripe_uid attribute).

})

if not watch:
# don't watch mnp folder
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "mnp" should be "npm".

Suggested change
# don't watch mnp folder
# don't watch npm folder

Copilot uses AI. Check for mistakes.
Comment on lines 29 to 33
if resource in ["CLIENT_ROLE_MAPPING", "GROUP", "USER", "GROUP_MEMBERSHIP", "ORGANIZATION_MEMBERSHIP"]:
try:
time.sleep(1) # wait a bit to make sure the transaction is committed in Keycloak before trying to fetch the updated data
init_services()
user_service = get_user_service()
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time.sleep(1) is executed for every handled Keycloak event, adding a fixed 1s latency and reducing throughput of the consumer thread. Consider replacing this with a bounded retry/backoff when the subsequent get_user/get_group fetch fails due to eventual consistency, so the common case doesn’t always pay the delay.

Copilot uses AI. Check for mistakes.
return
except Exception as e:
log.error(e)
raise e
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Python, raise e resets the traceback context. Use a bare raise to preserve the original stack trace (and consider log.exception(...) if you want the traceback in logs).

Suggested change
raise e
raise

Copilot uses AI. Check for mistakes.
Comment on lines +8 to +33
def ready(self):
# imports
import sys
for skip_cmd in [
"--help",
"collectstatic",
"compilemessages",
"compress",
"dbshell",
"dumpdata",
"loaddata",
"makemessages",
"makemigrations",
"migrate",
"reset_db",
"showmigrations",
"sqlmigrate",
"squashmigrations",
"test",
]:
# for these commands we skip initializing the event listener
if skip_cmd in sys.argv:
return

from cloudharness_django.services.events import init_listener_in_background
init_listener_in_background()
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting the Kafka listener in AppConfig.ready() can run multiple times in common Django setups (e.g. runserver autoreloader spawns a second process; multi-worker gunicorn/uWSGI will start one per worker). This can lead to duplicate consumers/extra connections or repeated retries. Add guards (e.g. check os.environ.get("RUN_MAIN") for runserver, or require an explicit env flag) to ensure it starts exactly once per intended process.

Copilot uses AI. Check for mistakes.
Comment on lines +19 to +22
env = Environment(
loader=PackageLoader(package_name="ch_cli_tools", package_path="templates/tilt"),
autoescape=select_autoescape()
)
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PackageLoader("ch_cli_tools", "templates/tilt") requires the templates/tilt files to be included in the installed package. Currently tools/deployment-cli-tools/setup.py has no package_data, and MANIFEST.in doesn’t include ch_cli_tools/templates/**, so the generated distribution is likely to miss tilt-template.tpl and fail at runtime. Either switch to loading the template from the filesystem (consistent with other generators) or update packaging to include these template files.

Copilot uses AI. Check for mistakes.
Comment on lines 215 to 219
try:
values['localIp'] = get_cluster_ip()
values['localIp'] = get_cluster_ip(local=True)
except subprocess.TimeoutExpired:
logging.warning("Minikube not available")
except:
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_cluster_ip(local=True) now swallows timeouts/errors internally and falls back to get_host_address(), so this except subprocess.TimeoutExpired branch is effectively dead code. Either let TimeoutExpired propagate from get_cluster_ip (so callers can warn appropriately) or simplify this try/except to match the new behavior.

Copilot uses AI. Check for mistakes.
parent_app_name=parent_app_name)
elif app_name in helm_values[KEY_TASK_IMAGES]:
process_build_dockerfile(dockerfile_path, root_path,
requirements=guess_build_dependencies_from_dockerfile(dockerfile_path), dockerfile_path=dockerfile_path, app_name=app_name)
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call passes dockerfile_path both positionally and as a keyword argument, which will raise TypeError: got multiple values for argument 'dockerfile_path' at runtime. Remove the duplicate keyword argument (or rename the keyword if you intended a different parameter).

Suggested change
requirements=guess_build_dependencies_from_dockerfile(dockerfile_path), dockerfile_path=dockerfile_path, app_name=app_name)
requirements=guess_build_dependencies_from_dockerfile(dockerfile_path), app_name=app_name)

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +11
import os
import logging
import json
import time

from os.path import join, relpath, basename, exists, abspath
from jinja2 import Environment, PackageLoader, select_autoescape
from cloudharness_model import ApplicationTestConfig, HarnessMainConfig, GitDependencyConfig

from cloudharness_utils.constants import APPS_PATH, DEPLOYMENT_CONFIGURATION_PATH, \
BASE_IMAGES_PATH, STATIC_IMAGES_PATH, HELM_ENGINE, COMPOSE_ENGINE
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many unused imports in this new module (e.g. logging, json, time, exists, abspath, ApplicationTestConfig, DEPLOYMENT_CONFIGURATION_PATH, HELM_ENGINE, COMPOSE_ENGINE, plus several .utils imports). This will fail linting/type-checking in many setups and makes the file harder to maintain. Remove unused imports (and also unused locals like overrides, images/static_images/base_images, builds if they’re not needed).

Copilot uses AI. Check for mistakes.
if deployment_name in extra_env and len(extra_env[deployment_name]) > 0:
print("Adding tasks images dependencies to env ", deployment_name)
for env in extra_env[deployment_name]:
container["env"].append({
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

container["env"] may not exist on all rendered Deployments. This will raise a KeyError when adding task-image env vars. Use container.setdefault("env", []).append(...) (or initialize env if missing) before appending.

Suggested change
container["env"].append({
container.setdefault("env", []).append({

Copilot uses AI. Check for mistakes.
Comment on lines +109 to +117
try:
out = subprocess.check_output([
'kubectl', '-n', 'ingress-nginx', 'get', 'svc', 'ingress-nginx-controller',
'-o', 'jsonpath={.status.loadBalancer.ingress[0].ip}'
], timeout=5).decode("utf-8").strip()
if out and out != '<no value>':
return out
except:
pass
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid bare except: blocks here. They will also swallow KeyboardInterrupt/SystemExit and make failures hard to diagnose. Catch the specific expected exceptions (e.g. subprocess.CalledProcessError, subprocess.TimeoutExpired, FileNotFoundError) and optionally log at debug level so local IP detection failures are observable.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator

@filippomc filippomc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't reference Jira issues appropriately in the commit messages, nor in the PR.
There was a link to a private Jira issue from an external project, which I removed.
Please reference issues in the Cloudharness Jira project in the PR (especially the Tilt feature is a big one). If an issue doesn't exist, let's create it.

  • Added a note about silent exception handling.
  • Linting check is not passing

response = requests.get(url, verify=False, timeout=5).json()
dsn = response.get('dsn')
except Exception:
return None
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should at least log this error

@ddelpiano ddelpiano changed the title Updates due to new KC, mongo, local cluster ip detection logic CH-262 Apr 27, 2026
@ddelpiano ddelpiano changed the title CH-262 CH-262 - Tilt and some other minor changes Apr 27, 2026
@ddelpiano ddelpiano changed the title CH-262 - Tilt and some other minor changes CH-262 / CH-263 / CH-264 - Tilt and some other minor changes Apr 27, 2026
Comment thread tools/deployment-cli-tools/ch_cli_tools/tilt.py Fixed
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
for dockerfile_path in base_dockerfiles:
process_build_dockerfile(dockerfile_path, root_path, global_context=True)

static_images = set()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants