Skip to content

CP-42614: Add monitoring discovery model (enabled + discovery.method)#840

Open
evan-cz wants to merge 1 commit into
developfrom
CP-42614/monitoring-discovery
Open

CP-42614: Add monitoring discovery model (enabled + discovery.method)#840
evan-cz wants to merge 1 commit into
developfrom
CP-42614/monitoring-discovery

Conversation

@evan-cz

@evan-cz evan-cz commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

Finalizes the chart's monitoring integration into one explicit, opt-in model. The chart could advertise the agent's metrics two ways at once — prometheus.io/* annotations and ServiceMonitor CRDs — which scrape the same targets twice. Now components.monitoring.enabled turns monitoring on, and components.monitoring.discovery.method selects a single discovery mechanism.

What changed

  • components.monitoring.enabled — plain boolean, default false. Monitoring is off until you turn it on. No cluster auto-detection.
  • components.monitoring.discovery.method (auto | serviceMonitors | annotations, default auto) — when enabled, the discovery mechanism:
    • auto resolves to serviceMonitors today (left as an enum so a future mechanism can extend it).
    • serviceMonitorsServiceMonitor resources + the PrometheusRule alert bundle. These are monitoring.coreos.com/v1 CRDs, so the install fails if the Prometheus Operator is absent.
    • annotationsprometheus.io/* annotations only; no Operator CRDs, no alert rules.
  • ServiceMonitors + PrometheusRule are one bundle (render together); annotations is discovery-only. For both at once, add the annotations via defaults.annotations.
  • values.schema.yaml/.json, values.yaml, and helm/docs/monitoring-infrastructure.md updated; the doc was rewritten as user-facing reference.

Configuration changes

This finalizes the still-in-validation monitoring integration (CP-34935), so its defaults change:

  • The default install now emits nothing monitoring-related (previously the operator CRDs were off by default but prometheus.io/* annotations were emitted unconditionally).
  • enabled: true now selects a single mechanism via discovery.method (default serviceMonitors), rather than emitting annotations and ServiceMonitors together.

To get annotation-based discovery: monitoring.enabled: true + monitoring.discovery.method: annotations.

Testing

  • helm lint passing
  • helm unit tests passing (the monitoring and defaults-service suites were rewritten for the new model)
  • helm schema tests passing, including new fixtures for the default / serviceMonitors / annotations / none / invalid-value cases; kubeconform validates the rendered ServiceMonitor and PrometheusRule against the CRD catalog
  • golden manifests regenerated; a default helm template emits no ServiceMonitors, no PrometheusRule, and no prometheus.io/* annotations
  • helm subchart tests passing

@evan-cz evan-cz requested a review from a team as a code owner June 10, 2026 13:13
@evan-cz evan-cz force-pushed the CP-42614/monitoring-discovery branch 2 times, most recently from 9a9eecf to e4ce503 Compare June 10, 2026 15:58
The chart's monitoring integration could advertise the agent's metrics two ways
at once -- prometheus.io/* annotations and ServiceMonitor CRDs -- which scrape the
same targets twice. This finalizes the still-in-validation monitoring
configuration into an explicit, opt-in model.

Implementation Approach:

The chart emits two kinds of monitoring resources -- Prometheus Operator CRDs
(ServiceMonitor + PrometheusRule) and prometheus.io/* annotations on the agent
Services. Both are driven by two values resolved in helm/templates/_helpers.tpl
and consumed by the Service, ServiceMonitor, and PrometheusRule templates. There
is no cluster auto-detection; the configuration is explicit. This finalizes the
still-in-validation monitoring integration added in CP-34935, so its defaults
change: the default install emits nothing, and enabled:true selects one mechanism
via discovery.method rather than emitting annotations and ServiceMonitors together.

Functional Requirements:

1. Monitoring must be off by default and turned on explicitly.

   Added components.monitoring.enabled as a plain boolean defaulting to false.
   The four servicemonitor-*.yaml templates, prometheusrule.yaml, and the
   prometheus.io/* annotations on the three Service templates are all gated on it
   via helpers in _helpers.tpl.

2. When enabled, the operator must be able to choose the discovery mechanism.

   Added components.monitoring.discovery.method (auto | serviceMonitors |
   annotations, default auto). auto resolves to serviceMonitors today and is left
   as an enum so a future mechanism can extend it; annotations is an explicit
   opt-in. serviceMonitors emits monitoring.coreos.com/v1 CRDs, so the install
   fails if the Prometheus Operator is absent; annotations needs no CRDs.

3. The ServiceMonitors and the PrometheusRule must form a single bundle;
   annotation-based discovery is discovery-only.

   serviceMonitorsActive and rulesActive resolve together on the serviceMonitors
   path, so method=annotations yields no ServiceMonitors and no alert rules.

4. The new values must validate and be documented for users.

   Updated helm/values.schema.yaml (enabled: boolean; discovery.method enum) and
   regenerated helm/values.schema.json; set defaults and comments in
   helm/values.yaml; rewrote helm/docs/monitoring-infrastructure.md as user-facing
   reference.

Validation:

- helm lint passing
- helm unit tests passing; helm/tests/monitoring_integration_test.yaml and
  defaults_service_test.yaml were rewritten for the new model
- helm schema tests passing, including new fixtures under tests/helm/schema for
  the default, serviceMonitors, annotations, none, and invalid-value cases.
  kubeconform validates the rendered ServiceMonitor and PrometheusRule against the
  CRD catalog
- golden manifests regenerated; a default `helm template` emits no ServiceMonitors,
  no PrometheusRule, and no prometheus.io/* annotations
- helm subchart tests passing

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants