Skip to content

HYPERFLEET-907 - docs: add condition mapping design#141

Open
ldornele wants to merge 2 commits into
openshift-hyperfleet:mainfrom
ldornele:HYPERFLEET-907
Open

HYPERFLEET-907 - docs: add condition mapping design#141
ldornele wants to merge 2 commits into
openshift-hyperfleet:mainfrom
ldornele:HYPERFLEET-907

Conversation

@ldornele
Copy link
Copy Markdown
Contributor

@ldornele ldornele commented May 21, 2026

Summary

Technical specification for CEL-based condition mapping mechanism to expose adapter conditions in API resource status.conditions. Deliverable for
HYPERFLEET-907 spike story, optimized for spec-driven development in next sprint.

What Changed

New Document

  • condition-mapping-design.md — Full technical specification
    • Configuration schema (1-to-1 transformation and N-to-1 merge patterns)
    • CEL evaluation context and security considerations
    • Trade-offs, alternatives considered, and implementation examples
    • Follows hard-delete-design.md pattern: references architecture docs instead of duplicating implementation details

Key Design Decisions

Configuration Schema

  • 1-to-1 transformation: Single adapter condition → single resource condition with CEL transforms
  • N-to-1 merge: Multiple adapter conditions → single merged resource condition with aggregation logic
  • 3 CEL expressions for merges: status_expression, reason_expression, message_expression

Security & Performance

  • Per-expression timeout: 100ms
  • Aggregate timeout: 1 second (all rules)
  • Max mapped conditions: 50 per resource
  • Execution overhead: ~2-3ms for 15 rules across 3 adapters

Field Naming

  • Uses snake_case per ADR-0007: observed_generation, last_transition_time
  • 6 standard condition fields exposed to CEL (excludes sensitive data field)

Validation Behavior

  • reason field: Condition skipped if oversized (machine-readable, used in CEL logic)
  • message field: Truncated to 2048 chars (human-readable, informational only)

Acceptance Criteria (HYPERFLEET-907)

  • ✅ Documented current status aggregation logic
  • ⚠️ Gathered requirements from GCP and ROSA teamsPending: Specification includes illustrative examples based on generic adapter patterns. Direct requirements validation with adapter teams needed
    before implementation sprint (Slack message sent to request input)
  • ✅ Evaluated 5 implementation approaches (CEL selected, 4 alternatives rejected)
  • ✅ Assessed configuration placement (API server YAML config)
  • ✅ Identified impact on aggregation flow (runs during PUT /statuses, 2-3ms overhead)
  • ✅ Produced design document with trade-offs and sample configuration schema
  • ✅ Defined acceptance criteria for implementation stories

Testing

  • Documents validated against:
    • ADR-0007 (Conditions-Based Status Model)
    • ADR-0008 (Dynamic Status Aggregation)
    • Existing CEL patterns in Sentinel and adapters
    • Error Model Standard and Logging Standard

Related Links

@openshift-ci openshift-ci Bot requested review from pnguyen44 and rh-amarin May 21, 2026 03:28
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented May 21, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mbrudnoy for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

📝 Walkthrough

Walkthrough

This pull request introduces a comprehensive design document for a CEL-based condition mapping engine that integrates with the existing PUT /statuses aggregation flow. The design establishes terminology for Resource vs Adapter conditions, specifies filtering behavior for Unknown adapter statuses, and defines configuration patterns for both 1-to-1 transformations and N-to-1 merges. It documents rule evaluation order, conflict resolution, CEL runtime context with field restrictions, integration points, error handling with partial-result support, and multiple security safeguards including compilation limits, sandboxing, timeouts, and audit logging. The document includes illustrative YAML examples, explicit trade-offs, alternatives analysis, and references to related standards and ADRs.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant APIService
  participant AggregateResourceStatus
  participant CEL
  participant ResourceStatusStore
  Client->>APIService: PUT /statuses (resource status payload)
  APIService->>AggregateResourceStatus: invoke aggregation pipeline
  AggregateResourceStatus->>CEL: evaluate mapping rules for adapter conditions
  CEL-->>AggregateResourceStatus: mapped conditions (1:1 or merged N:1)
  AggregateResourceStatus->>ResourceStatusStore: persist aggregated resource conditions
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: adding a condition mapping design document as part of the HYPERFLEET-907 work item.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description directly relates to the changeset: it clearly describes the addition of a CEL-based condition mapping design document and provides comprehensive context about its purpose, key design decisions, and acceptance criteria.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hyperfleet/components/api-service/condition-mapping-design.md`:
- Around line 36-40: Add an "Implementation Coordination" (or "Cross-Repository
Impact") subsection to this document that explicitly lists affected repositories
(Sentinel, Adapter framework), calls out coordination points such as Sentinel's
condition() helper and client filters and Adapter CEL task configs that read
status.conditions (including the effect of the data field exclusion and new
mapped conditions), and prescribes validation steps (review CEL expressions for
assumptions about condition cardinality/fields, run existing Sentinel and
Adapter unit/integration tests, and update configs/tests where mapped conditions
change shape). Ensure the subsection references the specific concerns around the
data field exclusion and mapped-condition aggregation so implementers know what
to validate and whom to coordinate with.
- Line 7: The doc lacks the explicit Context/Decision/Consequences structure
required by the hard-delete-design.md precedent: update
condition-mapping-design.md by either (a) adding top-level headers "## Context",
"## Decision", "## Consequences" and move the existing "What & Why" content into
Context, "How" into Decision, and "Trade-offs" into Consequences, or (b) add a
clear mapping note at the top explaining these equivalences; also add a
"Transaction Guarantees" subsection under the Decision/How section that
centrally documents the transaction/atomicity and ordering rationale referenced
around lines 135 and 218, and explicitly state ownership and
transaction/ordering responsibility within the Decision section so reviewers can
find it easily.
- Line 267: Update the "Allowlist Approach" sentence to say "six standard
condition fields" instead of "five" and add the missing field name
`observed_generation` to the enumerated list so it reads: type, status, reason,
message, observed_generation, last_transition_time (reflecting ADR-0007 and the
Terminology/CEL context references).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 5459bc77-8f4d-41a9-a859-d9c6ec581b0d

📥 Commits

Reviewing files that changed from the base of the PR and between 5a1b8ed and 2abe6ad.

📒 Files selected for processing (1)
  • hyperfleet/components/api-service/condition-mapping-design.md

Last Updated: 2026-05-20
---

# Condition Mapping Design
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider restructuring to match the hard-delete-design.md precedent.

The document provides comprehensive information but doesn't follow the Context/Decision/Consequences structure established by the referenced hard-delete-design.md precedent. While "What & Why" maps to Context, "How" to Decision, and "Trade-offs" to Consequences, the headers don't make this explicit.

As per coding guidelines: "Use this document as the 'design doc shape' precedent referenced by the PR: clearly state Context/Decision/Consequences and explicitly document the ownership and transaction/ordering rationale."

Consider either:

  1. Adding explicit "## Context", "## Decision", "## Consequences" top-level headers with current content mapped appropriately
  2. Or adding a note at the top explaining how current sections map to the standard structure

The transaction/atomicity rationale (lines 135, 218) is mentioned but should be centrally documented in a "Transaction Guarantees" subsection under Decision/How.

Also applies to: 23-23, 360-360

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hyperfleet/components/api-service/condition-mapping-design.md` at line 7, The
doc lacks the explicit Context/Decision/Consequences structure required by the
hard-delete-design.md precedent: update condition-mapping-design.md by either
(a) adding top-level headers "## Context", "## Decision", "## Consequences" and
move the existing "What & Why" content into Context, "How" into Decision, and
"Trade-offs" into Consequences, or (b) add a clear mapping note at the top
explaining these equivalences; also add a "Transaction Guarantees" subsection
under the Decision/How section that centrally documents the
transaction/atomicity and ordering rationale referenced around lines 135 and
218, and explicitly state ownership and transaction/ordering responsibility
within the Decision section so reviewers can find it easily.

Comment on lines +36 to +40
**Related Documentation:**
- **Current Status Aggregation**: [ADR-0008 — Dynamic Status Aggregation](../../adrs/0008-dynamic-status-aggregation.md) — aggregation computed on write path; [ADR-0007 — Conditions-Based Status Model](../../adrs/0007-conditions-based-status-model.md) — ResourceCondition and AdapterCondition contracts; [Status Guide](../../docs/status-guide.md) — condition reporting and validation rules
- [API Service Design](./api-service.md) — API architecture and service layer patterns
- [Sentinel Message Decision Config](../sentinel/sentinel.md) — Existing CEL usage in Sentinel
- [Adapter Framework Design](../adapter/framework/adapter-frame-design.md) — Existing CEL usage in adapters (Config Loader and Criteria Evaluator sections)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Consider adding cross-repo coordination guidance.

The linked repositories research shows that Sentinel and Adapter repos consume resource.status.conditions via CEL and have existing configs/tests that depend on condition shape and field availability. Specifically:

  • Sentinel's condition() helper and client filters rely on current condition structure
  • Adapter CEL configs in task definitions read status.conditions
  • Both repos may be affected by the data field exclusion (line 204) and new mapped conditions appearing in aggregation results

While the References section lists related documentation, consider adding an explicit "Implementation Coordination" or "Cross-Repository Impact" subsection that:

  1. Lists affected repositories (Sentinel, Adapter framework)
  2. Identifies specific coordination points (e.g., validating Sentinel decision configs still work with new mapped conditions)
  3. Recommends validation steps (e.g., reviewing existing CEL expressions for assumptions about condition cardinality)

This helps implementers anticipate integration work and aligns with the hard-delete-design.md precedent of explicitly documenting implementation dependencies.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hyperfleet/components/api-service/condition-mapping-design.md` around lines
36 - 40, Add an "Implementation Coordination" (or "Cross-Repository Impact")
subsection to this document that explicitly lists affected repositories
(Sentinel, Adapter framework), calls out coordination points such as Sentinel's
condition() helper and client filters and Adapter CEL task configs that read
status.conditions (including the effect of the data field exclusion and new
mapped conditions), and prescribes validation steps (review CEL expressions for
assumptions about condition cardinality/fields, run existing Sentinel and
Adapter unit/integration tests, and update configs/tests where mapped conditions
change shape). Ensure the subsection references the specific concerns around the
data field exclusion and mapped-condition aggregation so implementers know what
to validate and whom to coordinate with.

Comment thread hyperfleet/components/api-service/condition-mapping-design.md Outdated
@ldornele ldornele marked this pull request as draft May 21, 2026 12:36
@ldornele ldornele marked this pull request as ready for review May 21, 2026 19:06
@openshift-ci openshift-ci Bot requested review from aredenba-rh and vkareh May 21, 2026 19:06
source_condition_filter:
expression: |
adapterCondition.type.startsWith("GCP")
aggregation_logic:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The N-to-1 merge aggregation_logic specifies status_expression, reason_expression, and message_expression, but doesn't define how observed_generation and last_transition_time are determined for the merged output.

Example 2's output shows "last_transition_time": "2026-05-19T10:32:00Z" which doesn't match any source condition (10:25, 10:30, 10:20) — it's unclear whether this is the current time, the latest source time, or a bug in the example.

Consider either:

  1. Adding observed_generation_expression and last_transition_time_expression to the aggregation_logic schema for full control
  2. Or documenting the default behavior explicitly (e.g., "observed_generation defaults to resourceGeneration; last_transition_time defaults to the latest last_transition_time among source conditions")

Either way, the Example 2 output should be consistent with the documented behavior.

source_condition_filter:
expression: |
adapterCondition.type in ["ControlPlaneReady", "WorkerNodesReady"]
transformations:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two implicit behaviors in the 1-to-1 transformation schema that should be documented explicitly:

  1. Field copy default: The transformations array only specifies type and message, but Example 1 shows status, reason, observed_generation, and last_transition_time copied as-is from the source. Please state explicitly that untransformed fields are copied from the source adapter condition.

  2. Per-match cardinality: This rule's filter matches two condition types (ControlPlaneReady and WorkerNodesReady). Does the rule produce one output condition per match (i.e., 2 resource conditions)? The adapterCondition singular variable name suggests per-match execution, but this should be stated explicitly — especially since N-to-1 merge exists as a separate pattern.

Suggested addition under Configuration Schema:

"For 1-to-1 rules, the transformation runs once per matching adapter condition, producing one resource condition per match. Fields not listed in transformations are copied as-is from the source adapter condition."


| Term | Definition |
|------|-----------|
| **Resource Condition** | Kubernetes-style condition in `status.conditions` array on Cluster/NodePool resources. Status is `True` or `False` only. |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Resource Condition definition states "Status is True or False only", but ADR-0007 (referenced by this document) specifies that the initial state after creation is Available=Unknown, Ready=False. ADR-0008 also mentions Available=Unknown for first adapter reports.

The Unknown filtering rationale for mapped conditions is sound, but this terminology entry overgeneralizes — existing aggregated conditions (Available) can be Unknown transiently.

Consider qualifying to:

"Status is True or False in steady state. Unknown may appear transiently (e.g., Available=Unknown at resource creation per ADR-0007). Mapped conditions enforce the True/False-only contract — see Unknown Filtering."

This prevents implementers from mistaking existing Unknown states as bugs.

6. Marshal all conditions to JSON
7. Update resource.status_conditions

Mapped conditions are included in the same database transaction as the adapter status update via the existing transaction-per-request middleware (see `hyperfleet-api/pkg/db/CLAUDE.md`).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This references hyperfleet-api/pkg/db/CLAUDE.md as documentation for the transaction middleware. CLAUDE.md files are developer tooling instructions, not architecture documentation — they change frequently and aren't reviewed with the same rigor.

Consider referencing the actual middleware source file instead, or describing the behavior inline (e.g., "via the existing transaction-per-request middleware in pkg/db").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants