Skip to content

feat: add stream_validate() hook to Requirement (#900)#925

Draft
planetf1 wants to merge 14 commits intogenerative-computing:mainfrom
planetf1:feat/900-stream-validate
Draft

feat: add stream_validate() hook to Requirement (#900)#925
planetf1 wants to merge 14 commits intogenerative-computing:mainfrom
planetf1:feat/900-stream-validate

Conversation

@planetf1
Copy link
Copy Markdown
Contributor

@planetf1 planetf1 commented Apr 24, 2026

⚠️ Stacked PR — read before reviewing

This is a stacked PR (Wave 2 of streaming epic #891). The diff against main includes code from the two Wave 1 PRs that this branch builds on:

PR Contents Status
#924 PartialValidationResult tri-state type Ready to merge
#923 ChunkingStrategy ABC + built-in chunkers Ready to merge
#925 (this PR) stream_validate() hook on Requirement Draft

Suggested review order:

  1. Review and approve feat(core): add PartialValidationResult with tri-state semantics #924 first — it is self-contained and has no dependencies
  2. Review and approve feat(stdlib): add ChunkingStrategy ABC and built-in chunkers #923 — also self-contained
  3. Return here and focus only on the two files new to this PR (below)

What to ignore in the diff: mellea/stdlib/chunking.py, test/stdlib/test_chunking.py, mellea/stdlib/__init__.py (covered by #923), and the PartialValidationResult class in mellea/core/requirement.py (covered by #924). Those files appear in this diff because the branch is stacked, not because this PR changes them.

What to focus on:

  • mellea/core/requirement.py — the stream_validate() method added to Requirement (one new method, ~10 lines)
  • test/core/test_stream_validate.py — 7 new unit tests

Summary

Adds async stream_validate(chunk, *, backend, ctx) -> PartialValidationResult to the base Requirement class as a per-chunk streaming validation hook. The default implementation returns PartialValidationResult("unknown") — subclasses override to inspect accumulated chunks and signal "pass" or "fail" early. The method is explicitly documented as non-mutating; state isolation is the orchestrator's responsibility.

Design decisions

Per the agreed Phase 1 spec:

No LLM-as-a-Judge logic is added here. This is a pure hook for custom validation overrides.

Test plan

  • test_default_returns_unknown — base class always returns "unknown"
  • test_default_returns_partial_validation_result_instance — correct return type
  • test_stream_validate_is_coroutine — method is async
  • test_subclass_can_return_pass — subclass override returning "pass" works
  • test_subclass_can_return_fail — subclass override returning "fail" with reason works
  • test_does_not_mutate_requirement — calling the method leaves self unchanged
  • test_stream_validate_idempotent — multiple calls leave self unchanged
  • Full cumulative test suite (Wave 1 + Wave 2): 52 passed

Related

Closes #900. Part of streaming epic #891. Stacks on #898 / #924 and #899 / #923.

Adds mellea/stdlib/chunking.py with ChunkingStrategy ABC and three
built-in implementations: SentenceChunker, WordChunker, ParagraphChunker.
split(accumulated_text) returns complete chunks, holding back trailing
fragments for the next call.

Closes generative-computing#899
Part of generative-computing#891

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Adds PartialValidationResult to mellea/core/requirement.py — a tri-state
result type for per-chunk streaming validation. Follows the same
private-fields-with-property-accessors pattern as ValidationResult.

success: Literal["pass", "fail", "unknown"]. __bool__ returns True for
"pass", False for "fail" and "unknown". "pass" is informational only in
Phase 1; orchestrators call validate() at stream end for all non-"fail"
results.

Closes generative-computing#898
Part of generative-computing#891

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…esult._success

Pyright infers self._success as str without the annotation because it doesn't
narrow Literal types through bare instance attribute assignment. The explicit
annotation makes the property return type verifiable.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Pyright flagged import as unaccessed; no pytest.* calls in the file.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…hunkingStrategy

- Fix SentenceChunker whitespace leak: lstrip() after match.end() so
  double-space / tab separators don't bleed into the next chunk as
  leading whitespace
- Add end-of-stream contract to ABC docstring (callers responsible for
  trailing fragment after stream terminates)
- Fix incorrect comment "end-of-string" → "whitespace"
- Compile _WHITESPACE / _PARA_BOUNDARY / _PARA_BOUNDARY_END at module
  level (consistent with _SENTENCE_BOUNDARY; avoids per-call recompile)
- Expand SentenceChunker char class to include right curly double/single
  quotes (U+201D / U+2019) for common LLM output patterns
- Document CRLF limitation on ParagraphChunker
- Re-export ChunkingStrategy + chunkers from mellea.stdlib.__init__
- Add __all__ to chunking.py
- Add tests: closing paren, double-space separator, tab separator,
  abbreviation edge case (known-bad split), WordChunker leading-whitespace

Assisted-by: Claude Code
…tialValidationResult

- Update mellea/core/__init__.py module docstring to mention PartialValidationResult
- Update mellea/core/requirement.py module docstring to mention PartialValidationResult
- Document ValidationResult/PartialValidationResult API asymmetry in class docstring
- Add runtime validation of success parameter (fail fast on invalid values)
- Expand as_bool() docstring with streaming-context warning for "unknown" → False
- Add __repr__ for opaque debug output in streaming pipelines
- Replace test_as_bool_matches_bool loop with parametrised test_as_bool_correctness
- Add test_invalid_success_raises and test_repr_shows_state

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…acy and curly-quote test

- Fix misleading comment on _SENTENCE_BOUNDARY: was "processed by re engine
  as \u escapes" but the file contained literal Unicode chars. Now uses
  chr(0x201d) + chr(0x2019) for Python 3.12 compatibility (U+2019 is treated
  as a string delimiter in single-quoted raw strings on 3.12).
- Add test_sentence_chunker_curly_quotes to verify U+201D/U+2019 matching.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
- Add __repr__ to ValidationResult for parity with PartialValidationResult
- Add test_thunk_field and test_context_field to close the test coverage gap
  for keyword-only constructor arguments (previously only default-None was
  verified)

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
@github-actions github-actions Bot added the enhancement New feature or request label Apr 24, 2026
Merges feat/898-partial-validation-result (PartialValidationResult) and
feat/899-chunking-strategy (ChunkingStrategy ABC) into a single base for
Wave 2 and beyond.

Assisted-by: Claude Code
@planetf1 planetf1 force-pushed the feat/900-stream-validate branch from 96f1919 to 0c030fb Compare April 24, 2026 12:18
planetf1 added a commit to planetf1/mellea that referenced this pull request Apr 24, 2026
…eam_validate

- Remove "In Phase 1" temporal qualifier from docstring — reworded to
  timeless statement about orchestrator responsibility
- Add type annotations (str, Backend, Context) to test subclass overrides
- Add idempotency test: multiple calls on the same Requirement instance
  leave state unchanged

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…mputing#923)

- Simplify _SENTENCE_BOUNDARY regex to use \u escapes instead of chr()
  concatenation (cleaner, same semantics, Python 3.12-safe)
- Document that SentenceChunker discards inter-sentence whitespace via lstrip()
- Add test_chunking_strategy_is_abstract to document the extension-point contract

Assisted-by: Claude Code
…#900)

Add an async `stream_validate(chunk, backend, ctx)` method to the base
`Requirement` class. The default implementation returns
`PartialValidationResult("unknown")`; subclasses override to inspect the
accumulated chunk and return `"pass"` or `"fail"` early.

Per the Phase 1 design: `"pass"` is informational and does not
short-circuit the final `validate()` call. The method must not mutate
`self` — state isolation is the orchestrator's responsibility.

Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…eam_validate

- Remove "In Phase 1" temporal qualifier from docstring — reworded to
  timeless statement about orchestrator responsibility
- Add type annotations (str, Backend, Context) to test subclass overrides
- Add idempotency test: multiple calls on the same Requirement instance
  leave state unchanged

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Prevents positional confusion and makes future parameter additions
to the signature non-breaking for existing subclass overrides.

Assisted-by: Claude Code
@planetf1 planetf1 force-pushed the feat/900-stream-validate branch from d922a2c to 58128a7 Compare April 24, 2026 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(core): add stream_validate() to Requirement

1 participant