Skip to content

fix(dpp): remove erroneous keywords field from document-meta schema and fix contract keywords docs#3471

Open
thephez wants to merge 10 commits intov3.1-devfrom
docs/keyword-clarification
Open

fix(dpp): remove erroneous keywords field from document-meta schema and fix contract keywords docs#3471
thephez wants to merge 10 commits intov3.1-devfrom
docs/keyword-clarification

Conversation

@thephez
Copy link
Copy Markdown
Collaborator

@thephez thephez commented Apr 9, 2026

Issue being fixed or feature implemented

#2523 accidentally added a keywords field to the document type meta schema. The correct location is contract-level (DataContractV1.keywords). Neither DocumentTypeV0 nor DocumentTypeV1 has a keywords field, and try_from_schema never reads it — so document-type keywords were silently accepted and discarded.

Protocol v12 set document_type_schema: 1 (v4.rs:42), making the v1 meta schema the active validation path. The errant field needs to be removed from the v1 schema and its v12-migration allowlist in lockstep.

Separately, the DataContractV1 doc comment incorrectly stated the contract-level keyword limit as 20 when the validators and TooManyKeywordsError both enforce 50.

What was done?

  • Removed the keywords block from packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json (the active meta schema)
  • Removed "keywords" from ALLOWED_DOCUMENT_SCHEMA_V1_PROPERTIES in allowed_top_level_properties.rs — required to keep the allowlist in sync with the schema (enforced by the allowlist_matches_v1_meta_schema_properties parity test)
  • Left the v0 meta schema unchanged — it's a frozen historical schema and not on the active validation path
  • Fixed the doc comment on DataContractV1 to state the correct contract-level max of 50 keywords
  • Clarified inline validator comments to say "contract keywords"

How Has This Been Tested?

  • cargo test -p dpp — all 2462 dpp lib tests pass, including allowlist_matches_v1_meta_schema_properties
  • cargo fmt --all clean
  • No new tests added — change is a schema + allowlist removal backed by existing parity test coverage

Breaking Changes

Document-type keywords is no longer accepted by v1 meta-schema validation. Any existing stored contract that happens to have keywords on a document type will have it stripped during the v12 migration via strip_unknown_document_schema_properties. This is safe — keywords on document types has never been read by Rust, so stripping it cannot affect runtime semantics. Contract-level keywords (DataContractV1.keywords) are unaffected.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

  • Documentation

    • Clarified keyword field wording and updated the documented maximum from 20 to 50.
  • Changes

    • Removed the keywords property from the document meta schema; documents no longer accept a top-level keywords field.
  • Tests

    • Added a unit test ensuring document-type-level keywords are rejected by meta-schema validation.

@thephez thephez requested a review from QuantumExplorer as a code owner April 9, 2026 20:42
@github-actions github-actions Bot added this to the v3.1.0 milestone Apr 9, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 9, 2026

Warning

Rate limit exceeded

@thephez has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 56 minutes and 51 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 56 minutes and 51 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7786e927-0fdd-416a-947e-bde3d1b12b65

📥 Commits

Reviewing files that changed from the base of the PR and between 3df26c1 and 73b972a.

📒 Files selected for processing (1)
  • packages/rs-drive-abci/src/execution/validation/state_transition/state_transitions/data_contract_create/mod.rs
📝 Walkthrough

Walkthrough

Updated inline docs/comments to reflect a 50-keyword limit for data contracts, removed keywords from the document meta-schema and document-type allowlist, and added a unit test that asserts document-type-level keywords are rejected by v1 meta-schema validation.

Changes

Cohort / File(s) Summary
Data contract comments & docs
packages/rs-dpp/src/data_contract/.../validate_update/v0/mod.rs, packages/rs-dpp/src/data_contract/v1/data_contract.rs, packages/rs-drive-abci/src/execution/validation/.../data_contract_create/basic_structure/v0/mod.rs
Updated inline comments/documentation to specify “contract keywords” and raise the documented maximum keywords from 20 → 50. No runtime logic changes.
Schema removal & allowlist update
packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json, packages/rs-dpp/src/data_contract/document_type/schema/allowed_top_level_properties.rs
Removed the keywords property from the document meta-schema and from the allowed top-level document schema properties; added unit test ensuring document-type keywords are rejected.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Poem

🐇 I hopped through schemas, tidy and bright,
Pulled keywords from documents into contract light.
From twenty to fifty the list grows long,
Tests sing the chorus, the code hums a song.
🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main changes: removing the erroneous keywords field from the document-meta v1 schema and fixing contract keywords documentation. It directly reflects the primary objectives of the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch docs/keyword-clarification

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@thephez thephez force-pushed the docs/keyword-clarification branch from 4390ccf to b98d374 Compare April 9, 2026 20:45
@thepastaclaw
Copy link
Copy Markdown
Collaborator

thepastaclaw commented Apr 9, 2026

🕓 Ready for review — 2 ahead in queue (commit 73b972a)
Queue position: 3/15

@QuantumExplorer
Copy link
Copy Markdown
Member

This might break testnet/mainnet (we will need to verify it doesn't)

Copy link
Copy Markdown
Collaborator

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Clean, well-scoped PR that removes an erroneous keywords property from the document-meta JSON schema and fixes doc comments. The one substantive finding — modifying a runtime-compiled schema in place without protocol version gating — is technically valid but low practical risk. The doc comment nitpick is dropped because the author already considered and explicitly reverted a more verbose comment.

Reviewed commit: b98d374

🟡 1 suggestion(s)

1 additional finding

🟡 suggestion: In-place modification of runtime-compiled v0 meta-schema

packages/rs-dpp/schema/meta_schemas/document/v0/document-meta.json (lines 591-594)

Verified: DOCUMENT_META_JSON_V0 is loaded via include_str! at meta_validators/mod.rs:39 and compiled into DOCUMENT_META_SCHEMA_V0 at line 89. Both try_from_schema/v0 (line 136) and try_from_schema/v1 (line 155) use this single compiled schema to validate document type definitions at runtime.

The root meta-schema has no additionalProperties: false — the additionalProperties at line 591 is a property definition (constraining what document types must set), not a root-level restriction. So unknown properties were already silently accepted. The removed keywords definition only constrained malformed keyword values (wrong item types, >20 items, etc.).

During a rolling upgrade, a document type with malformed keywords (e.g. non-string items) would be rejected by old binaries but accepted by new ones — a theoretical consensus divergence. Practical risk is minimal since no code reads keywords from document types and the field was always discarded. Only one version of this schema exists (no v1), so there's no version-gating mechanism to leverage.

If the team's upgrade process guarantees atomic binary upgrades (all nodes update before processing new blocks), this is a non-issue. Otherwise, consider whether this warrants a versioned schema.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-dpp/schema/meta_schemas/document/v0/document-meta.json`:
- [SUGGESTION] lines 591-594: In-place modification of runtime-compiled v0 meta-schema
  Verified: `DOCUMENT_META_JSON_V0` is loaded via `include_str!` at `meta_validators/mod.rs:39` and compiled into `DOCUMENT_META_SCHEMA_V0` at line 89. Both `try_from_schema/v0` (line 136) and `try_from_schema/v1` (line 155) use this single compiled schema to validate document type definitions at runtime.

The root meta-schema has no `additionalProperties: false` — the `additionalProperties` at line 591 is a *property definition* (constraining what document types must set), not a root-level restriction. So unknown properties were already silently accepted. The removed `keywords` definition only constrained *malformed* keyword values (wrong item types, >20 items, etc.).

During a rolling upgrade, a document type with malformed `keywords` (e.g. non-string items) would be rejected by old binaries but accepted by new ones — a theoretical consensus divergence. Practical risk is minimal since no code reads keywords from document types and the field was always discarded. Only one version of this schema exists (no v1), so there's no version-gating mechanism to leverage.

If the team's upgrade process guarantees atomic binary upgrades (all nodes update before processing new blocks), this is a non-issue. Otherwise, consider whether this warrants a versioned schema.

@thephez
Copy link
Copy Markdown
Collaborator Author

thephez commented Apr 13, 2026

Maybe good to pair this change with #3475 in the same release? Prior to that PR, keywords (or any additional property) would be accepted, but that PR makes things more strict.

If AI correctly stated that extra document schema entries were actually just silently dropped, there shouldn't be breaking issues. But I'm not familiar enough with the inner workings to be sure on that so it should be reviewed by someone more knowledgeable.

Copy link
Copy Markdown
Collaborator

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The comment/doc cleanups are correct, but the schema fix is incomplete. This PR only removes the erroneous document-type keywords field from the v0 document meta-schema, while the current protocol path still validates document types against the v1 meta-schema that retains the same field.

Reviewed commit: 2b3585f

🔴 1 blocking

1 additional finding

🔴 blocking: The live document meta-schema still accepts document-level `keywords`

packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json (lines 591-600)

Removing keywords from document/v0/document-meta.json does not fix the active validation path. DocumentType::try_from_schema() selects DOCUMENT_META_SCHEMA_V1 whenever document_type_schema == 1, and that is the version used for the current protocol line, so contracts created today still validate successfully with document-level keywords because packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json still defines the field. The v12 migration allowlist in packages/rs-dpp/src/data_contract/document_type/schema/allowed_top_level_properties.rs also still treats keywords as a valid document-type property, so the cleanup needs to update the v1 schema and its synced allowlist together.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json`:
- [BLOCKING] lines 591-600: The live document meta-schema still accepts document-level `keywords`
  Removing `keywords` from `document/v0/document-meta.json` does not fix the active validation path. `DocumentType::try_from_schema()` selects `DOCUMENT_META_SCHEMA_V1` whenever `document_type_schema == 1`, and that is the version used for the current protocol line, so contracts created today still validate successfully with document-level `keywords` because `packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json` still defines the field. The v12 migration allowlist in `packages/rs-dpp/src/data_contract/document_type/schema/allowed_top_level_properties.rs` also still treats `keywords` as a valid document-type property, so the cleanup needs to update the v1 schema and its synced allowlist together.

@thephez thephez marked this pull request as draft April 16, 2026 13:01
thephez and others added 3 commits April 16, 2026 09:02
… comments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The field was added by mistake in #2523 — keywords belong to the
contract level (DataContractV1), not document types. DocumentTypeV0/V1
have no keywords field and try_from_schema never reads it, so the
schema entry was silently accepted then discarded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@thephez thephez force-pushed the docs/keyword-clarification branch from 2b3585f to 6d234e9 Compare April 16, 2026 13:02
thephez and others added 2 commits April 16, 2026 10:41
v0 is a frozen historical meta schema no longer on the active
validation path (protocol v12 switched to v1). Removing it from v0
had no runtime effect but risked diverging from what was validated
when legacy contracts were created. The real fix belongs on the v1
schema, which is applied in a follow-up commit.

Reverts the schema portion of 9e56d4d.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #2523 mistakenly placed keywords in the document-type meta schema
(correct location is contract-level, on DataContractV1). Neither
DocumentTypeV0 nor DocumentTypeV1 has a keywords field, and
try_from_schema never reads it.

v1 is the active meta schema (protocol v12 sets document_type_schema
to 1), so document-type keywords were still silently accepted in
new contracts after the earlier v0 fix. Removing the field from the
v1 schema and its v12-migration allowlist in lockstep — the parity
test (allowlist_matches_v1_meta_schema_properties) requires both to
change together.

Any existing stored contract that happens to have document-type
keywords will be cleanly stripped by the v12 migration in
strip_unknown_document_schema_properties; safe because keywords on
document types has never been read by Rust.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@thephez thephez marked this pull request as ready for review April 16, 2026 14:47
…on and stripping

Adds two tests targeting the behavior change of removing `keywords`
from the v1 document-type meta schema and its allowlist:

- `strips_keywords_from_document_schema` (rs-dpp): unit test that
  verifies the v12 migration `strip_unknown_properties_from_document_schema`
  removes a `keywords` key from a document-type schema.

- `test_document_type_keywords_rejected_by_v1_meta_schema` (drive-abci):
  integration test pinned to protocol v12 that injects `keywords` onto
  a document-type schema, calls `DataContract::from_value` with full
  validation, and asserts the failure is specifically a
  `BasicError::JsonSchemaError` whose summary references `keywords`
  (not any unrelated error).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@thephez thephez requested a review from thepastaclaw April 16, 2026 16:18
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 83.05085% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.00%. Comparing base (03f142b) to head (73b972a).
⚠️ Report is 1 commits behind head on v3.1-dev.

Files with missing lines Patch % Lines
...tion/state_transitions/data_contract_create/mod.rs 77.27% 10 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           v3.1-dev    #3471      +/-   ##
============================================
- Coverage     88.00%   88.00%   -0.01%     
============================================
  Files          2474     2474              
  Lines        294440   294499      +59     
============================================
+ Hits         259132   259181      +49     
- Misses        35308    35318      +10     
Components Coverage Δ
dpp 87.79% <100.00%> (+<0.01%) ⬆️
drive 87.27% <ø> (ø)
drive-abci 89.57% <77.27%> (-0.01%) ⬇️
sdk ∅ <ø> (∅)
dapi-client ∅ <ø> (∅)
platform-version ∅ <ø> (∅)
platform-value 92.26% <ø> (ø)
platform-wallet ∅ <ø> (∅)
drive-proof-verifier 55.66% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@thepastaclaw
Copy link
Copy Markdown
Collaborator

Validated the current head (3df26c16) and I do not think this needs another code change.\n\n- the earlier v0-only concern is addressed: v0 was restored, and the active v1 document meta-schema now drops keywords along with the synced v12 allowlist\n- existing stored contracts are still safe because the v12 migration strips unknown document-schema properties, including document-type keywords\n- Rust never reads document-type keywords on either the v0 or v1 DocumentType path, so the old field was only ever silently accepted/discarded\n\nLocal verification on this head:\n- cargo test -p dpp strips_keywords_from_document_schema ✅\n- cargo test -p drive-abci test_document_type_keywords_rejected_by_v1_meta_schema ✅\n\nSo from my side this looks ready for fresh review rather than more edits.

Copy link
Copy Markdown
Collaborator

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

I verified the referenced findings against detached HEAD e89e0fc640c16f0d71956f04f8b077d790e8877c. The functional change is consistent: keywords is removed from the v1 document-type meta-schema and the migration allowlist, and the new regression tests cover both rejection and cleanup. Two test-quality issues remain in the new create-state-transition test; the release-coordination concern is not a code defect that can be confirmed from this tree alone.

Reviewed commit: e89e0fc

💬 2 nitpick(s)

Comment on lines +4327 to +4335
// v1 meta schema enforcement.
let platform_version = PlatformVersion::get(12).expect("expected v12");
let mut platform = TestPlatformBuilder::new()
.build_with_mock_rpc()
.set_genesis_state();

let _platform_state = platform.state.load();
let (_identity, _signer, _key) =
setup_identity(&mut platform, 958, dash_to_credits!(1.0));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: The new regression test performs platform and identity setup that the assertion path never uses

test_document_type_keywords_rejected_by_v1_meta_schema only loads a contract JSON fixture, mutates its value, and calls DataContract::from_value(...). The preceding platform.state.load() result is discarded, and setup_identity(&mut platform, ...) drives an identity through the full platform pipeline even though neither the platform state nor the created identity participates in the schema-validation path under test. That extra setup makes the test slower and obscures its purpose without increasing coverage.

source: ['claude']

Comment on lines +4364 to +4372
ConsensusError::BasicError(BasicError::JsonSchemaError(js_err)) => {
// The rejection should be driven by `additionalProperties`
// / `unevaluatedProperties` at the meta-schema root, and
// the offending property name must be `keywords`.
let summary = js_err.error_summary();
assert!(
summary.contains("keywords"),
"expected JSON schema error to reference `keywords`, got: {summary}"
);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 Nitpick: The assertion is weaker than the error type allows and can pass on unrelated failures mentioning keywords

The test only checks js_err.error_summary().contains("keywords"). In this codebase JsonSchemaError exposes structured fields such as keyword(), instance_path(), and property_name(), so the test can assert that the failure came from the document-type schema rejecting an unexpected property rather than from any future validation error whose summary also happens to mention keywords. As written, the test proves that some schema error referenced the string, not that the intended additionalProperties or unevaluatedProperties rejection occurred at the injected document-type path.

source: ['claude']

thephez and others added 2 commits April 23, 2026 09:09
Assert structured `JsonSchemaError` fields (keyword + offending property
name in params) instead of a substring match on `error_summary`, and drop
the unused platform/identity setup since `DataContract::from_value` is a
pure DPP call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

I reviewed this as a focused schema-correction PR and did not find a concrete correctness regression in the checked-out change. The intended behavior is to stop accepting keywords on individual document-type schemas, align the active v1 schema/allowlist with the Rust implementation, and strip legacy stored occurrences during migration; that direction appears consistent and appropriately scoped.

Reviewed commit: 73b972a

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants