Test type constraints separately#327
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThis PR renames the tag type from SystemString64 to TagString and refactors CasualString/CasualString128, updates all tagging-related router parameters and TagInfo responses to use TagString, replaces many plain int/string IDs with Identifier in database helper signatures, adds TypeAdapter tests for Identifier/TagString/CasualString, and removes one endpoint-level invalid-tag API test. Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
tests/types_test.py,test_identifier_rejects_negativecurrently validates the string "0" (same astest_identifier_rejects_zero); change this to a true negative value (e.g. "-1") or consolidate the zero/negative cases to avoid duplicated/incorrect coverage. - The parametrization of
_invalid_punctuation_taguses aset, which leads to non-deterministic test order and IDs; consider converting it to a sorted list before passing it topytest.mark.parametrizefor stable, reproducible parametrization.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `tests/types_test.py`, `test_identifier_rejects_negative` currently validates the string "0" (same as `test_identifier_rejects_zero`); change this to a true negative value (e.g. "-1") or consolidate the zero/negative cases to avoid duplicated/incorrect coverage.
- The parametrization of `_invalid_punctuation_tag` uses a `set`, which leads to non-deterministic test order and IDs; consider converting it to a sorted list before passing it to `pytest.mark.parametrize` for stable, reproducible parametrization.
## Individual Comments
### Comment 1
<location path="tests/types_test.py" line_range="23-25" />
<code_context>
+ _identifier.validate_strings("1.2")
+
+
+def test_identifier_rejects_negative() -> None:
+ with pytest.raises(ValidationError):
+ _identifier.validate_strings("0")
+
+
</code_context>
<issue_to_address>
**issue (testing):** The `test_identifier_rejects_negative` test currently checks `"0"`, making it duplicate `test_identifier_rejects_zero` and not actually testing negatives.
To match the test’s intent and avoid duplicating `test_identifier_rejects_zero`, update the value to an actual negative (e.g. `"-1"`) while keeping zero covered by the existing dedicated test.
</issue_to_address>
### Comment 2
<location path="tests/types_test.py" line_range="47-48" />
<code_context>
+ assert _tag_string.validate_strings(tag) == tag
+
+
+@pytest.mark.parametrize("tag", ["", "c" * 65, *_invalid_punctuation_tag])
+def test_tag_string_rejects_invalid(tag: str) -> None:
+ with pytest.raises(ValidationError):
+ _tag_string.validate_strings(tag)
</code_context>
<issue_to_address>
**suggestion (testing):** Add explicit invalid tag cases for whitespace (e.g. leading/trailing spaces) to mirror the previously tested edge cases.
The old API tests covered tags with whitespace (e.g. " a", "a b", "a\t"), but the new parametrization doesn’t. Since `string.punctuation` excludes whitespace, please add a few explicit whitespace-containing tags to maintain the previous validation coverage.
```suggestion
@pytest.mark.parametrize(
"tag",
[
"",
"c" * 65,
*_invalid_punctuation_tag,
" a",
"a ",
"a b",
"a\t",
],
)
def test_tag_string_rejects_invalid(tag: str) -> None:
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/types_test.py`:
- Around line 23-31: The test test_identifier_rejects_negative is incorrectly
reusing the zero input "0" instead of a negative value; update the assertion in
test_identifier_rejects_negative to call _identifier.validate_strings with a
negative string (e.g., "-1") so it raises ValidationError and provides
negative-boundary coverage for _identifier.validate_strings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dbca1d9d-ab09-48a8-9cdc-f8bc5eb577cf
📒 Files selected for processing (6)
src/routers/openml/datasets.pysrc/routers/openml/setups.pysrc/routers/openml/tasks.pysrc/routers/types.pytests/routers/openml/dataset_tag_test.pytests/types_test.py
💤 Files with no reviewable changes (1)
- tests/routers/openml/dataset_tag_test.py
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #327 +/- ##
==========================================
+ Coverage 94.40% 94.48% +0.07%
==========================================
Files 67 68 +1
Lines 3112 3157 +45
Branches 229 229
==========================================
+ Hits 2938 2983 +45
Misses 111 111
Partials 63 63 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
♻️ Duplicate comments (1)
tests/types_test.py (1)
23-30:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
test_identifier_rejects_negativenever tests a negative value and duplicatestest_identifier_rejects_zero.Both tests call
_identifier.validate_strings("0").test_identifier_rejects_negativeshould use a negative input (e.g."-1") to provide the coverage its name promises.🐛 Proposed fix
def test_identifier_rejects_negative() -> None: with pytest.raises(ValidationError): - _identifier.validate_strings("0") + _identifier.validate_strings("-1")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/types_test.py` around lines 23 - 30, The test named test_identifier_rejects_negative is incorrectly passing the string "0" and duplicating test_identifier_rejects_zero; update test_identifier_rejects_negative to call _identifier.validate_strings with a negative value such as "-1" so it actually asserts ValidationError for negative inputs and provides distinct coverage from test_identifier_rejects_zero.
🧹 Nitpick comments (4)
src/database/studies.py (1)
153-159: 💤 Low value
attach_tasksstill usesintforstudy_id/task_idswhileattach_task(line 118) was updated toIdentifier.✨ Proposed fix
async def attach_tasks( *, - study_id: int, - task_ids: list[int], + study_id: Identifier, + task_ids: list[Identifier], user: User, connection: AsyncConnection, ) -> None:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/database/studies.py` around lines 153 - 159, The attach_tasks function signature and internal types should use the same Identifier type as attach_task: change study_id: int to study_id: Identifier and task_ids: list[int] to task_ids: list[Identifier], update any internal variables/annotations that assume ints accordingly, and ensure Identifier is imported/available in this module (mirror how attach_task declares and uses Identifier).tests/types_test.py (1)
62-64: 💤 Low valueMinor: parametrize parameter
"string"shadows thestringstdlib import; assertion style is inconsistent with the TagString tests.Two small nits:
- Using
"string"as the parametrize label (and function parameter name) shadows theimport stringat line 1 — use"value"or"s"to avoid confusion.assert _casual_string.validate_strings(string)checks truthiness, whiletest_tag_string_accepts_valid(line 44) checks equality with== tag. Prefer== stringfor symmetry.✨ Proposed fix
-@pytest.mark.parametrize("string", ["a", "a" * 1000, "_-.(),"]) -def test_casual_string_accepts_valid(string: str) -> None: - assert _casual_string.validate_strings(string) +@pytest.mark.parametrize("value", ["a", "a" * 1000, "_-.(),"]) +def test_casual_string_accepts_valid(value: str) -> None: + assert _casual_string.validate_strings(value) == value -@pytest.mark.parametrize("string", ["", *_invalid_punctuation_casual_string]) -def test_casual_string_rejects_invalid(string: str) -> None: +@pytest.mark.parametrize("value", ["", *_invalid_punctuation_casual_string]) +def test_casual_string_rejects_invalid(value: str) -> None: with pytest.raises(ValidationError): - _casual_string.validate_strings(string) + _casual_string.validate_strings(value)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/types_test.py` around lines 62 - 64, The test test_casual_string_accepts_valid uses the param name "string" which shadows the stdlib import and asserts truthiness; rename the parametrize label and function parameter to something like "value" (or "s") and update the assertion to compare equality (assert _casual_string.validate_strings(value) == value) to match the style of test_tag_string_accepts_valid and avoid the shadowing; locate uses of _casual_string.validate_strings and the test function name to apply the change.src/database/datasets.py (1)
98-119: ⚡ Quick win
tagis not updated while the symmetricdelete_tag(line 71) was — parameters useint/strinstead ofIdentifier/TagString.✨ Proposed fix
-async def tag(id_: int, tag_: str, *, user_id: int, connection: AsyncConnection) -> None: +async def tag(id_: Identifier, tag_: TagString, *, user_id: Identifier, connection: AsyncConnection) -> None:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/database/datasets.py` around lines 98 - 119, The tag function is passing raw id_/tag_ types (int/str) but should accept/unwrap the domain types Identifier and TagString like delete_tag does; update the tag(...) implementation to accept Identifier and TagString (or convert them) and pass their primitive values into the SQL parameters (dataset_id, tag, user_id) so the INSERT uses the same unwrapped values/keys as delete_tag; locate the tag function and ensure you convert Identifier -> id value and TagString -> string before building the parameters and executing the query, preserving the same IntegrityError handling.src/database/tasks.py (1)
55-69: ⚡ Quick win
get_input_for_task_typestill usesintwhile the siblingget_task_typewas updated toIdentifier.Both functions accept a task-type ID (
ttid), so the annotation should be consistent.✨ Proposed fix
-async def get_input_for_task_type(task_type_id: int, expdb: AsyncConnection) -> Sequence[Row]: +async def get_input_for_task_type(task_type_id: Identifier, expdb: AsyncConnection) -> Sequence[Row]:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/database/tasks.py` around lines 55 - 69, The get_input_for_task_type function currently types the task_type_id parameter as int while the related get_task_type uses Identifier; update get_input_for_task_type to accept Identifier instead of int, adjust any type annotations/imports (e.g., ensure Identifier is imported where get_input_for_task_type is defined) and keep the SQL parameter key ("ttid") and parameters={"ttid": task_type_id} unchanged so runtime behavior is identical; ensure the return typing (Sequence[Row]) remains the same.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@tests/types_test.py`:
- Around line 23-30: The test named test_identifier_rejects_negative is
incorrectly passing the string "0" and duplicating test_identifier_rejects_zero;
update test_identifier_rejects_negative to call _identifier.validate_strings
with a negative value such as "-1" so it actually asserts ValidationError for
negative inputs and provides distinct coverage from
test_identifier_rejects_zero.
---
Nitpick comments:
In `@src/database/datasets.py`:
- Around line 98-119: The tag function is passing raw id_/tag_ types (int/str)
but should accept/unwrap the domain types Identifier and TagString like
delete_tag does; update the tag(...) implementation to accept Identifier and
TagString (or convert them) and pass their primitive values into the SQL
parameters (dataset_id, tag, user_id) so the INSERT uses the same unwrapped
values/keys as delete_tag; locate the tag function and ensure you convert
Identifier -> id value and TagString -> string before building the parameters
and executing the query, preserving the same IntegrityError handling.
In `@src/database/studies.py`:
- Around line 153-159: The attach_tasks function signature and internal types
should use the same Identifier type as attach_task: change study_id: int to
study_id: Identifier and task_ids: list[int] to task_ids: list[Identifier],
update any internal variables/annotations that assume ints accordingly, and
ensure Identifier is imported/available in this module (mirror how attach_task
declares and uses Identifier).
In `@src/database/tasks.py`:
- Around line 55-69: The get_input_for_task_type function currently types the
task_type_id parameter as int while the related get_task_type uses Identifier;
update get_input_for_task_type to accept Identifier instead of int, adjust any
type annotations/imports (e.g., ensure Identifier is imported where
get_input_for_task_type is defined) and keep the SQL parameter key ("ttid") and
parameters={"ttid": task_type_id} unchanged so runtime behavior is identical;
ensure the return typing (Sequence[Row]) remains the same.
In `@tests/types_test.py`:
- Around line 62-64: The test test_casual_string_accepts_valid uses the param
name "string" which shadows the stdlib import and asserts truthiness; rename the
parametrize label and function parameter to something like "value" (or "s") and
update the assertion to compare equality (assert
_casual_string.validate_strings(value) == value) to match the style of
test_tag_string_accepts_valid and avoid the shadowing; locate uses of
_casual_string.validate_strings and the test function name to apply the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1eecdf68-3c97-4ef9-b03b-4bfcfad36c0a
📒 Files selected for processing (10)
src/database/datasets.pysrc/database/flows.pysrc/database/qualities.pysrc/database/runs.pysrc/database/setups.pysrc/database/studies.pysrc/database/tasks.pysrc/database/users.pysrc/routers/types.pytests/types_test.py
✅ Files skipped from review due to trivial changes (2)
- src/database/runs.py
- src/database/qualities.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/routers/types.py
Instead of testing type constraints through API calls, we directly invoke the pydantic validation to test them.
This much faster and provides a clearer signal on failure.