
Encode state-level city metadata as "null" string #369

Open
leekahung wants to merge 2 commits into main from fix-metadata-null-city

@leekahung

Contributor

What type of PR is this? (check all applicable)

  • Bug Fix

Description

Vertex AI Search drops struct_data fields whose value is JSON null on import. State-level law docs were generated with {"city": null}, so they landed in the datastore with no city field at all. The retriever selects state-level docs with the filter city: ANY("null"), which matches the literal string "null", not a missing field, so queries for state-level statutes (ORS 90, 91, 105, etc.) returned zero results.

This PR emits the literal string "null" for state-level docs instead of JSON null, so the stored metadata matches the filter.
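
For reviewers, the encoding change can be sketched roughly as follows. This is a minimal illustration, not the generator's actual code: `build_struct_data`, `to_jsonl_line`, and `STATE_LEVEL_CITY` are hypothetical names, and the real generator may carry more fields than `city` and `state`.

```python
import json
from typing import Optional

# Literal string used for state-level docs (hypothetical constant name).
# It must be the string "null", not Python None / JSON null, because
# fields whose value is JSON null are dropped on import.
STATE_LEVEL_CITY = "null"


def build_struct_data(state: str, city: Optional[str]) -> dict:
    """Hypothetical helper: build the metadata dict for one law doc."""
    return {
        "city": city if city is not None else STATE_LEVEL_CITY,
        "state": state,
    }


def to_jsonl_line(state: str, city: Optional[str]) -> str:
    """Serialize one metadata entry as a single JSON line."""
    return json.dumps(build_struct_data(state, city))


if __name__ == "__main__":
    print(to_jsonl_line("or", None))        # state-level doc
    print(to_jsonl_line("or", "portland"))  # city-level doc
```

Keeping the sentinel in one named constant would make it easier to keep the generator and the retriever filter in agreement, though whether the codebase does this is not shown here.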

Note: this fixes the generator for future corpus builds. The currently-affected datastore (city-state-law-search-05082026-edition) was already repaired in place by patching its bucket metadata.jsonl and re-importing.
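
The in-place repair could have looked something like the sketch below. It is an assumption-heavy illustration, not the script that was actually run: `patch_metadata_lines` is a hypothetical function, and the `structData` key for each record's metadata is assumed (it is exposed as a parameter in case the file uses a different key). Downloading from and re-uploading to the bucket, and the re-import itself, are out of scope here.

```python
import json
from typing import Iterable, List


def patch_metadata_lines(
    lines: Iterable[str], struct_key: str = "structData"
) -> List[str]:
    """Rewrite metadata.jsonl lines so a null or missing city becomes "null".

    struct_key is an assumed name for the per-record metadata object.
    """
    patched = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        struct = record.get(struct_key)
        # .get() returns None both for JSON null and for an absent field.
        if isinstance(struct, dict) and struct.get("city") is None:
            struct["city"] = "null"
        patched.append(json.dumps(record))
    return patched


if __name__ == "__main__":
    sample = [
        '{"id": "a", "structData": {"city": null, "state": "or"}}',
        '{"id": "b", "structData": {"city": "portland", "state": "or"}}',
    ]
    for out in patch_metadata_lines(sample):
        print(out)
```

Records that already carry a city value pass through unchanged, so running the patch twice gives the same result as running it once.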

Related Tickets & Documents

QA Instructions, Screenshots, Recordings

Run cd backend && uv run pytest tests/test_generate_metadata_jsonl.py; all tests pass. State-level entries now serialize as {"city": "null", "state": "or"}.

Added/updated tests?

  • Yes

Documentation

  • If this PR changes the system architecture, Architecture.md has been updated

Vertex AI Search drops JSON null struct fields on import, so state-level
docs lost their city field and the city: ANY("null") retriever filter
matched nothing. Emit the literal string "null" instead.
@leekahung leekahung requested a review from yangm2 June 25, 2026 03:26
