[VPEX][3/8] Add local-env constraint fetch with offline cache#5826
Open
rugpanov wants to merge 5 commits into
Open
[VPEX][3/8] Add local-env constraint fetch with offline cache#5826rugpanov wants to merge 5 commits into
rugpanov wants to merge 5 commits into
Conversation
Contributor
Waiting for approvalCould not determine reviewers from git history. Eligible reviewers: Suggestions based on git history. See OWNERS for ownership rules. |
This was referenced Jul 3, 2026
Collaborator
Integration test reportCommit: 79890ca
21 interesting tests: 14 SKIP, 5 RECOVERED, 2 flaky
Top 5 slowest tests (at least 2 minutes):
|
5fdb8b6 to
2092138
Compare
d867d1f to
64d8996
Compare
2092138 to
22d1137
Compare
64d8996 to
3fd0bfe
Compare
3fd0bfe to
ce01647
Compare
8d309ae to
d9b9c50
Compare
797c4ac to
03b4a2b
Compare
d9b9c50 to
77ac2c8
Compare
03b4a2b to
7eaf270
Compare
Third in the stacked local-env series. constraints.go fetches the per-environment pyproject.toml for a resolved env key and parses out requires-python, the databricks-connect pin, and the [tool.uv] constraint-dependencies. It caches each fetch on disk and classifies failures: a 404 is E_ENV_UNSUPPORTED (a resolvable target with no published environment, no cache fallback), while a transport or non-404 HTTP failure is E_FETCH and falls back to the last-good cached copy. The fetched body is validated by parseConstraints before it is written to the cache, so a malformed 2xx response cannot poison the cache and break a later transport-failure run that would otherwise serve the bad copy. Depends on the foundation PR for NewError and the E_FETCH / E_ENV_UNSUPPORTED codes. Still dormant. Co-authored-by: Isaac
…ache, exact dep match Review of the constraint-fetch layer surfaced several defects, all fixed here: - parseConstraints accepted valid-but-empty TOML: a 200 body with no [project].requires-python returned an empty result that would be cached and only fail confusingly later. It now errors when requires-python is absent. - The cache write assumed cacheDir already existed. On a fresh machine the first fetch succeeded but os.WriteFile failed (no such directory), so the cache never populated and offline runs got E_FETCH. writeCacheAtomic now MkdirAll's the parent. - The cache write was non-atomic (os.WriteFile truncates in place), so a concurrent transport-failure reader could observe a partial file. writeCacheAtomic writes a temp file and renames it into place. - databricks-connect detection used a bare string prefix, so a sibling like "databricks-connectors==1.0" matched first and was returned instead of the real pin. isDatabricksConnectDep now requires a package-name boundary. - sanitizeEnvKey collapsed only "/", leaving a Windows "\" to be treated as a separator by filepath.Join; it now collapses both. Co-authored-by: Isaac
Round-2 review noted that mapping an env key to a cache filename by replacing "/" with "__" was not injective: distinct keys like "a/b" and "a__b" collided on the same file, so a cached copy for one environment could be served for another on a transport-failure fallback. cacheFileName now appends a short sha256 of the raw env key to the readable slug, guaranteeing distinct keys get distinct files (env keys are internally generated and never actually collide today, but the cache should not depend on that). Co-authored-by: Isaac
Round-3 review noted isDatabricksConnectDep was case-sensitive, but Python package names are case-insensitive (PEP 503): a valid entry like "Databricks-Connect==16.4.0" went undetected, leaving the pin empty. The match now lowercases the entry before comparing; the caller still stores the original casing. Co-authored-by: Isaac
Round-4 review found isDatabricksConnectDep matched only case, missing PEP 503 name equivalence: "databricks_connect" and "databricks.connect" (and other whitespace) were not recognized. The check now extracts the leading package name up to the first PEP 508 delimiter and compares it under PEP 503 normalization (lowercase, collapse runs of -, _, . to a single -), so all spellings match while a distinct package like databricks-connectors does not. Co-authored-by: Isaac
77ac2c8 to
436dabf
Compare
7eaf270 to
79890ca
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context
PR 3 of 8 in the stacked
databricks local-env python syncseries (see #5823 for the full plan). Stacked on #5824 — review #5823 and #5824 first; this PR's diff is onlyconstraints.go+ its test.What this PR contains
constraints.go— constraint fetch with an offline cache. Fetches the per-environmentpyproject.tomlfor a resolved env key and parses outrequires-python, thedatabricks-connectpin, and the[tool.uv]constraint-dependencies.Failure classification:
E_ENV_UNSUPPORTED(a resolvable target with no published environment; no cache fallback, since a missing key is a distinct non-transient condition).E_FETCH, falling back to the last-good cached copy.The fetched body is validated by
parseConstraintsbefore it is written to the cache, so a malformed 2xx response cannot poison the cache and break a later transport-failure run that would otherwise serve the bad copy.Dependencies & surface
Depends on the foundation PR (#5823) for
NewErrorand theE_FETCH/E_ENV_UNSUPPORTEDcodes. Still dormant — nothing imports the package yet.This pull request and its description were written by Isaac.