perf: fast-path strict json imports#840
Merged
stephenamar-db merged 2 commits intoMay 11, 2026
Merged
Conversation
Motivation: Kube-prometheus imports large strict JSON files; parsing them through the full Jsonnet parser creates avoidable AST and materialization work. Modification: Add a strict .json import fast path shared by CachedResolver and Preloader, trim visitor work, build race-free inline-array objects for imported JSON, and reuse the no-offset position for imported literals. Result: Strict JSON imports keep Jsonnet fallback semantics for non-strict inputs while reducing parse and manifestation overhead for large imported JSON data.
Motivation: Preloaded importstr and importbin entries for the same path were separated in Preloader, but CachedImporter still keyed reads only by Path. Interpreter evaluation could then reuse a text resolved file for binary reads or the reverse. Modification: Key CachedImporter entries by both Path and binaryData, update the top-level source cache insertion to use the text key, and add interpreter-level regression coverage for both import orders. Result: Preloaded text and binary imports for the same path remain independent through normal Interpreter evaluation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
stephenamar-db
approved these changes
May 11, 2026
This was referenced May 12, 2026
stephenamar-db
pushed a commit
that referenced
this pull request
May 18, 2026
Motivation: PR #840 added a strict JSON import fast path, but cached file imports still decoded every small file to a UTF-8 `String` before strict `.json` parsing. Real-world Jsonnet projects such as kube-prometheus import many JSON files, so decoding bytes to text only to parse them as UTF-8 JSON adds avoidable allocation and CPU cost. Key Design Decision: Use `ujson.ByteArrayParser` for strict `.json` imports and keep small cached resolved files byte-backed until text is explicitly needed by `importstr` or Fastparse. This keeps the JSON path byte-native while preserving existing text behavior for Jsonnet source parsing. Modification: * `CachedResolvedFile` now caches small resolved files as `Array[Byte]`, lazily materializes text for `getParserInput` / `readString`, and returns cached bytes directly for `readRawBytes`. * `CachedResolver.parseJsonImport` now parses strict JSON imports with `ujson.ByteArrayParser.transform(content.readRawBytes(), visitor)`. * `PreloaderTests` now assert strict JSON preload does not fall back to Fastparse or decode via `readString`. Result: Output equality was preserved against upstream sjsonnet and source-built jrsonnet: * kube-prometheus realworld output: `7,506,029` bytes, byte-identical. * `large_string_template` guard output: `549,674` bytes, byte-identical. Benchmark Results: Native whole-process kube-prometheus, command run from `jrsonnet/tests/realworld` with `-J vendor`, `hyperfine --shell=none --warmup 5 --runs 50`: | order | upstream/master | this PR | jrsonnet | |---|---:|---:|---:| | forward | `143.351 +/- 2.556 ms` | `135.806 +/- 3.388 ms` | `91.752 +/- 7.863 ms` | | reverse | `145.082 +/- 2.988 ms` | `136.392 +/- 2.034 ms` | `91.365 +/- 1.419 ms` | Native guard, `bench/resources/cpp_suite/large_string_template.jsonnet`: | upstream/master | this PR | jrsonnet | |---:|---:|---:| | `11.036 +/- 0.807 ms` | `10.762 +/- 0.650 ms` | `5.204 +/- 0.511 ms` | JMH guard, `bench.runRegressions`: | benchmark | upstream/master | this PR | |---|---:|---:| | `cpp_suite/large_string_template.jsonnet` | `0.757 ms/op` | `0.691 ms/op` | | `go_suite/manifestJsonEx.jsonnet` | `0.055 ms/op` | `0.054 ms/op` | Analysis: The target workload improves by about 5-6% in both command orders. The remaining gap to jrsonnet is not eliminated, but this change removes one clear extra decode from strict JSON import handling without regressing the non-target renderer guard or JMH guards. Validation: * `./mill --no-server --ticker false --color false -j 1 __.checkFormat` * `./mill --no-server --ticker false --color false -j 1 __.test` (`Tests: 440, Passed: 440, Failed: 0`) References: Follow-up to #840
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation:
Real-world workloads such as kube-prometheus import large strict JSON files. Parsing those files through the full Jsonnet parser creates avoidable parser, AST, and manifestation work.
Key Design Decision:
The fast path only accepts strict
.jsoninput and falls back to normal Jsonnet parsing for malformed JSON, duplicate keys, non-finite numbers, parser-depth overflow, and defensive numeric parse failures. Imported JSON objects use a race-free inline-arrayVal.Objlayout: field caching is disabled for parse-cache-shared literals, and single-field objects avoid lazyvalue0mutation. JSON value positions reusefileScope.noOffsetPosbecause strict JSON imports do not execute code and per-element positions are not consumed for stack traces.Modification:
Add a shared strict JSON import visitor in
CachedResolver, wire it intoPreloader, trim duplicate-key/string visitor work, use inline object layout for imported JSON objects, and add shared-cache/concurrency regression tests.Benchmark Results:
entry-kube-prometheus.jsonnet -J vendor)223.351 +/- 12.742 ms139.041 +/- 2.847 ms-37.75%~235 msstarting stack135.9 +/- 1.2 ms~ -42%cumulative88.087 +/- 1.737 ms, remaining gap1.54x.manifestJsonEx 0.053,realistic2 39.485,large_string_template 1.102,gen_big_object 0.801Analysis:
The PR keeps Jsonnet compatibility by treating the JSON parser as an optional strict fast path, not a replacement parser. The added JVM-only concurrency tests exercise shared parse-cache reuse; cross-platform JSON import tests cover the platform-neutral behavior.
References:
Source exploration commits: He-Pin/sjsonnet
995b0822,114bd69f,52e22696,ca6f24d5.Result:
Local
./mill --no-server -j 1 __.reformatand./mill --no-server -j 1 __.testpassed on this split branch (2066/2066).