Goldilocks by TomWambsgans · Pull Request #210 · leanEthereum/leanVM

TomWambsgans · 2026-05-04T14:02:29Z

No description provided.

Co-authored-by: Copilot <copilot@github.com>

Bring main's MTU-XMSS structure (tweak table, public_param, T-Sponge with replacement) into the goldilocks branch with all poseidon-related sizes halved:

  field-element widths   main (KoalaBear)   goldilocks
  --------------------   ----------------   ----------
  TWEAK_LEN              2                  1
  XMSS_DIGEST_LEN        4                  2
  RANDOMNESS_LEN_FE      6                  3
  MESSAGE_LEN_FE         8                  4
  PUBLIC_PARAM_LEN_FE    4                  2
  POSEIDON1_WIDTH        16                 8
  DIGEST_LEN_FE          8                  4

Tweak table slots are 2 FE (1 actual tweak FE + 1 zero pad). The packed tweak fits in a single 64-bit Goldilocks element via `(tweak_type << 42) | (sub_position << 32) | index`.

Port main's poseidon precompile features (`half_output`, `hardcoded_offset_left`) from Poseidon16 to Poseidon8, with new committed columns for the flags and `effective_index_left_first/second`. The half-output trace tail values are filled in a post-pass from `memory_padded` (lookup-only: the AIR doesn't constrain them).

Encoding decomposition uses the goldilocks-proven 21 chunks of W=3 bits per FE with a factored 1-bit canonical check `(diff)·(diff − 2^63) == 0`, applied to the first 2 of 4 output FE for exactly V = 42 chunks (no V_GRINDING).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
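The tweak packing formula in the commit message above can be sketched as follows. This is a minimal illustration, not the PR's code: the function names `pack_tweak` and `unpack_tweak` are hypothetical, the 32-bit `index` and 10-bit `sub_position` widths are inferred from the shift amounts, and the bound on `tweak_type` plus the canonicity check against the Goldilocks prime are assumptions added here.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1.
const GOLDILOCKS_P: u64 = 0xFFFF_FFFF_0000_0001;

// Field widths inferred from the shifts `<< 42` and `<< 32` (assumption).
const INDEX_BITS: u32 = 32;
const SUB_POSITION_BITS: u32 = 10; // 42 - 32
const TWEAK_TYPE_SHIFT: u32 = 42;

/// Packs the three tweak components into one 64-bit value.
/// Returns `None` if a component overflows its bit range or if the packed
/// value is not a canonical Goldilocks element (i.e. is >= p).
fn pack_tweak(tweak_type: u64, sub_position: u64, index: u64) -> Option<u64> {
    if index >= (1u64 << INDEX_BITS) || sub_position >= (1u64 << SUB_POSITION_BITS) {
        return None;
    }
    // tweak_type sits in bits 42..64; reject values that would be shifted out.
    if tweak_type >= (1u64 << (64 - TWEAK_TYPE_SHIFT)) {
        return None;
    }
    let packed = (tweak_type << TWEAK_TYPE_SHIFT) | (sub_position << INDEX_BITS) | index;
    if packed < GOLDILOCKS_P {
        Some(packed)
    } else {
        None
    }
}

/// Inverse of `pack_tweak`: returns (tweak_type, sub_position, index).
fn unpack_tweak(packed: u64) -> (u64, u64, u64) {
    let index = packed & ((1u64 << INDEX_BITS) - 1);
    let sub_position = (packed >> INDEX_BITS) & ((1u64 << SUB_POSITION_BITS) - 1);
    let tweak_type = packed >> TWEAK_TYPE_SHIFT;
    (tweak_type, sub_position, index)
}
```

For instance, `pack_tweak(1, 2, 3)` gives `Some(4406636445699)`, which is `2^42 + 2·2^32 + 3`, and `unpack_tweak` recovers `(1, 2, 3)` from it.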

Brings main into the goldilocks branch. The bulk of the work was porting main's PR #223 (duplex-sponge Fiat-Shamir) to the Goldilocks field, since goldilocks never adopted it.

Conflict resolutions of note:
- AIR trait: kept main's `n_shift_columns` / shift-columns-first layout; dropped the `low_degree` feature (goldilocks removed it; the Goldilocks poseidon8 AIR uses direct x^7 constraints, not `low_degree_block`).
- extension_op/air.rs: cubic (DIM=3) layout reordered shift-columns-first.
- Duplex Challenger ported to Goldilocks (WIDTH=8, RATE=4, CAPACITY=4); added a `Permutation` trait to the `symetric` crate.
- New `poseidon8_permute` precompile: AIR (flag_permute column, outputs_left/right, mutex constraints), trace gen, ISA, simplifier.
- Duplex `fiat_shamir.py` rewritten for DIGEST_LEN=4.
- poseidon8 MAX_LOG_N_ROWS lowered 21 -> 20: the permute variant widened the table by 5 columns, which would otherwise exceed the WHIR commitment surface cap.

cargo fmt + clippy clean; full `cargo test --workspace` passes; `recursion --n 2` aggregation runs end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Conflicts resolved by adopting main's BusInteraction refactor (PR #228) while keeping Goldilocks-specific bits:
- Ported poseidon_8/mod.rs to the new bus_interactions() API (renamed COL_FLAG→COL_MULTIPLICITY, COL_PRECOMPILE_DATA→COL_DOMAINSEP, POSEIDON_PRECOMPILE_DATA(=1)→POSEIDON_DOMAINSEP_BASE(=3); merged lookups() + bus() into bus_interactions()).
- table_enum: kept Table::poseidon8(), adopted MAX_BUS_WIDTH/LOG_MAX_BUS_WIDTH.
- extension_op air.rs: virtual cols at indices 21,22 (DIMENSION=3) with new names COL_MULTIPLICITY/COL_DOMAINSEP_EXTENSION_OP; switched to eval_bus_virtual.
- verify_execution: kept *get_poseidon8(), added MAX_BYTECODE_LOG_SIZE check.
- Renamed poseidon_16 references → poseidon_8 in test (prove_poseidon_8.rs).
- Removed the spurious poseidon_16/ directory left in the tree.
- Adjusted recursion.py to use copy_ef instead of copy_5 (DIM=3).

# Conflicts:
#   crates/lean_vm/src/tables/extension_op/air.rs
#   crates/lean_vm/src/tables/poseidon_16/mod.rs

Adapt the always-IV slice-hashing scheme (length absorbed into the IV for domain separation) to the Goldilocks Poseidon8 permutation (width 8, rate 4, digest 4) instead of KoalaBear Poseidon16 (width 16, rate 8, digest 8).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts:
#   crates/lean_compiler/snark_lib.py
#   crates/lean_compiler/tests/test_compiler.rs
#   crates/lean_compiler/tests/test_data/program_166.py
#   crates/lean_compiler/zkDSL.md
#   crates/rec_aggregation/zkdsl_implem/hashing.py
#   crates/rec_aggregation/zkdsl_implem/main.py

# Conflicts:
#   crates/lean_prover/src/lib.rs
#   crates/lean_prover/src/test_zkvm.rs

# Conflicts:
#   crates/backend/fiat-shamir/src/challenger.rs
#   crates/backend/fiat-shamir/tests/grinding.rs
#   crates/backend/sumcheck/src/product_computation.rs
#   crates/lean_prover/src/verify_execution.rs
#   crates/rec_aggregation/src/bytecode_claims.rs
#   crates/rec_aggregation/src/type_2_aggregation.rs
#   crates/rec_aggregation/zkdsl_implem/fiat_shamir.py
#   crates/rec_aggregation/zkdsl_implem/main.py
#   crates/sub_protocols/src/quotient_gkr/mod.rs
#   crates/utils/src/wrappers.rs
#   crates/whir/tests/run_whir.rs

Adapt main's column/flag renames (e.g. POSEIDON_*COL_INDEX_INPUT_LEFT -> POSEIDON_*COL_NU_A, EXT_OP_FLAG_MUL -> EXT_OP_FLAG_DOT_PRODUCT, ExtensionOp::PolyEq -> Eq, COL_COMP -> COL_ACC, etc.) to the goldilocks-specific code that uses Poseidon8 and cubic (DIMENSION=3) extension.

Drop the KoalaBear-targeted python verifier and its check_whir_configs test, which don't apply to the goldilocks branch (folding_pow_bits was removed in goldilocks; WHIR_CONFIGS and Fp primitives are KoalaBear-specific).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adopt main's overwrite (permutation-based) sponge hashing on the Goldilocks branch, keeping Goldilocks field types throughout (WIDTH 8, RATE 4, DIGEST 4, poseidon8, cubic extension).

Key reconciliations:
- utils/symetric/whir/fiat-shamir: overwrite sponge (hash_slice_rtl, precompute_zero_suffix_state), poseidon_hash_slice, two-perm merkle_verify.
- Poseidon table: kept the Goldilocks x^7 permutation AIR (sparse partial rounds) but adopted main's I/O interface, halved: 3-way output gating via flag_out2/flag_out4, added permute_half, unified Davies-Meyer output gates. New precompile set: poseidon8_compress_half/_quarter (+_hardcoded_left), poseidon8_permute/_permute_half (+_hardcoded_left).
- Compiler, instruction encoder/display, prover trace post-pass updated to the new flags and names.
- zkDSL verifier (hashing.py, main.py, xmss_aggregate.py) and XMSS signer (wots.rs) switched to the overwrite sponge; encoding decomposition and gl-specific constants (copy_ef/copy_digest, NUM_ENCODING_FE, strides) kept.
- Dropped python-verifier (removed on goldilocks).

Validated: workspace builds, fmt+clippy clean, poseidon AIR proves/verifies, compiler + lean_prover + xmss + sub_protocols tests pass, aggregation bytecode compiles, and end-to-end recursive XMSS aggregation proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
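To make the "overwrite sponge" idea concrete, here is a generic sketch at the sizes quoted above (WIDTH 8, RATE 4, DIGEST 4). Several things in it are assumptions rather than the PR's code: `placeholder_permutation` is a toy stand-in and is neither Poseidon8 nor secure, the function names are hypothetical, blocks are absorbed left to right (the PR's `hash_slice_rtl` may order them differently), and storing the input length in the first capacity slot is borrowed from the earlier always-IV commit purely for illustration.

```rust
const P: u128 = 0xFFFF_FFFF_0000_0001; // Goldilocks prime
const WIDTH: usize = 8;
const RATE: usize = 4;
const DIGEST: usize = 4;

fn add(a: u64, b: u64) -> u64 { ((a as u128 + b as u128) % P) as u64 }
fn mul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P) as u64 }
fn pow7(x: u64) -> u64 {
    let x2 = mul(x, x);
    let x4 = mul(x2, x2);
    mul(mul(x4, x2), x)
}

/// Toy stand-in for the real permutation (NOT Poseidon8, not secure).
fn placeholder_permutation(state: &mut [u64; WIDTH]) {
    for round in 0..4u64 {
        for i in 0..WIDTH {
            let rc = round * (WIDTH as u64) + i as u64 + 1; // made-up round constant
            state[i] = pow7(add(state[i], rc));
        }
        for i in 0..WIDTH {
            state[i] = add(state[i], state[(i + 1) % WIDTH]); // made-up mixing
        }
    }
}

/// Overwrite-mode absorb: the rate part is replaced by the block
/// (not added to it); the capacity part is left untouched.
fn absorb_overwrite(state: &mut [u64; WIDTH], block: &[u64]) {
    assert_eq!(block.len(), RATE);
    state[..RATE].copy_from_slice(block);
}

/// Hashes an input whose length is a non-zero multiple of RATE.
fn hash_slice_overwrite(input: &[u64]) -> Option<[u64; DIGEST]> {
    if input.is_empty() || input.len() % RATE != 0 {
        return None;
    }
    let mut state = [0u64; WIDTH];
    state[RATE] = input.len() as u64; // hypothetical IV: length in the capacity
    for chunk in input.chunks_exact(RATE) {
        absorb_overwrite(&mut state, chunk);
        placeholder_permutation(&mut state);
    }
    let mut digest = [0u64; DIGEST];
    digest.copy_from_slice(&state[..DIGEST]);
    Some(digest)
}
```

The point of overwriting instead of adding is that each absorb step needs no field additions on the rate part, so one block costs exactly one permutation call.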

Adopt main's rec_aggregation rename (type 1/2 -> single/multi-message) and the new bytecode-claim Fiat-Shamir handling, keeping Goldilocks field types/sizes.

Reconciliations:
- bytecode_claims.rs: adopt main's direct claim ingestion into Fiat-Shamir (build_bytecode_claims_ingested_by_fiatshamir + observe_scalars, dropping hash_bytecode_claims), but keep Goldilocks poseidon8 (get_poseidon8).
- multi_message_aggregation.rs: adopt build_multi_message_input_data name, keep DIGEST_LEN-generic layout comment.
- zkdsl main.py: adopt single/multi-message naming and main's direct-ingestion reduce_bytecode_claims, but keep Goldilocks DIGEST_LEN-generic copy_digest loops (instead of main's hardcoded copy_8/copy_32) and SINGLE_MESSAGE flag placeholders provided by compilation.rs.
- zkdsl hashing.py: slice_hash_continue uses poseidon8_permute_half (not the KoalaBear poseidon16 variant that auto-merged in).

Validated: workspace builds, fmt+clippy clean, cargo testall passes, and end-to-end recursion (n=2) proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…idon1-8

Implemented and measured the Appendix-B sparse partial-round decomposition (the same one the AIR/trace-gen and the KoalaBear-16 permutation use) in the AVX-512 permutation. It is ~13% slower for Goldilocks: this circulant MDS has tiny entries {1,3,4,7,8,9} that strength-reduce to shifts/adds and batch 8 terms into a single reduce128 per output, while the sparse form needs arbitrary-constant 64x64 multiplies (one reduce128 each → 15 vs 8 reductions per partial round). Reverted the implementation, kept a comment so the dead end isn't re-explored.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
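The "batch 8 terms into a single reduce128 per output" argument can be illustrated with a scalar sketch. It is not the AVX-512 code from the PR. `FIRST_ROW` is a hypothetical circulant row chosen only so that its entries come from the set {1,3,4,7,8,9} named above, the circulant indexing convention is assumed, and `% P` stands in for a dedicated reduce128 routine.

```rust
const P: u128 = 0xFFFF_FFFF_0000_0001; // Goldilocks prime

/// Hypothetical first row of the circulant matrix (entries from {1,3,4,7,8,9}).
const FIRST_ROW: [u64; 8] = [7, 1, 3, 8, 8, 3, 4, 9];

/// Multiplies by a small constant using shifts and adds only.
/// The result is left unreduced in a u128.
fn mul_small(x: u64, c: u64) -> u128 {
    let x = x as u128;
    match c {
        1 => x,
        3 => (x << 1) + x,
        4 => x << 2,
        7 => (x << 3) - x,
        8 => x << 3,
        9 => (x << 3) + x,
        other => x * other as u128, // fallback for constants outside the set
    }
}

/// Circulant matrix-vector product with ONE modular reduction per output.
/// Assumed convention: M[i][j] = FIRST_ROW[(j - i) mod 8].
/// The 8 unreduced terms sum to less than 2^71, so a u128 cannot overflow.
fn circulant_mds_batched(state: &[u64; 8]) -> [u64; 8] {
    let mut out = [0u64; 8];
    for i in 0..8 {
        let mut acc: u128 = 0;
        for j in 0..8 {
            acc += mul_small(state[j], FIRST_ROW[(8 + j - i) % 8]);
        }
        out[i] = (acc % P) as u64; // single reduction for all 8 terms
    }
    out
}

/// Reference version: a full multiply and a reduction for every term.
fn circulant_mds_naive(state: &[u64; 8]) -> [u64; 8] {
    let mut out = [0u64; 8];
    for i in 0..8 {
        let mut acc: u128 = 0;
        for j in 0..8 {
            let term = (state[j] as u128 * FIRST_ROW[(8 + j - i) % 8] as u128) % P;
            acc = (acc + term) % P;
        }
        out[i] = acc as u64;
    }
    out
}
```

With arbitrary 64-bit constants, as in the sparse form, each product already fills 128 bits, so terms cannot be summed before reducing. That is the source of the higher reduction count the commit message reports.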

cubic_mul_generic is the hottest field op in the prover after the poseidon permutation (~15% of an xmss prove, via sumcheck eq-eval and the poseidon AIR constraint eval). On Goldilocks each multiply carries a 128->64-bit reduction (the dominant cost), so 3-term Karatsuba trading 3 of the 9 multiplies for cheap field adds/subs is a net win across all packed backends.

Measured: xmss --n-signatures 1550 --log-inv-rate 1 goes 392-394 -> 400-402 XMSS/s (~2%). Verified against the schoolbook reference (10k scalar + 2k packed random inputs) and end-to-end recursion still proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
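The 9-to-6 multiply trade described above can be sketched on the unreduced polynomial product. This is a scalar illustration, not the PR's `cubic_mul_generic`. The function names are hypothetical, `% P` stands in for the real 128->64-bit reduction, and the final folding of the degree-3 and degree-4 coefficients by the extension's defining polynomial is left out because the source does not state that polynomial.

```rust
const P: u128 = 0xFFFF_FFFF_0000_0001; // Goldilocks prime

fn add(a: u64, b: u64) -> u64 { ((a as u128 + b as u128) % P) as u64 }
fn sub(a: u64, b: u64) -> u64 { ((a as u128 + P - (b as u128 % P)) % P) as u64 }
fn mul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P) as u64 }

/// Schoolbook product of two degree-2 polynomials: 9 field multiplies.
fn poly_mul_schoolbook(a: [u64; 3], b: [u64; 3]) -> [u64; 5] {
    let mut c = [0u64; 5];
    for i in 0..3 {
        for j in 0..3 {
            c[i + j] = add(c[i + j], mul(a[i], b[j]));
        }
    }
    c
}

/// 3-term Karatsuba: 6 field multiplies plus extra adds/subs.
fn poly_mul_karatsuba(a: [u64; 3], b: [u64; 3]) -> [u64; 5] {
    // Three "diagonal" products.
    let d0 = mul(a[0], b[0]);
    let d1 = mul(a[1], b[1]);
    let d2 = mul(a[2], b[2]);
    // Three products of pairwise sums.
    let d01 = mul(add(a[0], a[1]), add(b[0], b[1]));
    let d02 = mul(add(a[0], a[2]), add(b[0], b[2]));
    let d12 = mul(add(a[1], a[2]), add(b[1], b[2]));
    [
        d0,                              // c0 = a0*b0
        sub(sub(d01, d0), d1),           // c1 = a0*b1 + a1*b0
        add(sub(sub(d02, d0), d2), d1),  // c2 = a0*b2 + a1*b1 + a2*b0
        sub(sub(d12, d1), d2),           // c3 = a1*b2 + a2*b1
        d2,                              // c4 = a2*b2
    ]
}
```

As a small check, `(1 + 2x + 3x^2)(4 + 5x + 6x^2)` comes out as `[4, 13, 28, 27, 18]` with both functions.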

TomWambsgans and others added 30 commits April 15, 2026 18:18

reduce degree AIR poseidon

3714bb6

wip

68e4e4c

wip

b9f7c21

test_plonky3_compatibility

bb7be6f

wip

82c624e

wip

d1c525f

degree 7 air (instead of 3) for poseidon

beaf0d6

w

c7448bc

wip

4b78c6e

wip

ae2401d

w

7771454

w

ff61a47

wip

ec26c52

w

4d91224

w

1daffe2

Merge branch 'main' into goldilocks

abc56ce

Co-authored-by: Copilot <copilot@github.com>

w

a635928

Co-authored-by: Copilot <copilot@github.com>

fix

ec4acbd

Merge remote-tracking branch 'origin/goldilocks' into goldilocks

26e06b9

low level optis

1663a9e

w

84f208b

w

80b3a98

2x faster poseidon

89a2dc5

much faster poseidn on avx512

6efc061

Merge commit 'a6f398eb3841acc74e424b788c0c50fd64df26f5' into goldilocks

7baaf62

w

c308fb6

better encoding

0470d7a

clippy

4c1209a

f

086ab06

TomWambsgans force-pushed the goldilocks branch from 3aea9eb to dcef38d Compare May 14, 2026 06:08

TomWambsgans force-pushed the main branch from b01b199 to 0295672 Compare May 19, 2026 17:38

TomWambsgans force-pushed the goldilocks branch from 48a27e6 to 357c947 Compare May 19, 2026 17:41

log_size_guess = 19

e3432ae

TomWambsgans force-pushed the goldilocks branch from 357c947 to e3432ae Compare May 19, 2026 17:42

TomWambsgans force-pushed the main branch from 0295672 to 13408cc Compare May 19, 2026 17:59

TomWambsgans and others added 7 commits May 21, 2026 15:03

Merge remote-tracking branch 'origin/main' into goldilocks

d62254c

# Conflicts:
#   crates/lean_vm/src/tables/extension_op/air.rs
#   crates/lean_vm/src/tables/poseidon_16/mod.rs

wip

78becaf

python formating / file renaming

59d9d25

Merge remote-tracking branch 'origin/main' into goldilocks

d1476e2

Merge remote-tracking branch 'origin/HEAD' into goldilocks

961d2d6

TomWambsgans force-pushed the main branch from eacd019 to 9b2f632 Compare May 25, 2026 00:11

TomWambsgans and others added 5 commits May 26, 2026 01:46

Merge remote-tracking branch 'origin/main' into goldilocks

23537b0

# Conflicts:
#   crates/lean_prover/src/lib.rs
#   crates/lean_prover/src/test_zkvm.rs

clippy

dd0f092

TomWambsgans force-pushed the main branch 2 times, most recently from c5a3050 to 9dc5d68 Compare May 28, 2026 12:02

TomWambsgans and others added 8 commits May 29, 2026 01:06

Merge remote-tracking branch 'origin/main' into goldilocks

0feb0a5

recursion program: improve decompose_and_verify_merkle_query

5cf7027

Merge branch 'main' into goldilocks

85c4791

fix SIMD on permutation

fed15c5

TomWambsgans wants to merge 67 commits into main from goldilocks
