Goldilocks#210

Open
TomWambsgans wants to merge 67 commits into main from goldilocks

@TomWambsgans
Collaborator

No description provided.

TomWambsgans and others added 30 commits April 15, 2026 18:18
Co-authored-by: Copilot <copilot@github.com>
w
Co-authored-by: Copilot <copilot@github.com>
Bring main's MTU-XMSS structure (tweak table, public_param, T-Sponge with
replacement) into the goldilocks branch with all poseidon-related sizes
halved:

  field-element widths    main (KoalaBear)   goldilocks
  ------------------    -----------------   ----------
  TWEAK_LEN                 2                 1
  XMSS_DIGEST_LEN           4                 2
  RANDOMNESS_LEN_FE         6                 3
  MESSAGE_LEN_FE            8                 4
  PUBLIC_PARAM_LEN_FE       4                 2
  POSEIDON1_WIDTH          16                 8
  DIGEST_LEN_FE             8                 4

Tweak table slots are 2 FE (1 actual tweak FE + 1 zero pad). The packed
tweak fits in a single 64-bit Goldilocks element via
`(tweak_type << 42) | (sub_position << 32) | index`.
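
The packing above can be sketched as follows. This is a minimal illustration, not the branch's code: the function names are hypothetical, and the field widths (32-bit `index`, 10-bit `sub_position`, at most 22 bits of `tweak_type`) are assumptions inferred from the shift amounts rather than stated in the commit.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1.
const GOLDILOCKS_P: u64 = 0xFFFF_FFFF_0000_0001;

/// Pack a tweak into one canonical Goldilocks element using the layout
/// `(tweak_type << 42) | (sub_position << 32) | index`.
/// Assumed widths (not stated in the commit): index < 2^32,
/// sub_position < 2^10, tweak_type < 2^22.
/// Returns None if a field overflows its slot or the result is not below p.
pub fn pack_tweak(tweak_type: u64, sub_position: u64, index: u64) -> Option<u64> {
    if index >= 1u64 << 32 || sub_position >= 1u64 << 10 || tweak_type >= 1u64 << 22 {
        return None;
    }
    let packed = (tweak_type << 42) | (sub_position << 32) | index;
    // Reject the few bit patterns in [p, 2^64) that are not canonical.
    (packed < GOLDILOCKS_P).then_some(packed)
}

/// Inverse of `pack_tweak`: split a packed tweak into
/// (tweak_type, sub_position, index).
pub fn unpack_tweak(packed: u64) -> (u64, u64, u64) {
    (packed >> 42, (packed >> 32) & 0x3FF, packed & 0xFFFF_FFFF)
}
```

For example, `pack_tweak(1, 2, 3)` yields `Some((1 << 42) | (2 << 32) | 3)`, and `unpack_tweak` recovers `(1, 2, 3)` from it.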

Port main's poseidon precompile features (`half_output`,
`hardcoded_offset_left`) from Poseidon16 to Poseidon8, with new committed
columns for the flags and `effective_index_left_first/second`. The
half-output trace tail values are filled in a post-pass from
`memory_padded` (lookup-only — the AIR doesn't constrain them).

Encoding decomposition uses the goldilocks-proven 21 chunks of W=3 bits
per FE with a factored 1-bit canonical check
`(diff)·(diff − 2^63) == 0`, applied to the first 2 of 4 output FE for
exactly V = 42 chunks (no V_GRINDING).
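
One plausible reading of the check above: 21 chunks of 3 bits cover 63 bits, so the residual `diff` between a field element and its recomposed chunks is either 0 or 2^63, which the product constraint enforces. The sketch below works on plain `u64` values outside any AIR; the helper names are hypothetical, and the interpretation of `diff` is an assumption rather than something the commit spells out.

```rust
const P: u64 = 0xFFFF_FFFF_0000_0001; // Goldilocks prime 2^64 - 2^32 + 1
const W: u32 = 3; // bits per chunk
const CHUNKS: usize = 21; // 21 * 3 = 63 bits

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

fn sub(a: u64, b: u64) -> u64 {
    ((a as u128 + P as u128 - b as u128) % P as u128) as u64
}

/// Split a canonical element (< P) into 21 three-bit chunks and return them
/// together with diff = fe - sum(chunk_i * 2^(3i)) computed in the field.
pub fn decompose(fe: u64) -> ([u8; CHUNKS], u64) {
    assert!(fe < P, "input must be canonical");
    let mut chunks = [0u8; CHUNKS];
    let mut low = 0u64;
    for (i, chunk) in chunks.iter_mut().enumerate() {
        let shift = W * i as u32;
        let c = (fe >> shift) & 0b111;
        *chunk = c as u8;
        low |= c << shift;
    }
    (chunks, sub(fe, low))
}

/// The factored 1-bit check: holds exactly when diff is 0 or 2^63.
pub fn top_bit_check(diff: u64) -> bool {
    mul(diff, sub(diff, 1u64 << 63)) == 0
}
```

Applying `decompose` to the first 2 of the 4 output elements would give the 2 × 21 = 42 chunks mentioned above.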

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Brings main into the goldilocks branch. The bulk of the work was porting
main's PR #223 (duplex-sponge Fiat-Shamir) to the Goldilocks field, since
goldilocks never adopted it.

Conflict resolutions of note:
- AIR trait: kept main's `n_shift_columns` / shift-columns-first layout;
  dropped the `low_degree` feature (goldilocks removed it — the Goldilocks
  poseidon8 AIR uses direct x^7 constraints, not `low_degree_block`).
- extension_op/air.rs: cubic (DIM=3) layout reordered shift-columns-first.
- Duplex Challenger ported to Goldilocks (WIDTH=8, RATE=4, CAPACITY=4);
  added a `Permutation` trait to the `symetric` crate.
- New `poseidon8_permute` precompile: AIR (flag_permute column,
  outputs_left/right, mutex constraints), trace gen, ISA, simplifier.
- Duplex `fiat_shamir.py` rewritten for DIGEST_LEN=4.
- poseidon8 MAX_LOG_N_ROWS lowered 21 -> 20: the permute variant widened
  the table by 5 columns, which would otherwise exceed the WHIR commitment
  surface cap.
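
The duplex challenger shape (WIDTH=8, RATE=4, CAPACITY=4) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Permutation` trait signature, the `observe`/`sample` method names, and above all `ToyPermutation`, which is a non-cryptographic stand-in for Poseidon8 operating on raw `u64` words rather than reduced field elements.

```rust
const WIDTH: usize = 8;
const RATE: usize = 4; // CAPACITY = WIDTH - RATE = 4

/// Hypothetical trait shape; the commit only says a `Permutation` trait was added.
pub trait Permutation {
    fn permute(&self, state: &mut [u64; WIDTH]);
}

/// Toy stand-in, NOT Poseidon8 and not cryptographically secure.
pub struct ToyPermutation;

impl Permutation for ToyPermutation {
    fn permute(&self, state: &mut [u64; WIDTH]) {
        for round in 0..8u64 {
            for i in 0..WIDTH {
                let next = state[(i + 1) % WIDTH];
                state[i] = state[i]
                    .wrapping_mul(0x9E37_79B9_7F4A_7C15)
                    .wrapping_add(next.rotate_left(17))
                    .wrapping_add(round * 8 + i as u64 + 1);
            }
        }
    }
}

pub struct DuplexChallenger<P: Permutation> {
    perm: P,
    state: [u64; WIDTH],
}

impl<P: Permutation> DuplexChallenger<P> {
    pub fn new(perm: P) -> Self {
        Self { perm, state: [0; WIDTH] }
    }

    /// Absorb up to RATE elements into the rate part, then permute.
    pub fn observe(&mut self, block: &[u64]) {
        assert!(block.len() <= RATE);
        for (slot, value) in self.state.iter_mut().zip(block) {
            *slot = slot.wrapping_add(*value);
        }
        self.perm.permute(&mut self.state);
    }

    /// Squeeze RATE elements from the rate part, then permute again.
    pub fn sample(&mut self) -> [u64; RATE] {
        let mut out = [0u64; RATE];
        out.copy_from_slice(&self.state[..RATE]);
        self.perm.permute(&mut self.state);
        out
    }
}
```

Prover and verifier that observe the same transcript in the same order sample the same challenges; the capacity half of the state is never written or read directly.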

cargo fmt + clippy clean; full `cargo test --workspace` passes;
`recursion --n 2` aggregation runs end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TomWambsgans and others added 7 commits May 21, 2026 15:03
Conflicts resolved by adopting main's BusInteraction refactor (PR #228)
while keeping Goldilocks-specific bits:
- Ported poseidon_8/mod.rs to the new bus_interactions() API (renamed
  COL_FLAG→COL_MULTIPLICITY, COL_PRECOMPILE_DATA→COL_DOMAINSEP,
  POSEIDON_PRECOMPILE_DATA(=1)→POSEIDON_DOMAINSEP_BASE(=3); merged
  lookups() + bus() into bus_interactions()).
- table_enum: kept Table::poseidon8(), adopted MAX_BUS_WIDTH/LOG_MAX_BUS_WIDTH.
- extension_op air.rs: virtual cols at indices 21,22 (DIMENSION=3) with
  new names COL_MULTIPLICITY/COL_DOMAINSEP_EXTENSION_OP; switched to
  eval_bus_virtual.
- verify_execution: kept *get_poseidon8(), added MAX_BYTECODE_LOG_SIZE check.
- Renamed poseidon_16 references → poseidon_8 in test (prove_poseidon_8.rs).
- Removed the spurious poseidon_16/ directory left in the tree.
- Adjusted recursion.py to use copy_ef instead of copy_5 (DIM=3).
# Conflicts:
#	crates/lean_vm/src/tables/extension_op/air.rs
#	crates/lean_vm/src/tables/poseidon_16/mod.rs
Adapt the always-IV slice-hashing scheme (length absorbed into the IV
for domain separation) to the Goldilocks Poseidon8 permutation
(width 8, rate 4, digest 4) instead of KoalaBear Poseidon16
(width 16, rate 8, digest 8).
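
A rough sketch of length-in-IV slice hashing at width 8, rate 4, digest 4 follows. It is only one way to read the scheme: placing the length in the first capacity slot, zero-padding the last block, and adding into the rate are all assumptions, and `toy_permute` is a non-cryptographic stand-in for Poseidon8.

```rust
const WIDTH: usize = 8;
const RATE: usize = 4;
const DIGEST: usize = 4;

/// Toy stand-in for the Poseidon8 permutation; NOT secure.
fn toy_permute(state: &mut [u64; WIDTH]) {
    for round in 0..8u64 {
        for i in 0..WIDTH {
            let next = state[(i + 1) % WIDTH];
            state[i] = state[i]
                .wrapping_mul(0x9E37_79B9_7F4A_7C15)
                .wrapping_add(next.rotate_left(17))
                .wrapping_add(round * 8 + i as u64 + 1);
        }
    }
}

/// Hash a slice with its length placed in the initial state (the IV), so that
/// slices which agree after zero padding still start from different states.
pub fn hash_slice(data: &[u64]) -> [u64; DIGEST] {
    let mut state = [0u64; WIDTH];
    // Assumed placement: length goes into the first capacity slot.
    state[RATE] = data.len() as u64;
    for block in data.chunks(RATE) {
        // A short final block is implicitly zero-padded.
        for (slot, value) in state[..RATE].iter_mut().zip(block) {
            *slot = slot.wrapping_add(*value);
        }
        toy_permute(&mut state);
    }
    let mut digest = [0u64; DIGEST];
    digest.copy_from_slice(&state[..DIGEST]);
    digest
}
```

Without the length in the IV, `[1, 2, 3]` and `[1, 2, 3, 0]` would absorb identical blocks; with it, their initial states differ.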

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TomWambsgans and others added 5 commits May 26, 2026 01:46
# Conflicts:
#	crates/lean_compiler/snark_lib.py
#	crates/lean_compiler/tests/test_compiler.rs
#	crates/lean_compiler/tests/test_data/program_166.py
#	crates/lean_compiler/zkDSL.md
#	crates/rec_aggregation/zkdsl_implem/hashing.py
#	crates/rec_aggregation/zkdsl_implem/main.py
# Conflicts:
#	crates/lean_prover/src/lib.rs
#	crates/lean_prover/src/test_zkvm.rs
# Conflicts:
#	crates/backend/fiat-shamir/src/challenger.rs
#	crates/backend/fiat-shamir/tests/grinding.rs
#	crates/backend/sumcheck/src/product_computation.rs
#	crates/lean_prover/src/verify_execution.rs
#	crates/rec_aggregation/src/bytecode_claims.rs
#	crates/rec_aggregation/src/type_2_aggregation.rs
#	crates/rec_aggregation/zkdsl_implem/fiat_shamir.py
#	crates/rec_aggregation/zkdsl_implem/main.py
#	crates/sub_protocols/src/quotient_gkr/mod.rs
#	crates/utils/src/wrappers.rs
#	crates/whir/tests/run_whir.rs
Adapt main's column/flag renames (e.g. POSEIDON_*COL_INDEX_INPUT_LEFT ->
POSEIDON_*COL_NU_A, EXT_OP_FLAG_MUL -> EXT_OP_FLAG_DOT_PRODUCT,
ExtensionOp::PolyEq -> Eq, COL_COMP -> COL_ACC, etc.) to the
goldilocks-specific code that uses Poseidon8 and cubic (DIMENSION=3)
extension. Drop the KoalaBear-targeted python verifier and its
check_whir_configs test, which don't apply to the goldilocks branch
(folding_pow_bits was removed in goldilocks; WHIR_CONFIGS and Fp
primitives are KoalaBear-specific).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TomWambsgans force-pushed the main branch 2 times, most recently from c5a3050 to 9dc5d68, May 28, 2026 12:02
TomWambsgans and others added 8 commits May 29, 2026 01:06
Adopt main's overwrite (permutation-based) sponge hashing on the Goldilocks
branch, keeping Goldilocks field types throughout (WIDTH 8, RATE 4, DIGEST 4,
poseidon8, cubic extension).

Key reconciliations:
- utils/symetric/whir/fiat-shamir: overwrite sponge (hash_slice_rtl,
  precompute_zero_suffix_state), poseidon_hash_slice, two-perm merkle_verify.
- Poseidon table: kept the Goldilocks x^7 permutation AIR (sparse partial
  rounds) but adopted main's I/O interface, halved: 3-way output gating via
  flag_out2/flag_out4, added permute_half, unified Davies-Meyer output gates.
  New precompile set: poseidon8_compress_half/_quarter (+_hardcoded_left),
  poseidon8_permute/_permute_half (+_hardcoded_left).
- Compiler, instruction encoder/display, prover trace post-pass updated to the
  new flags and names.
- zkDSL verifier (hashing.py, main.py, xmss_aggregate.py) and XMSS signer
  (wots.rs) switched to the overwrite sponge; encoding decomposition and
  gl-specific constants (copy_ef/copy_digest, NUM_ENCODING_FE, strides) kept.
- Dropped python-verifier (removed on goldilocks).

Validated: workspace builds, fmt+clippy clean, poseidon AIR proves/verifies,
compiler + lean_prover + xmss + sub_protocols tests pass, aggregation bytecode
compiles, and end-to-end recursive XMSS aggregation proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adopt main's rec_aggregation rename (type 1/2 -> single/multi-message) and the
new bytecode-claim Fiat-Shamir handling, keeping Goldilocks field types/sizes.

Reconciliations:
- bytecode_claims.rs: adopt main's direct claim ingestion into Fiat-Shamir
  (build_bytecode_claims_ingested_by_fiatshamir + observe_scalars, dropping
  hash_bytecode_claims), but keep Goldilocks poseidon8 (get_poseidon8).
- multi_message_aggregation.rs: adopt build_multi_message_input_data name,
  keep DIGEST_LEN-generic layout comment.
- zkdsl main.py: adopt single/multi-message naming and main's direct-ingestion
  reduce_bytecode_claims, but keep Goldilocks DIGEST_LEN-generic copy_digest
  loops (instead of main's hardcoded copy_8/copy_32) and SINGLE_MESSAGE flag
  placeholders provided by compilation.rs.
- zkdsl hashing.py: slice_hash_continue uses poseidon8_permute_half (not the
  KoalaBear poseidon16 variant that auto-merged in).

Validated: workspace builds, fmt+clippy clean, cargo testall passes, and
end-to-end recursion (n=2) proves+verifies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…idon1-8

Implemented and measured the Appendix-B sparse partial-round decomposition
(the same one the AIR/trace-gen and the KoalaBear-16 permutation use) in the
AVX-512 permutation. It is ~13% slower for Goldilocks: this circulant MDS has
tiny entries {1,3,4,7,8,9} that strength-reduce to shifts/adds and batch 8
terms into a single reduce128 per output, while the sparse form needs
arbitrary-constant 64x64 multiplies (one reduce128 each → 15 vs 8 reductions
per partial round). Reverted the implementation, kept a comment so the dead
end isn't re-explored.
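
The batching argument can be illustrated with a scalar sketch: because every entry is at most 9, eight products fit comfortably in a `u128` accumulator and need only one reduction per output. The first-row ordering below is an assumption chosen to use the entries {1,3,4,7,8,9}; the commit does not give it. The sketch also uses plain multiplies instead of the shift/add strength reduction and AVX-512 lanes described above.

```rust
const P: u128 = 0xFFFF_FFFF_0000_0001; // Goldilocks prime 2^64 - 2^32 + 1
const WIDTH: usize = 8;

/// Illustrative first row of the circulant matrix (ordering is an assumption).
const FIRST_ROW: [u64; WIDTH] = [7, 1, 3, 8, 8, 3, 4, 9];

/// Multiply the state by the circulant matrix whose row i is FIRST_ROW
/// rotated right by i. Each output accumulates its 8 terms in a u128
/// (at most 8 * 9 * 2^64 < 2^71) and is reduced exactly once.
pub fn circulant_mds(state: &[u64; WIDTH]) -> [u64; WIDTH] {
    let mut out = [0u64; WIDTH];
    for i in 0..WIDTH {
        let mut acc: u128 = 0;
        for j in 0..WIDTH {
            let coeff = FIRST_ROW[(j + WIDTH - i) % WIDTH] as u128;
            acc += coeff * state[j] as u128;
        }
        out[i] = (acc % P) as u64; // single reduction per output
    }
    out
}
```

A sparse partial-round form with arbitrary 64-bit constants cannot share the accumulator this way, since each 64x64 product already needs its own 128-bit reduction.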

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cubic_mul_generic is the hottest field op in the prover after the poseidon
permutation (~15% of an xmss prove, via sumcheck eq-eval and the poseidon AIR
constraint eval). On Goldilocks each multiply carries a 128->64-bit reduction
(the dominant cost), so 3-term Karatsuba, which trades 3 of the 9 multiplies
for cheap field adds/subs, is a net win across all packed backends.

Measured: xmss --n-signatures 1550 --log-inv-rate 1 goes 392-394 -> 400-402
XMSS/s (~2%). Verified against the schoolbook reference (10k scalar + 2k packed
random inputs) and end-to-end recursion still proves+verifies.
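
The trade can be sketched on scalars as follows. This is not the branch's `cubic_mul_generic`: the function names are hypothetical, and the reduction rule X^3 = W with W = 2 is an assumption for illustration, since the commit does not state the extension polynomial. Both variants share the same final reduction; they differ only in the 9 versus 6 coefficient multiplies.

```rust
const P: u128 = 0xFFFF_FFFF_0000_0001; // Goldilocks prime 2^64 - 2^32 + 1
const W: u64 = 2; // hypothetical: assumes the extension is defined by X^3 = W

fn add(a: u64, b: u64) -> u64 { ((a as u128 + b as u128) % P) as u64 }
fn sub(a: u64, b: u64) -> u64 { ((a as u128 + P - b as u128) % P) as u64 }
fn mul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P) as u64 }

/// Fold the degree-4 product back to degree 2 using X^3 = W.
fn reduce(c: [u64; 5]) -> [u64; 3] {
    [add(c[0], mul(W, c[3])), add(c[1], mul(W, c[4])), c[2]]
}

/// Reference: 9 coefficient multiplies.
pub fn cubic_mul_schoolbook(a: [u64; 3], b: [u64; 3]) -> [u64; 3] {
    let mut c = [0u64; 5];
    for i in 0..3 {
        for j in 0..3 {
            c[i + j] = add(c[i + j], mul(a[i], b[j]));
        }
    }
    reduce(c)
}

/// 3-term Karatsuba: 6 coefficient multiplies plus extra adds/subs.
pub fn cubic_mul_karatsuba(a: [u64; 3], b: [u64; 3]) -> [u64; 3] {
    let d0 = mul(a[0], b[0]);
    let d1 = mul(a[1], b[1]);
    let d2 = mul(a[2], b[2]);
    let d01 = mul(add(a[0], a[1]), add(b[0], b[1]));
    let d02 = mul(add(a[0], a[2]), add(b[0], b[2]));
    let d12 = mul(add(a[1], a[2]), add(b[1], b[2]));
    let c1 = sub(sub(d01, d0), d1); // a0*b1 + a1*b0
    let c2 = add(sub(sub(d02, d0), d2), d1); // a0*b2 + a1*b1 + a2*b0
    let c3 = sub(sub(d12, d1), d2); // a1*b2 + a2*b1
    reduce([d0, c1, c2, c3, d2])
}
```

Inputs are expected to be canonical (below p); comparing the two functions on random inputs mirrors the schoolbook cross-check described above.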

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

1 participant