BEAM-native JS engine and compiler #5

Open
dannote wants to merge 628 commits into master from beam-vm-interpreter

Conversation


@dannote dannote commented Apr 15, 2026

Adds a second QuickJS execution backend on the BEAM.

What’s in here

  • QuickJS bytecode decoder in Elixir
  • interpreter for QuickJS bytecode on the BEAM
  • hybrid compiler from QuickJS bytecode to BEAM modules
  • raw BEAM disassembly for the :beam backend via QuickBEAM.disasm/2
  • mode: :beam support in the public API
  • require(), module loading, dynamic import, globals, handlers, and interop for the VM path
  • stack traces, source positions, and Error.captureStackTrace

Runtime coverage

  • Object, Array, Function, String, Number, Boolean
  • Math, JSON, Date, RegExp
  • Map, Set, WeakMap, WeakSet, Symbol
  • Promise, async/await, generators, async generators
  • Proxy, Reflect
  • TypedArray, ArrayBuffer, BigInt
  • classes, inheritance, super, private fields, private methods, private accessors, static private members, brand checks

Validation

  • QUICKBEAM_BUILD=1 MIX_ENV=test mix test
  • MIX_ENV=test QUICKBEAM_BUILD=1 mix test test/vm/js_engine_test.exs --include js_engine --seed 0
  • mix compile --warnings-as-errors
  • mix format --check-formatted
  • mix credo --strict
  • mix dialyzer
  • mix ex_dna
  • zlint lib/quickbeam/*.zig lib/quickbeam/napi/*.zig
  • bunx oxlint -c oxlint.json --type-aware --type-check priv/ts/
  • bunx jscpd lib/quickbeam/*.zig priv/ts/*.ts --min-tokens 50 --threshold 0

Current local result:

  • 2363 tests, 0 failures, 1 skipped, 54 excluded

@dannote dannote force-pushed the beam-vm-interpreter branch from 0eb3475 to 7c1c574 on April 15, 2026 14:06
@dannote dannote changed the title from "BEAM-native JS interpreter (Phase 0-1)" to "BEAM-native JS interpreter" on Apr 16, 2026
@dannote dannote marked this pull request as ready for review April 16, 2026 08:41
@dannote dannote force-pushed the beam-vm-interpreter branch 2 times, most recently from 75fdba5 to 527d5b9 on April 20, 2026 08:45
@dannote dannote changed the title from "BEAM-native JS interpreter" to "BEAM-native JS engine and compiler" on Apr 21, 2026
dannote added 24 commits April 22, 2026 22:57
Change shape representation from {:shape, shape_id, vals, proto} to
{:shape, shape_id, offsets, vals, proto}. The offsets map is inlined
from the shape table, eliminating the Process.get + Map.fetch! chain
on every property read.

Get.get shape hit: 775ns → 648ns (16% faster).
~380us saved per render at 3000 property reads.
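The layout change above can be modeled in a few lines. This is a minimal Python sketch, not the project's Elixir code: the old layout keeps only a shape id on the object, forcing a shape-table fetch on every read, while the new layout carries the offsets map inside the object record so a property read is a single map probe plus a tuple index.

```python
# Hypothetical model of the two shape layouts (names are illustrative).
SHAPE_TABLE = {0: {"x": 0, "y": 1}}  # shape_id -> {key: offset}

def make_obj_old(shape_id, vals, proto=None):
    # Old: {:shape, shape_id, vals, proto} — offsets live only in the table.
    return ("shape", shape_id, vals, proto)

def get_old(obj, key):
    _, shape_id, vals, _ = obj
    offsets = SHAPE_TABLE[shape_id]   # extra table lookup on EVERY read
    return vals[offsets[key]]

def make_obj_new(shape_id, vals, proto=None):
    # New: {:shape, shape_id, offsets, vals, proto} — offsets inlined once.
    return ("shape", shape_id, SHAPE_TABLE[shape_id], vals, proto)

def get_new(obj, key):
    _, _, offsets, vals, _ = obj      # offsets travel with the object
    return vals[offsets[key]]

o = make_obj_new(0, (10, 20))
assert get_new(o, "y") == 20
assert get_old(make_obj_old(0, (10, 20)), "x") == 10
```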
…s.lookup

fprof showed Shapes.get_shape (54K calls, 108ms) as the #1 bottleneck.
It was still called from Put.put via Shapes.lookup on the hot path.

Replace Shapes.lookup(shape_id, key) with Map.fetch(offsets, key)
in shape_put, Store.put_obj_key, and put/3 for length.

Get.get: 648ns → 406ns (37% faster)
Preact render: 6.95ms → 6.55ms (5.8% faster)
…calls

transition/2 now returns {shape_id, offsets, offset} instead of
{shape_id, offset}. Callers no longer need a separate get_shape
call to fetch the new shape's offsets after transition.

Eliminates ~13K redundant shape table lookups per render.
Transition cache now stores {child_id, child_offsets} instead of
just child_id. Eliminates get_shape(child_id) on every cache hit.

get_shape calls per render: 27,930 → 14,819 (verified via fprof).
Preact render: 6.55ms → 6.2ms.
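The enriched transition cache can be sketched as follows — an illustrative Python model under the assumption that shape ids are integers and offsets are plain maps; it shows why caching the child's offsets alongside its id makes a cache hit free of any shape-table fetch.

```python
# Illustrative transition cache: (parent_id, key) -> (child_id, child_offsets).
shapes = {0: {}}        # shape_id -> offsets map (root shape is empty)
transitions = {}
next_id = 1

def transition(parent_id, key):
    """Return (child_id, child_offsets, offset) for adding `key`."""
    global next_id
    hit = transitions.get((parent_id, key))
    if hit:                                   # cache hit: no get_shape needed
        child_id, child_offsets = hit
        return child_id, child_offsets, child_offsets[key]
    child_offsets = dict(shapes[parent_id])   # derive the child shape once
    child_offsets[key] = len(child_offsets)
    child_id, next_id = next_id, next_id + 1
    shapes[child_id] = child_offsets
    transitions[(parent_id, key)] = (child_id, child_offsets)
    return child_id, child_offsets, child_offsets[key]

sid, offs, off = transition(0, "x")
sid2, offs2, off2 = transition(sid, "y")
assert (off, off2) == (0, 1)
assert transition(0, "x") == (sid, offs, 0)   # hit returns cached offsets
```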
The grow path (adding a property to a shape-backed object) was doing:
Tuple.to_list → ++ List.duplicate → ++ [val] → List.to_tuple (1693ns)

Now uses :erlang.append_element for the common case where offset ==
tuple_size (sequential property addition): 217ns — 8× faster.

Preact render: 6.1ms → 5.4ms (12% faster).
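A hedged Python sketch of the grow path, mirroring the :erlang.append_element fast case in Python terms: when the target offset equals the current tuple size (sequential property addition), a single append suffices; otherwise fall back to an in-place rebuild or padded growth.

```python
# Illustrative model of the tuple grow path (not the project's code).
def put_val(vals, offset, val):
    n = len(vals)
    if offset == n:                    # common case: sequential append
        return vals + (val,)           # stands in for :erlang.append_element
    if offset < n:                     # in-place update of an existing slot
        return vals[:offset] + (val,) + vals[offset + 1:]
    # sparse growth: pad with placeholders up to offset, then place val
    return vals + (None,) * (offset - n) + (val,)

assert put_val((1, 2), 2, 3) == (1, 2, 3)
assert put_val((1, 2, 3), 1, 9) == (1, 9, 3)
assert put_val((), 2, 7) == (None, None, 7)
```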
Heap.frozen? now checks a global :qb_has_frozen flag before doing
per-object Process.get. In Preact SSR (where nothing is frozen),
this eliminates 13,552 process dictionary lookups per render.

Preact render: 5.37ms → 5.25ms.
Detect 'object; (push_val; define_field)* ...' patterns during lowering
and batch them into a single Heap.wrap(%{k1 => v1, k2 => v2, ...}).

Eliminates ~10K individual Put.put calls per Preact render (881 VNodes
× ~12 batched fields each). Each Put.put was doing: Process.get + shape
transition + put_val + Process.put.

Supported value opcodes: integer literals, null, undefined, booleans,
get_arg, get_loc, push_atom_value, empty string.
Falls back to individual define_field for values that can't be lowered
at compile time (function calls, computed values).

Preact render: 5.25ms → 4.4ms (16% faster).
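The batching pass above can be sketched as a small peephole over an opcode list. This is a conceptual Python model — the opcode tuples are illustrative, not QuickJS's real encoding: a run of (push_val, define_field) pairs after an object opcode collapses into one batched wrap of a ready-made map, with any remaining opcodes left for the normal path.

```python
# Hypothetical lowering: batch 'object; (push_val; define_field)*' runs.
def lower_object_literal(ops):
    """ops: [('object',), ('push_val', v), ('define_field', k), ...]"""
    assert ops[0] == ("object",)
    fields, rest = {}, ops[1:]
    while (len(rest) >= 2
           and rest[0][0] == "push_val"
           and rest[1][0] == "define_field"):
        fields[rest[1][1]] = rest[0][1]   # fold the pair into the literal
        rest = rest[2:]
    # One batched allocation replaces a per-field put; leftovers fall back.
    return ("wrap", fields), rest

batched, rest = lower_object_literal(
    [("object",), ("push_val", 1), ("define_field", "a"),
     ("push_val", 2), ("define_field", "b"), ("ret",)]
)
assert batched == ("wrap", {"a": 1, "b": 2})
assert rest == [("ret",)]
```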
Preact's h() function has 5 args, called 881 times per render.
The generic Enum.take path was 14.5ms (8,504 calls). Direct
pattern matching eliminates the list traversal overhead.
Replaces Enum.with_index + Enum.reduce with direct :maps.from_list
for shape-to-map reconstruction.
Replace make_ref() + {:qb_obj, ref} tuple keys with monotonic integer
counter + raw integer keys in the process dictionary.

From the OTP JIT source (erl_process_dict.c), the hash function for
small integers is just 'unsigned_val(Term)' — essentially free. Tuple
keys go through the full erts_internal_hash which is much more
expensive. EQ comparison is also a single pointer compare for integers
vs deep tuple comparison for {:qb_obj, ref}.

Measured: Process.put with raw integer keys is 2× faster than tuple
keys. Process.get is 1.35× faster.

Preact render: 4.3ms → 3.55ms (17% faster).
Total session: 6.95ms → 3.55ms (49% faster).
- Replace List.zip with manual keys_vals_to_map recursion (1.6× faster)
- Replace PD counter with :erlang.unique_integer (2.6× faster)

Preact render: 3.55ms → 3.4ms.
Two changes:
1. inline_get_var_ref emits get_capture(ctx, {type, var_idx}) instead
   of get_var_ref(ctx, integer_idx). The capture key is resolved at
   compile time from closure_vars, eliminating Enum.at list traversal.

2. current_var_ref caches closure_vars as a tuple of capture keys per
   function (keyed on byte_code binary). elem(tuple, idx) is O(1) vs
   Enum.at(list, idx) which is O(n).

Before: 1,609 Enum.at calls at 20.6ms total (12.8us each).
After: 0 Enum.at calls. get_capture is 3.0us per call.
Shape IDs are contiguous integers 0..N. Storing shapes in a tuple
and using elem(table, id) eliminates Map.fetch! overhead entirely.
put_shape appends via :erlang.append_element for new shapes and
uses put_elem for updates.

Also eliminates the separate :qb_shape_next_id counter — the next
ID is simply tuple_size(table).

5,565 get_shape calls/render × ~80ns savings = ~157us per render.
Preact render: 3.50ms → 3.34ms.
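A minimal Python sketch of the contiguous-id shape table described above (illustrative only): because shape ids are exactly 0..N, the table can be a tuple indexed positionally, and the next id is simply the current length — no separate counter needed.

```python
# Hypothetical tuple-backed shape table with contiguous integer ids.
def put_shape(table, shape_id, shape):
    if shape_id == len(table):         # new shape: append at the end
        return table + (shape,)
    return table[:shape_id] + (shape,) + table[shape_id + 1:]  # update

def get_shape(table, shape_id):
    return table[shape_id]             # O(1) positional access, no Map.fetch!

table = ()
table = put_shape(table, len(table), {"x": 0})          # next id == size
table = put_shape(table, len(table), {"x": 0, "y": 1})
assert get_shape(table, 1) == {"x": 0, "y": 1}
assert len(table) == 2                 # the counter is the tuple size itself
```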
The compiler now emits Heap.wrap_keyed(keys_tuple, vals_tuple) instead
of Heap.wrap(%{k1 => v1, ...}) for batched object literals.

wrap_keyed uses the keys tuple (a compile-time constant) as a cache key
to look up the pre-resolved shape. On cache hit, it skips:
- :erlang.phash2 of the key set
- Shapes.from_map shape resolution
- Map construction from keys/values

This eliminates ~103ns per object creation at 2,399 objects/render.

Preact render: 3.43ms → 3.25ms (5.2% faster).
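The wrap_keyed cache can be modeled briefly — a hedged Python sketch, with names invented for illustration: since the keys tuple of a batched object literal is a compile-time constant, it can directly key a cache of pre-resolved shapes, so a cache hit skips hashing the key set and re-deriving the shape.

```python
# Illustrative shape cache keyed by the compile-time-constant keys tuple.
shape_cache = {}      # keys_tuple -> (shape_id, offsets)
next_shape_id = 0

def wrap_keyed(keys, vals):
    global next_shape_id
    hit = shape_cache.get(keys)
    if hit is None:                            # first sight of this literal
        offsets = {k: i for i, k in enumerate(keys)}
        hit = (next_shape_id, offsets)
        next_shape_id += 1
        shape_cache[keys] = hit
    shape_id, offsets = hit                    # hit: no hashing, no rebuild
    return ("shape", shape_id, offsets, vals, None)

a = wrap_keyed(("x", "y"), (1, 2))
b = wrap_keyed(("x", "y"), (3, 4))
assert a[1] == b[1]       # same literal shape resolves to the same shape id
```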
Instead of emitting RuntimeHelpers.get_var(ctx, name) which traverses:
  get_var → fetch_ctx_var → context_globals → GlobalEnv.fetch → Map.fetch

Emit :erlang.map_get(:globals, ctx) at compile time to extract the
globals map, then call get_global(globals, name) which does a single
Map.fetch.

Eliminates 3 function calls per global variable access (2,644 calls
per Preact render).

Preact render: 3.25ms → 3.18ms.
The op_eq helper now has:
1. {Same, Same} → true (identity check, covers 79% of comparisons)
2. Number guards → == (existing)
3. Binary guards → == (string comparison without Values.eq dispatch)
4. Fallback → Values.eq (handles null/undefined cross-equality)

Values.eq calls: 5,175 → 1,075 per render.
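The tiered dispatch can be sketched like this — a Python model of JS loose equality under the stated ordering, with UNDEFINED/NULL sentinels standing in for the VM's value representation: identity first, then same-type numeric and string comparison, then the slow path that handles the null/undefined cross-case.

```python
# Illustrative tiered loose-equality helper (JS `==` semantics, simplified).
UNDEFINED, NULL = object(), object()

def op_eq(a, b):
    if a is b:                                     # 1. identity fast path
        return True
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b                              # 2. numeric comparison
    if isinstance(a, str) and isinstance(b, str):
        return a == b                              # 3. string comparison
    # 4. slow path: in JS, null == undefined (and nothing else crosses)
    return {a, b} == {UNDEFINED, NULL}

assert op_eq(1, 1.0)
assert op_eq("ab", "ab")
assert op_eq(NULL, UNDEFINED)
assert not op_eq(0, NULL)
```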
Skip resetting home_object and super fields in the fast path for
closures with need_home_object: false. These closures by definition
don't use home objects, so the fields can inherit from the parent
context safely.

Saves 2 map update operations per function call (2,010 calls/render).
put_field skips normalize_key (key is known binary at compile time),
frozen? check (not needed for just-created objects), __proto__ check
(key is known not to be __proto__), and Heap.get_obj_raw delegation
(calls Process.get directly).

The compiler emits put_field for define_field opcodes where the key
is a resolved string literal.

Eliminates ~3 function calls and 2 guards per property write for
4,298 define_field ops per Preact render.
Generate op_get_field/2 as a local function in each compiled BEAM
module. The fast path does:
  1. Pattern match {:obj, Id}
  2. erlang:get(Id) — direct BIF call, no delegation chain
  3. Pattern match {:shape, _, Offsets, Vals, _}
  4. maps:find(Key, Offsets) — direct BIF
  5. element(Off+1, Vals) — direct tuple access
Fallback to Get.get for non-shape objects, prototype chain, etc.

This eliminates 4 cross-module function calls on the hot path:
  Get.get → get_own → Heap.get_obj_raw → Store.get_obj_raw → Process.get
The JIT can optimize local calls much better than cross-module calls
(no export table indirection, better branch prediction).

Preact render: 3.23ms → 3.09ms (143us, 4.4% faster).
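A minimal Python sketch of that generated fast path (names are illustrative; the real code is generated BEAM, and the heap dict here stands in for the process dictionary): match the shaped record, probe the offsets map, index the values tuple, and only fall back to the generic path on a miss.

```python
# Hypothetical model of the per-module op_get_field fast path.
HEAP = {}               # object id -> record (stand-in for the process dict)

def generic_get(obj_id, key):
    # Placeholder for the slow path: prototype chain, non-shape objects, etc.
    return "undefined-or-proto-chain"

def op_get_field(obj_id, key):
    rec = HEAP.get(obj_id)                       # direct heap fetch, no delegation
    if rec is not None and rec[0] == "shape":    # pattern match shaped record
        _, _shape_id, offsets, vals, _proto = rec
        off = offsets.get(key)                   # one map probe
        if off is not None:
            return vals[off]                     # one tuple index — done
    return generic_get(obj_id, key)              # fallback: Get.get equivalent

HEAP[7] = ("shape", 0, {"type": 0, "props": 1}, ("div", {}), None)
assert op_get_field(7, "type") == "div"
assert op_get_field(7, "missing") == "undefined-or-proto-chain"
```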
When the compiler creates an object via wrap_keyed (batched object
literal), it records the offsets map as part of the stack type info
({:shaped_object, offsets}). For subsequent get_field on the same
variable, if the offset is known at compile time, emit direct
element(Off+1, Vals) bypassing maps:find entirely.

Also propagates shape info through define_field ops — each property
addition extends the known offsets map.

This enables V8-style monomorphic inline caching for same-block
property access patterns.
These functions previously called Heap.get_obj which triggers to_map
reconstruction (149ns per call for 5-key shapes). Now they check for
shape-backed objects first and use the shape's keys list or offsets
map directly, avoiding the full map reconstruction.

- enumerable_keys: uses Shapes.keys(shape_id) instead of to_map
- enumerable_string_props: uses Shapes.to_map only for the shape
  case (without proto overhead), bypasses get_obj delegation
- length_of: reads 'length' directly from shape offsets

Eliminates ~1000+ to_map reconstructions per render.
Preact render: 3.17ms → 3.03ms (137us, 4.3% faster).
Eliminate 2 delegation calls (Heap.put_obj_raw → Store.put_obj_raw)
and 1 function call (Shapes.put_val) for the common case where offset
is within the existing tuple size. For the transition path, inline
:erlang.append_element for the sequential-append case.

3,857 shape_put calls per render.
Generate op_truthy/1 and op_typeof/1 as local functions in each
compiled BEAM module. These are pure pattern-matching functions that
benefit from local call dispatch:

- op_truthy: handles nil/undefined/false/0/0.0/empty-string fast paths
  inline, eliminating 1,867 cross-module calls per render
- op_typeof: handles undefined/null/boolean/number/string inline,
  falls back to Values.typeof for complex types (883 calls)

Also wire branch_condition to use op_truthy instead of Values.truthy?.
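The op_truthy fast paths follow JS ToBoolean, which a short Python sketch can pin down (UNDEFINED/NULL sentinels are illustrative stand-ins for the VM's values): undefined, null, false, 0, 0.0, NaN, and the empty string are falsy; every other value, including all objects, is truthy.

```python
import math

# Illustrative model of the op_truthy fast paths (JS ToBoolean semantics).
UNDEFINED, NULL = object(), object()

def op_truthy(v):
    if v is UNDEFINED or v is NULL or v is False:
        return False
    if isinstance(v, (int, float)) and not isinstance(v, bool):
        # 0, 0.0, and NaN are falsy; every other number is truthy
        return v != 0 and not (isinstance(v, float) and math.isnan(v))
    if isinstance(v, str):
        return v != ""                 # only the empty string is falsy
    return True                        # objects, arrays, functions: truthy

assert not op_truthy(UNDEFINED)
assert not op_truthy(0.0)
assert not op_truthy(float("nan"))
assert not op_truthy("")
assert op_truthy("0") and op_truthy([]) and op_truthy(True)
```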
Runs test262 test suites through NIF, compiler, and interpreter modes
and compares pass rates. Requires test262 to be checked out at
../quickjs/test262/.

Current results (subset):
  QuickJS NIF: 99.88% (79,827 tests, 98 errors)
  BEAM compiler: 80-100% depending on category
  BEAM interpreter: 80-100% (identical to compiler)

Compiler-specific failures: BigInt literal handling (pre-existing)
dannote added 30 commits April 25, 2026 12:02
- lib/quickbeam/vm/compiler/diagnostics.ex: check/1, explain/1, helper_call_counts/1
- bench/compiler_vs_interpreter.exs: compiler vs interpreter on six JS patterns
- Complete lowering tuple refactor (list→tuple for O(1) instruction access)
- Apply analysis tuple refactor to cfg.ex, stack.ex, types.ex
- Fix bench/compiler_vs_interpreter.exs Heap.reset issue
- Targeted benchmarks show compiler ~= interpreter on all patterns
- String concat: 8.42µs (compiler), 9.17µs (interpreter)
- Numeric loop: 61.33µs, 59.17µs
- Array loop: 201.83µs, 207.50µs
- Object field: 615.29µs, 614.04µs
- Function call: 1668.50µs, 1687.06µs
- Closure: 2385.31µs, 2408.00µs
…teger types

Compiler now emits direct BEAM arithmetic/bitwise ops when type analysis
proves operands are integers, bypassing Values.* runtime dispatch.

Specialized: mod→rem, band→band, bor→bor, bxor→bxor, shl→bsl, sar→bsr.
Variables, Objects, Functions, Iterators, Coercion — RuntimeHelpers
is now a thin defdelegate facade. Public API unchanged.
Generated BEAM code now has guard-based integer AND float fast paths
for add, mul, neg, lt/lte/gt/gte, div (with b!=0 guard), mod (with b!=0 guard).
BEAM JIT can optimize these to native arithmetic without bouncing through
Values.* module calls.

Fixed latent crash: op_div and op_mod guards now prevent Erlang crash
on division/mod by zero (JS returns Infinity/NaN, not crash).
Type analysis now tracks {:shaped_object, offsets, value_map} for object
literals with constant values. get_field_call inlines the constant value
directly when the key is known and the value is pure, bypassing heap access.

object_field benchmark: 615µs → 5.58µs (110× faster)
numeric_loop benchmark: 60µs → 5.92µs (10× faster, from post_inc inline)
Result: {"status":"keep","failing_tests":32,"passing_tests":0}
…RL, URLSearchParams, atob/btoa, setTimeout, Headers, AbortController, performance, Blob, crypto, fetch/Request/Response)
…coder, URL, atob/btoa, setTimeout, Headers, AbortController, performance, Blob, crypto, fetch/Request/Response as native builtins.

Result: {"status":"keep","failing_tests":0,"passing_tests":32}
- TextEncoding: TextEncoder/TextDecoder
- URL: URL/URLSearchParams
- Encoding: atob/btoa
- Timers: setTimeout/clearTimeout/setInterval/clearInterval
- Headers: Headers (shared build_from_map/1 used by Fetch)
- Abort: AbortController
- Performance: performance object using build_object macro
- Blob: Blob
- Crypto: crypto object using build_object macro
- Fetch: fetch/Request/Response

web_apis.ex is now a thin aggregator that merges all bindings/0.
32/32 beam_web_apis tests passing.
Result: {"status":"keep","failing_tests":54,"passing_tests":38}
- TextEncoder: add encodeInto, fix WTF-8 lone surrogate handling
- TextDecoder: fix UTF-8 decoding, fatal mode, BOM stripping, ArrayBuffer support
- atob/btoa: type coercion, whitespace stripping, proper base64 handling
- crypto.getRandomValues: fix zero-length bug, add >65536 TypeError
- performance.now: return positive milliseconds relative to origin
- queueMicrotask: TypeError for non-function, silently discard errors
- structuredClone: deep clone objects, arrays, Date, RegExp, Map, Set, ArrayBuffer
- timers: implement timer macro queue with actual callback execution
- Promise constructor: properly call executor with resolve/reject
- String spread: fix [..str] operator to iterate codepoints
- top-level await: wrap code in async IIFE for eval_beam
- Map constructor: fix initialization from array-of-arrays (qb_arr elements)
- instanceof: add auto_proto for Date, RegExp, Map, Set, ArrayBuffer
- get_prototype_raw: check type-specialized methods before following proto chain
Result: {"status":"keep","failing_tests":0,"passing_tests":92}
- Timers: raw Process.get/put → Heap.Caches wrappers
- encoding.ex: delete duplicate coerce_to_string, use Values.stringify
- url.ex: raw throw → JSThrow.type_error!
- structured_clone.ex: deep_clone(:undefined) returns :undefined not nil
- Fix credo issues (number underscores, alias ordering, Map.new)
Result: {"status":"keep","failing_tests":171,"passing_tests":140}