
Add View::with_typed_arrays API#3165

Merged
texodus merged 2 commits into master from context-zero-perf
Apr 23, 2026


@texodus texodus commented Apr 23, 2026

This PR mainly adds View.with_typed_arrays, a new zero-copy JS API (febaf25f5):

  • New with_typed_arrays(window, callback) on the JS View. Calls callback(names, values, validities, dictionaries) with zero-copy TypedArray views over the Arrow buffers. Numeric columns map 1:1 to Int32Array/Uint32Array/Float32Array/Float64Array; Dictionary<Int32, Utf8> columns surface as (Int32Array keys, string[] values) pairs; validity bitmaps are Uint8Array views. Arrays are only valid inside the callback.
  • float32 option downcasts Float64/Date32/Timestamp/Int64 columns to Float32Array for half-memory GPU uploads.
  • New ViewWindow.emit_legacy_row_path_names option. When false, group-by columns are named __ROW_PATH_N__ to match the SQL backend; with_typed_arrays forces this off. Wired through protobuf (ViewPort.emit_legacy_row_path_names) and View::to_arrow. This makes it easier to tell which columns are which when they don't need to be human-readable.
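
A minimal sketch of how a caller might consume the callback arguments described above. It does not call Perspective itself and runs on stand-in data. The helper names `isValid`, `decodeDictionary` and `toFloat32` are hypothetical. It assumes the validity bitmaps follow the Arrow convention (least-significant bit first, a set bit means non-null, no bit offset) and that a missing bitmap means the column has no nulls.

```typescript
// Hypothetical helpers; none of these names are part of the Perspective API.

/** True when row `i` is non-null in an Arrow-style validity bitmap (LSB first). */
function isValid(validity: Uint8Array | null | undefined, i: number): boolean {
    if (!validity) return true; // assumption: no bitmap means no nulls
    return ((validity[i >> 3] >> (i & 7)) & 1) === 1;
}

/** Resolve Dictionary<Int32, Utf8> keys to strings, yielding null for invalid rows. */
function decodeDictionary(
    keys: Int32Array,
    dictionary: string[],
    validity?: Uint8Array | null,
): (string | null)[] {
    const out: (string | null)[] = new Array(keys.length);
    for (let i = 0; i < keys.length; i++) {
        out[i] = isValid(validity, i) ? dictionary[keys[i]] : null;
    }
    return out;
}

/** Copy a Float64 column into a Float32Array, trading precision for half the bytes. */
function toFloat32(column: Float64Array): Float32Array {
    return Float32Array.from(column);
}

// Stand-in data shaped like one dictionary column and one numeric column.
const keys = new Int32Array([0, 1, 0, 2]);
const dictionary = ["a", "b", "c"];
const validity = new Uint8Array([0b00001011]); // rows 0, 1, 3 valid; row 2 null
const decoded = decodeDictionary(keys, dictionary, validity);

const prices = new Float64Array([1.5, 2.25, 3]);
const downcast = toFloat32(prices);
```

Because the arrays are only valid inside the callback, anything like `decodeDictionary` or `toFloat32` would need to run, or copy its input, before the callback returns.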

As a drive-by, the internals of View registration have been optimized for memory and CPU performance. This began as an experiment to share state between Table and View when their columns would otherwise be identical (e.g. ctx0), then spilled over into all View loading and touched CSV loading as well:

  • t_ctx0/t_ctx1/t_ctx2/t_ctxunit - skip populating m_delta_pkeys (saves a hopscotch_set insert per row).
  • t_ctx0: new bulk-load fast path for unsorted/unfiltered/empty traversal — appends pkeys directly into m_index via new t_ftrav::bulk_load_reserve/append/finalize, skipping the m_new_elems hopscotch_map round-trip. t_ftrav::step_end now short-circuits when no step work happened.
  • New Table::init_bulk API that aliases shared_ptr<t_column>s into the master table instead of deep-cloning. Preconditions: empty gstate, all OP_INSERT, unique psp_pkey.
  • Table::from_csv uses the new bulk path when the index is implicit (no explicit index, no __INDEX__); otherwise falls back to the flatten-and-merge path, since those allow duplicate pkeys.
  • fill_master_table / update_master_table now take shared_ptr<t_data_table> and alias columns (zero-copy) instead of clone()-ing each column.
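
The difference between aliasing and cloning columns, mentioned in the last few items, can be illustrated with a small TypeScript analogy. This is not the C++ implementation: the actual change concerns shared_ptr<t_column>, and the names `DataTable`, `fillByClone` and `fillByAlias` are hypothetical.

```typescript
// Analogy only; these types and functions are hypothetical and not Perspective code.
type Column = Float64Array;
type DataTable = Record<string, Column>;

/** Copy every column's buffer into the master table (the clone()-style approach). */
function fillByClone(source: DataTable): DataTable {
    const master: DataTable = {};
    for (const name of Object.keys(source)) {
        master[name] = source[name].slice(); // allocates and copies the data
    }
    return master;
}

/** Share every column with the master table: no bytes are copied. */
function fillByAlias(source: DataTable): DataTable {
    const master: DataTable = {};
    for (const name of Object.keys(source)) {
        master[name] = source[name]; // same underlying object
    }
    return master;
}

const source: DataTable = { x: new Float64Array([1, 2, 3]) };
const cloned = fillByClone(source);
const aliased = fillByAlias(source);

// A later write through the source is visible via the alias but not the clone.
source["x"][0] = 42;
```

Aliasing avoids the copy, but both owners then observe the same data, so it is only safe when the shared columns are used in a way both sides expect.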

@texodus texodus changed the title Context zero perf Add View::with_typed_arrays API Apr 23, 2026
@texodus texodus added the enhancement Feature requests or improvements label Apr 23, 2026
@texodus texodus force-pushed the context-zero-perf branch from a199e5d to f13cccf Compare April 23, 2026 01:32
texodus added 2 commits April 22, 2026 22:41
Signed-off-by: Andrew Stein <steinlink@gmail.com>
Signed-off-by: Andrew Stein <steinlink@gmail.com>
@texodus texodus force-pushed the context-zero-perf branch from f13cccf to 0b47162 Compare April 23, 2026 02:46
@texodus texodus marked this pull request as ready for review April 23, 2026 03:22
@texodus texodus merged commit 8742c63 into master Apr 23, 2026
14 checks passed
@texodus texodus deleted the context-zero-perf branch April 23, 2026 03:23