Describe the bug
generate_series and range panic with capacity overflow when given an integer range so large that the byte size of the materialized values (element count × 8) exceeds isize::MAX. The panic comes from Vec::reserve inside the integer-range implementation and is hit during planning (constant folding of the table-valued function).
To Reproduce
use datafusion::prelude::SessionContext;

#[tokio::main]
async fn main() {
    let ctx = SessionContext::new();
    // Planning alone is enough to trigger the panic; the result is discarded.
    let _ = ctx
        .sql("SELECT generate_series(0, 9223372036854775807)")
        .await
        .unwrap()
        .create_physical_plan()
        .await;
}
Panic:
thread 'main' panicked at .../alloc/src/raw_vec/mod.rs:28:5:
capacity overflow
Also reproduces with:
SELECT range(0, 9223372036854775807)
SELECT range(9223372036854775807)
SELECT generate_series(-9223372036854775808, 9223372036854775807)
Bounded ranges like SELECT generate_series(1, 100) are fine.
Expected behavior
Return a planning/execution error along the lines of "range too large to materialize" (or, ideally, a streaming implementation that does not need to materialize the full sequence eagerly). The public SQL API should never panic on user-supplied SQL.
Root cause
datafusion/functions-nested/src/range.rs, in generate_range_values:
// line 563-565 (step > 0 branch)
let count =
    (start.abs_diff(limit) / step.unsigned_abs()).saturating_add(1) as usize;
values.reserve(count); // ← panics here

// line 583-585 (step < 0 branch, identical pattern)
let count =
    (start.abs_diff(limit) / step.unsigned_abs()).saturating_add(1) as usize;
values.reserve(count);
For generate_series(0, i64::MAX, 1), start.abs_diff(limit) / step.unsigned_abs() is i64::MAX, so after saturating_add(1) the count is 2^63 (about u64::MAX / 2, or ~9.2 × 10^18), which still fits in a usize on a 64-bit target. Vec::<i64>::reserve then needs count × size_of::<i64>() = 2^63 × 8 bytes, which exceeds isize::MAX, so it panics with capacity overflow.
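The arithmetic can be checked without DataFusion. The following is a minimal standalone sketch: it copies only the count expression quoted above, and the helper names (element_count, bytes_needed, can_reserve) are hypothetical and do not exist in range.rs. It uses Vec::try_reserve so the failure surfaces as an Err instead of a panic.

```rust
/// Mirrors the element-count expression quoted from `generate_range_values`.
/// `element_count` is a hypothetical name used only for this sketch.
/// `step` must be non-zero, otherwise the division panics.
fn element_count(start: i64, limit: i64, step: i64) -> u64 {
    (start.abs_diff(limit) / step.unsigned_abs()).saturating_add(1)
}

/// Bytes a `Vec<i64>` would need for `count` elements, or `None` if the
/// multiplication does not even fit in a `u64`.
fn bytes_needed(count: u64) -> Option<u64> {
    count.checked_mul(std::mem::size_of::<i64>() as u64)
}

/// Non-panicking probe: `Vec::try_reserve` returns `Err` in the case where
/// `Vec::reserve` would panic with "capacity overflow".
fn can_reserve(count: u64) -> bool {
    // On 32-bit targets the count may not fit in `usize` at all.
    let count = match usize::try_from(count) {
        Ok(c) => c,
        Err(_) => return false,
    };
    let mut values: Vec<i64> = Vec::new();
    values.try_reserve(count).is_ok()
}
```

For the reproducer's arguments, element_count(0, i64::MAX, 1) is 2^63 and bytes_needed returns None, so the requested allocation cannot be represented at all, let alone satisfied.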
Suggested fix
Bound count at allocation time:
const MAX_RANGE_ELEMENTS: usize = isize::MAX as usize / std::mem::size_of::<i64>();

if count > MAX_RANGE_ELEMENTS {
    return exec_err!(
        "range too large: would produce {count} elements (max {MAX_RANGE_ELEMENTS})"
    );
}
values.reserve(count);
A friendlier limit (say, 1 GiB / 8 B = 2^27 ≈ 134 million elements, configurable) would also stop this from being a memory-exhaustion DoS.
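A standalone sketch of the bounded variant, assuming the 1 GiB cap floated above. It is not the DataFusion patch: reserve_for_range is a hypothetical helper, a plain String error stands in for exec_err!, and the count is compared as u64 before any cast so the check also holds on 32-bit targets.

```rust
/// Hypothetical cap for this sketch: 1 GiB worth of `i64` values (2^27 elements).
const MAX_RANGE_ELEMENTS: u64 = (1 << 30) / std::mem::size_of::<i64>() as u64;

/// Reserves space for an integer range, returning an error instead of
/// panicking when the range is too large. `String` stands in for the
/// DataFusion error type that `exec_err!` would produce.
fn reserve_for_range(
    values: &mut Vec<i64>,
    start: i64,
    limit: i64,
    step: i64,
) -> Result<(), String> {
    if step == 0 {
        return Err("step must be non-zero".to_string());
    }
    // Same count expression as the issue, kept in u64 so nothing truncates.
    let count = (start.abs_diff(limit) / step.unsigned_abs()).saturating_add(1);
    if count > MAX_RANGE_ELEMENTS {
        return Err(format!(
            "range too large: would produce {count} elements (max {MAX_RANGE_ELEMENTS})"
        ));
    }
    // count <= 2^27 here, so the cast to usize is lossless on 32- and 64-bit
    // targets. `try_reserve` turns an allocation failure into an error as well.
    values
        .try_reserve(count as usize)
        .map_err(|e| format!("failed to reserve {count} elements: {e}"))
}
```

With this shape, reserve_for_range(&mut v, 1, 100, 1) succeeds and leaves v with capacity for at least 100 values, while the reproducer's (0, i64::MAX, 1) returns the "range too large" error.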
Additional context
Found by a cargo fuzz target (fuzz/fuzz_targets/sql_physical_plan.rs) seeded with SQL extracted from datafusion/sqllogictest/test_files/. The fuzzer triggered it from a mutated generate_series example by replacing a small numeric literal with 9223372036854775807 (i64::MAX).