Skip to content

feat(argoverse2): add Argoverse 2 Sensor Dataset to NCore V4 converter#145

Open
janickm wants to merge 1 commit into
NVIDIA:mainfrom
janickm:fresh-salamander
Open

feat(argoverse2): add Argoverse 2 Sensor Dataset to NCore V4 converter#145
janickm wants to merge 1 commit into
NVIDIA:mainfrom
janickm:fresh-salamander

Conversation

@janickm

@janickm janickm commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Summary

Add a converter for the Argoverse 2 (AV2) Sensor Dataset to NCore V4, closing the Argoverse 2 half of #123 (the nuScenes half landed in #128).

The converter reads the AV2 on-disk Apache Feather files directly with pyarrow (Arrow tables, no pandas) and deliberately avoids the heavy av2 devkit (which pulls in torch, kornia, numba, polars, PyAV). The only added dependency is pyquaternion, already used in this package.

Closes #123

Sensors

  • Cameras: all 9 global-shutter cameras (7 ring + 2 stereo). AV2 imagery is shipped already undistorted -- the official av2 devkit projects with the intrinsic matrix K only and does not load the k1, k2, k3 columns -- so the stored model is an ideal (distortion-free) pinhole (IdealPinholeCameraModelParameters, ShutterType.GLOBAL). The k1, k2, k3 in intrinsics.feather describe the original lens (for re-distorting into the raw frame) and are intentionally not applied. Note: the stereo pair is genuinely monochrome (R=G=B in the source JPEGs).
  • Lidar: the two stacked Velodyne VLP-32C units (up_lidar / down_lidar) are stored separately, each with its own extrinsic. AV2 sweeps are egomotion-compensated to the sweep reference timestamp (the sweep start) and expressed in the egovehicle frame, with real per-point timestamps (offset_ns). Points are mapped into each unit's sensor frame and decompensated -- referenced to the sweep start -- using the shared MotionCompensator reference_timestamp_us, so NCore stores raw per-point-time directions.
  • Radar: AV2 has none.
  • Cuboids: native to the egovehicle frame at the sweep time; stored in the rig frame at that timestamp with no ego pose baked in, so egomotion stays swappable downstream (a V4 feature).

Lidar unit split

AV2 shares one laser_number range [0, 63] across both units with no documented up/down mapping. The two halves (< 32 / >= 32) are the two physical units; the up/down label is recovered from extrinsic geometry by per-beam elevation flatness (a laser ring traces a constant-elevation cone only in its own sensor frame), decided once per log with a wide, stable margin.

Structured VLP-32C lidar model

AV2 provides no native firing-column index, but offset_ns + laser_number reconstruct the firing pattern (offset_ns -> firing column at 10 Hz, laser_number -> beam/row). A structured model is derived per unit and stored as intrinsics with per-point model_element. Reaching native-sensor accuracy required several steps, each found by evaluating with ncore_evaluate_lidar_model across many logs:

  1. Decompensate first. The geometry is derived from the decompensated reference sweep; the raw motion-compensated cloud is azimuth-smeared by ego motion (~0.5 deg).
  2. Empirical per-row azimuth offsets. The 32 beams of a firing column span ~8.5 deg of azimuth (not co-azimuthal), so per-row offsets are fit empirically (jointly with per-column azimuths), not assumed from an HDL-32E pattern.
  3. Per-unit spin direction. The two stacked units fire in opposite phase and so spin oppositely in their own frames (one cw, one ccw); detected from the data.
  4. One-revolution column wrap. An AV2 sweep covers ~1.02 revolutions, so columns are sized to one revolution and wrapped modulo, keeping the azimuth ramp below 2*pi.
  5. Per-frame affine alignment. The model is static, but the spin phase at a given offset_ns drifts ~1 deg between sweeps and the spin rate drifts slightly within a sweep on some scenes. Each frame is re-aligned by an affine column remap (constant phase + linear term).
  6. 4x column upsampling so the per-frame alignment is not column-quantized.
  7. Near-range offset fallback for steep downward beams that only return at near range (e.g. the lowest laser at ~-25 deg, with no far data).

A --lidar-model-source {empirical,none} flag (default empirical) gates emission.

Validation

Converted 38 val logs (76 lidar units) and evaluated each with ncore_evaluate_lidar_model:

Metric mean p90 max
median far-range error 0.033 deg 0.055 deg 0.077 deg
azimuth bias (signed) 0.005 deg -- 0.022 deg
elevation bias (signed) 0.005 deg -- 0.026 deg

All 76 units are sub-0.08 deg median with no systematic azimuth or elevation bias -- on par with native-column sensors. (Isolated per-point maxima of 1-16 deg are dynamic objects / steepest grazing-beam returns, which no static spin model can represent; per-frame p95 stays ~0.08 deg.)

Coordinate frames

The first ego pose's city_SE3_egovehicle is stored as the static world -> world_global anchor, so world_global is the AV2 city frame and absolute coordinates stay recoverable for later HD-map alignment.

Testing

# Data-free unit tests (run in CI)
bazel test //tools/data_converter/argoverse2:pytest_utils

# Integration test (requires a downloaded AV2 log)
AV2_DIR=/path/to/argoverse2/sensor AV2_SPLIT=val \
    bazel test //tools/data_converter/argoverse2:pytest_converter
  • pytest_utils is a data-free unit test for the VLP-32C model derivation: reconstruction accuracy on a synthetic sweep, the >1-revolution column wrap, cross-frame phase generalization, and intra-sweep rate drift. Each guard was verified to fail when its fix is reverted.
  • pytest_converter (manual / AV2_DIR-gated) validates the full pipeline against a real val log, including a lidar-vs-cuboid alignment check and the model-reconstruction accuracy guard.

Builds on the merged MotionCompensator.reference_timestamp_us / anchor_frame_id work (#146, #147) and the lidar-model robustness fixes (#148).

@copy-pr-bot

copy-pr-bot Bot commented Jun 10, 2026

Copy link
Copy Markdown

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@janickm janickm self-assigned this Jun 10, 2026
@janickm janickm force-pushed the fresh-salamander branch 4 times, most recently from 664fe98 to deffc27 Compare June 13, 2026 11:42
@janickm janickm marked this pull request as ready for review June 13, 2026 16:17
@janickm janickm force-pushed the fresh-salamander branch 3 times, most recently from cd621f7 to 333a0d0 Compare June 16, 2026 11:26
@janickm janickm added the enhancement New feature or request label Jun 16, 2026
@janickm janickm force-pushed the fresh-salamander branch 5 times, most recently from 9688750 to d4baeb8 Compare June 16, 2026 15:20
Add a converter for the Argoverse 2 (AV2) Sensor Dataset, closing the
Argoverse 2 half of NVIDIA#123 (the nuScenes half landed in NVIDIA#128).

The converter reads the AV2 on-disk Apache Feather files directly with
pyarrow (Arrow tables, no pandas) and deliberately avoids the heavy av2
devkit (torch, kornia, numba, polars, PyAV). The only added dependency is
pyquaternion, already used in this package.

Sensors:
- Cameras: all 9 global-shutter cameras (7 ring + 2 stereo). AV2 imagery is
  shipped undistorted, so the stored model is pinhole with zero distortion
  and ShutterType.GLOBAL.
- Lidar: the two stacked Velodyne VLP-32C units (up_lidar / down_lidar) are
  stored separately, each with its own static extrinsic. AV2 sweeps are
  egomotion-compensated to the sweep reference timestamp (the sweep start)
  and expressed in the egovehicle frame, with real per-point timestamps
  (offset_ns). Points are mapped into each unit's sensor frame and
  decompensated -- referenced to the sweep start -- using the shared
  MotionCompensator reference_timestamp_us, so NCore stores raw
  per-point-time directions.
- Radar: AV2 has none.
- Cuboids: native to the egovehicle frame at the sweep time; stored in the
  rig frame at that timestamp with no ego pose baked in, so egomotion stays
  swappable downstream (a V4 feature).

Lidar unit split: AV2 shares one laser_number range [0,63] across both
units with no documented up/down mapping. The two halves (<32 / >=32) are
the two units; the up/down label is recovered from extrinsic geometry by
per-beam elevation flatness (a ring is a constant-elevation cone only in
its own sensor frame), stable per log with a wide margin.

Structured VLP-32C model: AV2 provides no native firing-column index, but
offset_ns + laser_number reconstruct it (offset_ns -> firing column at
10 Hz, laser_number -> beam/row). A structured model is derived per unit
and stored as intrinsics with per-point model_element. Two things are
essential for native-sensor accuracy: (1) the geometry is derived from the
decompensated reference sweep (the compensated cloud is azimuth-smeared by
ego motion ~0.5 deg); (2) per-row azimuth offsets are fit empirically (the
32 beams of a column span ~8.5 deg, so they are not co-azimuthal). The two
units fire in opposite phase and thus spin oppositely in their own frames
(one cw, one ccw), detected from the data. Result: ~0.0 deg median /
~0.02 deg p95 far-range reconstruction. --lidar-model-source {nominal,none}
gates emission.

Coordinate frames: the first ego pose's city_SE3_egovehicle is stored as
the static world -> world_global anchor, so world_global is the AV2 city
frame and absolute coordinates stay recoverable for later HD-map alignment.

Includes an AV2_DIR-gated integration test (validated against a real val
log), docs, and dependency wiring.
@janickm janickm force-pushed the fresh-salamander branch from d4baeb8 to f6ed44b Compare June 16, 2026 16:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

nuScenes and Argoverse 2 support

1 participant