feat(argoverse2): add Argoverse 2 Sensor Dataset to NCore V4 converter #145
Open
janickm wants to merge 1 commit into
This was referenced Jun 10, 2026
feat(transformations): support custom reference timestamp in motion (de)compensation
janickm/ncore#1
Closed
Add a converter for the Argoverse 2 (AV2) Sensor Dataset, closing the Argoverse 2 half of NVIDIA#123 (the nuScenes half landed in NVIDIA#128).
The converter reads the AV2 on-disk Apache Feather files directly with pyarrow (Arrow tables, no pandas) and deliberately avoids the heavy av2 devkit (torch, kornia, numba, polars, PyAV). The only added dependency is pyquaternion, already used in this package.
Sensors:
- Cameras: all 9 global-shutter cameras (7 ring + 2 stereo). AV2 imagery is shipped undistorted, so the stored model is pinhole with zero distortion and ShutterType.GLOBAL.
- Lidar: the two stacked Velodyne VLP-32C units (up_lidar / down_lidar) are stored separately, each with its own static extrinsic. AV2 sweeps are egomotion-compensated to the sweep reference timestamp (the sweep start) and expressed in the egovehicle frame, with real per-point timestamps (offset_ns). Points are mapped into each unit's sensor frame and decompensated -- referenced to the sweep start -- using the shared MotionCompensator reference_timestamp_us, so NCore stores raw per-point-time directions.
- Radar: AV2 has none.
- Cuboids: native to the egovehicle frame at the sweep time; stored in the rig frame at that timestamp with no ego pose baked in, so egomotion stays swappable downstream (a V4 feature).
Lidar unit split: AV2 shares one laser_number range [0,63] across both units with no documented up/down mapping. The two halves (<32 / >=32) are the two units; the up/down label is recovered from extrinsic geometry by per-beam elevation flatness (a ring is a constant-elevation cone only in its own sensor frame), stable per log with a wide margin.
Structured VLP-32C model: AV2 provides no native firing-column index, but offset_ns + laser_number reconstruct it (offset_ns -> firing column at 10 Hz, laser_number -> beam/row). A structured model is derived per unit and stored as intrinsics with per-point model_element. Two things are essential for native-sensor accuracy: (1) the geometry is derived from the decompensated reference sweep (the compensated cloud is azimuth-smeared by ego motion ~0.5 deg); (2) per-row azimuth offsets are fit empirically (the 32 beams of a column span ~8.5 deg, so they are not co-azimuthal). The two units fire in opposite phase and thus spin oppositely in their own frames (one cw, one ccw), detected from the data. Result: ~0.0 deg median / ~0.02 deg p95 far-range reconstruction. --lidar-model-source {nominal,none} gates emission.
Coordinate frames: the first ego pose's city_SE3_egovehicle is stored as the static world -> world_global anchor, so world_global is the AV2 city frame and absolute coordinates stay recoverable for later HD-map alignment.
Includes an AV2_DIR-gated integration test (validated against a real val log), docs, and dependency wiring.
Summary
Add a converter for the Argoverse 2 (AV2) Sensor Dataset to NCore V4, closing the Argoverse 2 half of #123 (the nuScenes half landed in #128).
The converter reads the AV2 on-disk Apache Feather files directly with `pyarrow` (Arrow tables, no pandas) and deliberately avoids the heavy `av2` devkit (which pulls in torch, kornia, numba, polars, PyAV). The only added dependency is `pyquaternion`, already used in this package.
Closes #123
Sensors
- Cameras: [...] `K` only and does not load the `k1, k2, k3` columns -- so the stored model is an ideal (distortion-free) pinhole (`IdealPinholeCameraModelParameters`, `ShutterType.GLOBAL`). The `k1, k2, k3` in `intrinsics.feather` describe the original lens (for re-distorting into the raw frame) and are intentionally not applied. Note: the stereo pair is genuinely monochrome (R=G=B in the source JPEGs).
- Lidar: the two stacked Velodyne VLP-32C units (`up_lidar` / `down_lidar`) are stored separately, each with its own extrinsic. AV2 sweeps are egomotion-compensated to the sweep reference timestamp (the sweep start) and expressed in the egovehicle frame, with real per-point timestamps (`offset_ns`). Points are mapped into each unit's sensor frame and decompensated -- referenced to the sweep start -- using the shared `MotionCompensator` `reference_timestamp_us`, so NCore stores raw per-point-time directions.
- Cuboids: native to the egovehicle frame at the sweep time; stored in the `rig` frame at that timestamp with no ego pose baked in, so egomotion stays swappable downstream (a V4 feature).
Lidar unit split
AV2 shares one `laser_number` range `[0, 63]` across both units with no documented up/down mapping. The two halves (`< 32` / `>= 32`) are the two physical units; the up/down label is recovered from extrinsic geometry by per-beam elevation flatness (a laser ring traces a constant-elevation cone only in its own sensor frame), decided once per log with a wide, stable margin.
Structured VLP-32C lidar model
AV2 provides no native firing-column index, but `offset_ns` + `laser_number` reconstruct the firing pattern (`offset_ns` -> firing column at 10 Hz, `laser_number` -> beam/row). A structured model is derived per unit and stored as intrinsics with per-point `model_element`. Reaching native-sensor accuracy required several steps, each found by evaluating with `ncore_evaluate_lidar_model` across many logs:
- The two units fire in opposite phase and thus spin oppositely in their own frames (one `cw`, one `ccw`); detected from the data.
- `offset_ns` drifts ~1 deg between sweeps and the spin rate drifts slightly within a sweep on some scenes. Each frame is re-aligned by an affine column remap (constant phase + linear term).
A `--lidar-model-source {empirical,none}` flag (default `empirical`) gates emission.
Validation
Converted 38 `val` logs (76 lidar units) and evaluated each with `ncore_evaluate_lidar_model`.
All 76 units are sub-0.08 deg median with no systematic azimuth or elevation bias -- on par with native-column sensors. (Isolated per-point maxima of 1-16 deg are dynamic objects / steepest grazing-beam returns, which no static spin model can represent; per-frame p95 stays ~0.08 deg.)
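For readers unfamiliar with how median / p95 angular figures of this kind are typically computed, here is a minimal sketch. It is not the implementation of `ncore_evaluate_lidar_model`; the function names and the linear-interpolation percentile are assumptions chosen for illustration.

```python
import math
import statistics


def angular_error_deg(u, v):
    """Angle in degrees between two 3D direction vectors (need not be unit length)."""
    dot = sum(a * b for a, b in zip(u, v))
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    # atan2(|u x v|, u . v) stays accurate for very small angles, unlike acos.
    return math.degrees(math.atan2(math.hypot(*cross), dot))


def percentile(values, q):
    """Linearly interpolated percentile, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def summarize(measured, modeled):
    """Median and p95 angular error between measured and model-predicted directions."""
    errors = [angular_error_deg(m, p) for m, p in zip(measured, modeled)]
    return {
        "median_deg": statistics.median(errors),
        "p95_deg": percentile(errors, 95),
    }


# Toy input: one perfect match and one direction that is 45 degrees off.
stats = summarize(
    measured=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    modeled=[(1.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
)
```

Reporting both a median and a high percentile separates systematic model bias (which moves the median) from isolated outliers such as the dynamic-object returns noted above.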
Coordinate frames
The first ego pose's `city_SE3_egovehicle` is stored as the static `world -> world_global` anchor, so `world_global` is the AV2 city frame and absolute coordinates stay recoverable for later HD-map alignment.
Testing
- `pytest_utils` is a data-free unit test for the VLP-32C model derivation: reconstruction accuracy on a synthetic sweep, the >1-revolution column wrap, cross-frame phase generalization, and intra-sweep rate drift. Each guard was verified to fail when its fix is reverted.
- `pytest_converter` (manual / `AV2_DIR`-gated) validates the full pipeline against a real `val` log, including a lidar-vs-cuboid alignment check and the model-reconstruction accuracy guard.
Builds on the merged `MotionCompensator.reference_timestamp_us` / `anchor_frame_id` work (#146, #147) and the lidar-model robustness fixes (#148).
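To make the data-free checks above more concrete, the following sketch shows one way a per-point time offset could be mapped to a firing column, including the wrap past one revolution and an affine remap (constant phase + linear term). Everything here is illustrative: `firing_column`, `split_laser`, the column count of 1800, and the `phase` / `rate` parameters are hypothetical names and values, not the converter's actual code or the real VLP-32C column count. Treating `laser_number % 32` as the row within a unit is likewise an assumption.

```python
import math

SWEEP_PERIOD_NS = 100_000_000  # one sweep at the 10 Hz rate stated in the description


def firing_column(offset_ns, n_columns, phase=0.0, rate=1.0):
    """Map a per-point time offset to a firing-column index in [0, n_columns).

    phase and rate form a hypothetical affine remap (constant phase + linear term)
    that could absorb per-frame phase drift and slight spin-rate drift.
    """
    raw = offset_ns / SWEEP_PERIOD_NS * n_columns
    remapped = phase + rate * raw
    # Round to the nearest column, then wrap so that sweeps slightly longer
    # than one revolution (or negative phases) fold back into range.
    return math.floor(remapped + 0.5) % n_columns


def split_laser(laser_number):
    """Split laser_number in [0, 63] into (half index, row within that half).

    Assumption: each half of the range is one physical unit with 32 rows.
    """
    return laser_number // 32, laser_number % 32


N_COLUMNS = 1800  # hypothetical column count, chosen only for the example

start = firing_column(0, N_COLUMNS)
middle = firing_column(50_000_000, N_COLUMNS)
wrapped = firing_column(100_000_000, N_COLUMNS)
shifted = firing_column(0, N_COLUMNS, phase=-2.0)
```

A unit test in this style can assert that a point exactly one sweep period after the start lands back on column 0, which is the kind of wrap guard the description mentions.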