fix multipart RDMA: propagate rdmaclient, per-part CRC64NVME, complete XML #226

Merged: harshavardhana merged 2 commits into minio:main from harshavardhana:fix/rdma-multipart-checksum on May 27, 2026
@harshavardhana (Member)

Summary

The multipart RDMA path (added in #214) never actually used RDMA: every part silently went over plain HTTP. Once a server-side CRC64NVME requirement was added, uploads also started failing with InvalidPart on Complete. Three latent bugs were responsible:

  1. Client::PutObject's multipart loop never propagated rdmaclient to UploadPartArgs. BaseClient::UploadPart's RDMA branch (which keys on args.rdmaclient != nullptr) was unreachable for multipart — every part went over plain HTTP.
  2. No per-part CRC64NVME was computed or sent. When the server (or bucket policy) requires CRC64NVME, the part request was rejected with (checksum missing, want "CRC64NVME", got "").
  3. Part struct had no checksum field and CompleteMultipartUpload XML omitted <ChecksumCRC64NVME>. Even if parts uploaded successfully, the server's part-list validation failed with InvalidPart.
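
To illustrate bug 3, here is a minimal sketch of a Complete body builder that emits `<ChecksumCRC64NVME>` only when a part carries a checksum. It is not the code in baseclient.cc: the `Part` layout (apart from `checksum_crc64nvme`, which this PR names) and the `BuildCompleteXml` helper are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the SDK's Part type. Only checksum_crc64nvme is
// named by this PR; the other fields are assumed for the sketch.
struct Part {
  unsigned int number;
  std::string etag;
  std::string checksum_crc64nvme;  // base64 text; empty means "not computed"
};

// Builds a CompleteMultipartUpload request body. The checksum element is
// written only for parts whose checksum is non-empty, so callers that never
// set it get the same XML as before.
std::string BuildCompleteXml(const std::vector<Part>& parts) {
  std::string xml = "<CompleteMultipartUpload>";
  for (const Part& part : parts) {
    xml += "<Part>";
    xml += "<PartNumber>" + std::to_string(part.number) + "</PartNumber>";
    xml += "<ETag>" + part.etag + "</ETag>";
    if (!part.checksum_crc64nvme.empty()) {
      xml += "<ChecksumCRC64NVME>" + part.checksum_crc64nvme +
             "</ChecksumCRC64NVME>";
    }
    xml += "</Part>";
  }
  xml += "</CompleteMultipartUpload>";
  return xml;
}
```

Keeping the element conditional is what makes the change benign for non-RDMA builds: a part list without checksums serializes exactly as it did previously.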

Changes

  • utils.h / utils.cc: add Crc64Nvme(data, len) and Crc64NvmeBase64(data, len). Table-driven NVMe E2E CRC-64 (polynomial 0xad93d23594c93659, reflected, init/xor 0xffffffffffffffff), lazily initialised via local static. Verified against the NVMe spec test vector 123456789 → 0xae8b14860a799888.
  • types.h: add Part::checksum_crc64nvme and a 3-arg constructor.
  • baseclient.cc: emit <ChecksumCRC64NVME> in CompleteMultipartUpload XML when the per-part checksum is non-empty.
  • client.cc: in PutObject's multipart loop: declare x-amz-checksum-algorithm: CRC64NVME on Create, propagate up_args.rdmaclient, compute per-part CRC64NVME on host buffers (cuObjClient::getMemoryType == CUOBJ_MEMORY_SYSTEM), set both up_args.checksum_crc64nvme (consumed by rdma.h) and up_args.headers["x-amz-checksum-crc64nvme"] (consumed by the HTTP fallback in BaseClient::UploadPart), and propagate the checksum into the assembled Part list.
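
For readers unfamiliar with the checksum, the following is a minimal sketch of a table-driven CRC-64 using the parameters listed above (polynomial 0xad93d23594c93659, reflected, init/xor 0xffffffffffffffff, table built lazily via a local static). It is not the code added to utils.cc: the `sketch` namespace and helper names are illustrative, and encoding the digest as big-endian bytes before base64 is an assumption about the header format.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

namespace sketch {

// Polynomial in normal (MSB-first) form, as quoted in the PR description.
constexpr std::uint64_t kPolyNormal = 0xad93d23594c93659ULL;

// Reverses the bit order of a 64-bit value.
inline std::uint64_t Reflect64(std::uint64_t v) {
  std::uint64_t r = 0;
  for (int i = 0; i < 64; ++i) {
    r = (r << 1) | (v & 1);
    v >>= 1;
  }
  return r;
}

// Builds the 256-entry lookup table once, on first use (thread-safe in C++11).
inline const std::array<std::uint64_t, 256>& Table() {
  static const std::array<std::uint64_t, 256> table = [] {
    const std::uint64_t poly = Reflect64(kPolyNormal);
    std::array<std::uint64_t, 256> t{};
    for (std::size_t i = 0; i < t.size(); ++i) {
      std::uint64_t crc = i;
      for (int bit = 0; bit < 8; ++bit) {
        crc = (crc & 1) ? (crc >> 1) ^ poly : (crc >> 1);
      }
      t[i] = crc;
    }
    return t;
  }();
  return table;
}

// Reflected CRC-64 with init and final xor of all ones.
inline std::uint64_t Crc64Nvme(const void* data, std::size_t len) {
  const auto* p = static_cast<const unsigned char*>(data);
  const auto& table = Table();
  std::uint64_t crc = 0xffffffffffffffffULL;
  for (std::size_t i = 0; i < len; ++i) {
    crc = table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);
  }
  return crc ^ 0xffffffffffffffffULL;
}

// Base64 of the 8 digest bytes. Big-endian byte order is an assumption here.
inline std::string Crc64NvmeBase64(const void* data, std::size_t len) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  const std::uint64_t crc = Crc64Nvme(data, len);
  unsigned char b[9] = {0};  // 8 digest bytes plus one zero byte of padding
  for (int i = 0; i < 8; ++i) {
    b[i] = static_cast<unsigned char>((crc >> (56 - 8 * i)) & 0xff);
  }
  std::string out;
  for (int i = 0; i < 9; i += 3) {
    const std::uint32_t n = (static_cast<std::uint32_t>(b[i]) << 16) |
                            (static_cast<std::uint32_t>(b[i + 1]) << 8) |
                            static_cast<std::uint32_t>(b[i + 2]);
    out += kAlphabet[(n >> 18) & 63];
    out += kAlphabet[(n >> 12) & 63];
    out += kAlphabet[(n >> 6) & 63];
    out += kAlphabet[n & 63];
  }
  out.back() = '=';  // 8 input bytes always end with a single pad character
  return out;
}

}  // namespace sketch
```

Calling `sketch::Crc64Nvme("123456789", 9)` should reproduce the test vector quoted above, 0xae8b14860a799888.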

GPU buffers (CUDA_DEVICE / CUDA_MANAGED) cannot be hashed by the CPU. In that case we leave the checksum empty, and either the RDMA path surfaces a 501 to the caller or the caller supplies a pre-computed checksum via args.headers.
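
The per-part decision can be sketched roughly as follows. Everything here is a stand-in: `MemoryType`, the trimmed-down `UploadPartArgs`, `ChecksumFn`, and `PreparePart` are hypothetical and only mirror the field names mentioned in this PR; the real code queries `cuObjClient::getMemoryType` and lives in client.cc.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical mirror of the memory kinds discussed above.
enum class MemoryType { kSystem, kCudaDevice, kCudaManaged };

// Hypothetical, trimmed-down version of the per-part arguments.
struct UploadPartArgs {
  const void* rdmaclient = nullptr;
  std::string checksum_crc64nvme;
  std::map<std::string, std::string> headers;
};

// Any function returning a base64 CRC64NVME for a host buffer.
using ChecksumFn = std::function<std::string(const void*, std::size_t)>;

// Propagates the RDMA client and, for host memory only, fills in the checksum
// in both places: the args field and the HTTP header.
void PreparePart(UploadPartArgs& up_args, const void* rdmaclient,
                 MemoryType type, const void* buf, std::size_t len,
                 const ChecksumFn& checksum) {
  up_args.rdmaclient = rdmaclient;
  if (type != MemoryType::kSystem) {
    // GPU memory: the CPU cannot hash it, so leave the checksum empty unless
    // the caller already supplied one through the headers.
    return;
  }
  const std::string sum = checksum(buf, len);
  up_args.checksum_crc64nvme = sum;
  up_args.headers["x-amz-checksum-crc64nvme"] = sum;
}
```

Setting both the field and the header means the same `UploadPartArgs` works whether the part ends up on the RDMA path or on the HTTP fallback.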

Verification

Tested against a live minio.rdma server (15.15.15.59:9200) across a Mellanox mlx5_0 fabric, using the existing PutObject example with a 20 MiB file forcing 2 parts (16 MiB + 4 MiB):

$ ./PutObject 15.15.15.59:9200 minioadmin minioadmin
my-object is successfully created etag=fe2f8e526e386a020064a32911319808-2

The -2 suffix on the final ETag confirms the multipart path ran, and the upload no longer falls back to HTTP.
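
As a worked check of the part count, the sketch below splits an object into fixed-size parts. The 16 MiB part size is inferred from the "16 MiB + 4 MiB" split reported above, and `SplitIntoParts` is a hypothetical helper, not an SDK function.

```cpp
#include <cstdint>
#include <vector>

// Returns the size of each part when an object is cut into part_size chunks;
// the last part holds the remainder. Returns an empty list if part_size is 0.
std::vector<std::uint64_t> SplitIntoParts(std::uint64_t object_size,
                                          std::uint64_t part_size) {
  std::vector<std::uint64_t> parts;
  if (part_size == 0) return parts;
  while (object_size > 0) {
    const std::uint64_t n = object_size < part_size ? object_size : part_size;
    parts.push_back(n);
    object_size -= n;
  }
  return parts;
}
```

With a 20 MiB object and a 16 MiB part size this yields two parts of 16 MiB and 4 MiB, matching the `-2` suffix on the ETag.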

Test plan

  • Reviewer builds with -DMINIO_CPP_ENABLE_RDMA=ON — should compile clean.
  • Reviewer builds with the default (no RDMA) — should be unaffected (Part 3-arg ctor is benign; Complete XML change only emits the element when checksum is non-empty).
  • If reviewer has access to a cuObjServer-aware MinIO endpoint and an RDMA fabric, run any large-file PutObject (>16 MiB) and confirm it succeeds.

Companion PR

A matching native multipart-RDMA implementation in minio-rs is posted alongside this one (it ran into the same CRC64NVME and Complete-XML requirements during initial development; both SDKs now agree against the same server).

…e XML

The RDMA UploadPart server path requires a per-part CRC64NVME checksum
when CreateMultipartUpload declared the algorithm, and the
CompleteMultipartUpload request must include the per-part checksum so
the server can verify the assembled object. Three latent bugs in the
multipart RDMA path caused every multipart upload to either silently
fall back to HTTP or fail with InvalidPart on Complete:

1. Client::PutObject's multipart loop never set
   `up_args.rdmaclient = args.rdmaclient`, so BaseClient::UploadPart's
   RDMA branch (which keys on `args.rdmaclient != nullptr`) was
   unreachable for multipart. Every part went over plain HTTP.

2. Even with rdmaclient propagated, no per-part CRC64NVME was computed
   or attached, so the server rejected with
   `(checksum missing, want "CRC64NVME", got "")` after we started
   declaring the algorithm on Create.

3. The Part struct had no checksum field and the Complete XML omitted
   `<ChecksumCRC64NVME>`, so the server's part-list validation failed
   with InvalidPart even when the parts uploaded successfully.

Changes:

- utils: add `Crc64Nvme` / `Crc64NvmeBase64` (NVMe E2E CRC-64, polynomial
  0xad93d23594c93659, reflected, init/xor 0xffffffffffffffff). Table is
  lazily built once via local static, verified against the NVMe spec
  test vector `123456789 -> 0xae8b14860a799888`.
- types: add `Part::checksum_crc64nvme` and a 3-arg constructor.
- baseclient: emit `<ChecksumCRC64NVME>` in CompleteMultipartUpload XML
  when the per-part checksum is non-empty.
- client: in PutObject's multipart loop, declare CRC64NVME on Create,
  propagate `up_args.rdmaclient`, compute the per-part checksum on host
  buffers (`cuObjClient::getMemoryType == CUOBJ_MEMORY_SYSTEM`), set
  both `up_args.checksum_crc64nvme` (consumed by the RDMA path in
  rdma.h) and `up_args.headers["x-amz-checksum-crc64nvme"]` (consumed
  by the HTTP fallback in BaseClient::UploadPart), and propagate the
  checksum into the assembled Part list.

GPU buffers cannot be hashed by the CPU; when `getMemoryType` returns
CUDA_DEVICE / CUDA_MANAGED we leave the checksum empty and the RDMA
path will surface the 501 to the caller (or the caller supplies a
pre-computed checksum via headers).

Verified end-to-end against a `minio.rdma` server with a 20 MiB file
forcing 2 parts (16 MiB + 4 MiB) on the standard `PutObject` example:
final ETag `fe2f8e526e386a020064a32911319808-2`.

Copilot AI left a comment


Pull request overview

Fixes three latent bugs in the multipart RDMA upload path introduced in #214: (1) rdmaclient was not propagated to per-part UploadPartArgs, (2) no per-part CRC64NVME was computed or sent, and (3) the Part struct/CompleteMultipartUpload XML had no ChecksumCRC64NVME field, causing InvalidPart failures from servers that require the checksum.

Changes:

  • Add utils::Crc64Nvme / utils::Crc64NvmeBase64 helpers (table-driven NVMe E2E CRC-64).
  • Add Part::checksum_crc64nvme and emit <ChecksumCRC64NVME> in the Complete XML.
  • In Client::PutObject's multipart loop, declare CRC64NVME on Create, propagate rdmaclient, compute per-part CRC on host buffers, and pass the checksum through to the assembled Part.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| include/miniocpp/utils.h | Declares the new CRC64NVME helpers. |
| src/utils.cc | Implements the reflected-poly CRC64NVME and its base64 wrapper. |
| include/miniocpp/types.h | Adds checksum_crc64nvme field and 3-arg constructor to Part. |
| src/baseclient.cc | Emits <ChecksumCRC64NVME> in CompleteMultipartUpload XML when present. |
| src/client.cc | Multipart loop now declares CRC64NVME, propagates rdmaclient, computes per-part CRC, and forwards it into the Part list. |
Comment thread src/client.cc Outdated
- Run clang-format-20 --style=Google on src/utils.cc and src/client.cc
  to satisfy the repo's coding style check.
- In Client::PutObject's multipart loop, build the assembled Part with
  up_args.checksum_crc64nvme instead of resp.checksum_crc64nvme.
  BaseClient::UploadPart's HTTP fallback path does not populate the
  response field from x-amz-checksum-crc64nvme, so an HTTP-fallback
  part would otherwise omit <ChecksumCRC64NVME> in
  CompleteMultipartUpload and trigger InvalidPart when the algorithm
  was declared on Create.
@harshavardhana harshavardhana merged commit f35873b into minio:main May 27, 2026
7 checks passed
@harshavardhana harshavardhana deleted the fix/rdma-multipart-checksum branch May 27, 2026 23:08