Expose Multi-Segment / Scatter-Gather Payload APIs in zenoh-python#732

Open
BigTailFox wants to merge 1 commit into eclipse-zenoh:main from BigTailFox:feat/segmentable-payload
BigTailFox commented Jun 2, 2026

This commit contains a prototype implementation for #730.

It adds a ZBytes.from_segments(..., copy=False) path that can construct a ZBytes payload from multiple immutable Python byte segments without copying the segment contents into Zenoh-owned memory.

The main motivation is to support efficient integration with serialization libraries that naturally expose multipart payloads, such as Cap’n Proto via msg.to_segments().

Motivation

Some Python serialization stacks produce payloads as multiple byte segments instead of one contiguous bytes object.

For example, Cap’n Proto can serialize a message as:

segments = msg.to_segments()

where segments is a list of Python bytes.

Before this change, passing those segments into Zenoh still required copying the serialized payload into Zenoh-owned memory. For large camera frames, tensors, or robotics simulation payloads, this extra copy can be a significant part of the publisher-side latency.

The proposed copy=False mode allows ZBytes to retain references to the original immutable Python segment owners and expose them as Zenoh payload slices, avoiding this additional copy.

Example Usage

segments = msg.to_segments()
payload = ZBytes.from_segments(segments, copy=False)

publisher.put(payload)
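
To make the difference between the two modes concrete without depending on the prototype, here is a minimal pure-Python sketch. `SegmentedPayload` is a hypothetical stand-in and not the zenoh-python API; it only illustrates that the retained-reference path keeps the original `bytes` objects, while the copying path allocates new ones.

```python
class SegmentedPayload:
    """Hypothetical stand-in for a multi-segment payload (not the zenoh API)."""

    def __init__(self, segments, copy=True):
        if copy:
            # bytes(memoryview(seg)) allocates a fresh bytes object per segment.
            self._segments = [bytes(memoryview(seg)) for seg in segments]
        else:
            # Keep references to the caller's objects; no segment data is copied.
            self._segments = list(segments)

    def segments(self):
        return list(self._segments)

    def __len__(self):
        return sum(len(seg) for seg in self._segments)


segs = [b"header", b"x" * 1024]
borrowed = SegmentedPayload(segs, copy=False)
copied = SegmentedPayload(segs, copy=True)

print(borrowed.segments()[1] is segs[1])  # same object, nothing copied
print(copied.segments()[1] is segs[1])    # distinct object with equal contents
```

In both modes the logical payload is identical; only the ownership of the underlying memory differs.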

Performance Evaluation

I benchmarked the new ZBytes.from_segments(..., copy=False) path with a Cap’n Proto tensor message payload.

The test payload is a 1920 x 1536 int32 tensor, serialized as a Cap’n Proto message:

  • Raw tensor size: ~11.25 MiB
  • Cap’n Proto to_segments() output:
    • segment 0: ~0.039 KiB
    • segment 1: ~11,520.008 KiB
  • Segment type: Python bytes
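
The raw size follows directly from the tensor shape and element width. A quick check of the arithmetic:

```python
SHAPE = (1920, 1536)      # tensor shape from the benchmark
BYTES_PER_INT32 = 4

raw_bytes = SHAPE[0] * SHAPE[1] * BYTES_PER_INT32
raw_kib = raw_bytes / 1024
raw_mib = raw_bytes / 1024**2

print(raw_bytes)  # 11796480
print(raw_kib)    # 11520.0
print(raw_mib)    # 11.25
```

Segment 1 is reported as ~11,520.008 KiB, about 8 bytes more than the raw data. That small excess is presumably Cap’n Proto pointer or framing overhead, which is an inference rather than something measured here.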

Benchmark Results

For constructing ZBytes from Cap’n Proto segments:

ZBytes.from_segments(..., copy=False)
mean = 0.0074 ms
p50  = 0.0064 ms
p90  = 0.0097 ms
p99  = 0.0182 ms

ZBytes.from_segments(..., copy=True)
mean = 4.1375 ms
p50  = 3.0387 ms
p90  = 7.4486 ms
p99  = 10.4271 ms

In this benchmark, the zero-copy path reduces the ZBytes construction cost from approximately 4.1 ms to approximately 0.007 ms for an 11.25 MiB payload.

This changes the cost model from being proportional to the total payload size to being proportional to the number of segments.
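
For readers who want to reproduce the shape of this comparison without the prototype, the following is a rough sketch of a timing harness. It is not the script used for the numbers above: `b"".join(...)` stands in for the copying path and `tuple(...)` for the reference-retaining path, and the helper names are made up for illustration.

```python
import math
import statistics
import time


def percentile(sorted_samples, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    rank = max(1, math.ceil(q * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


def summarize(samples_ms):
    ordered = sorted(samples_ms)
    return {
        "mean": statistics.fmean(ordered),
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p99": percentile(ordered, 99),
    }


def time_ms(fn, repeats=30):
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return samples


# Synthetic segments shaped like the benchmark payload:
# a ~40-byte header segment and a ~11.25 MiB body segment.
segments = [bytes(40), bytes(1920 * 1536 * 4 + 8)]

copy_stats = summarize(time_ms(lambda: b"".join(segments)))   # touches every byte
retain_stats = summarize(time_ms(lambda: tuple(segments)))    # touches one reference per segment

print("copy  ", copy_stats)
print("retain", retain_stats)
```

The copying stand-in scales with total bytes, while the retaining stand-in scales with the number of segments, which is the cost-model change described above. Absolute timings will differ by machine.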

End-to-End Context

In the tested Cap’n Proto path, the relevant publisher-side payload construction pipeline is:

Cap’n Proto message
  -> msg.to_segments()
  -> ZBytes.from_segments(...)

The measured msg.to_segments() cost is approximately:

msg.to_segments()
mean = 4.9673 ms
p50  = 6.4139 ms
p90  = 7.8883 ms
p99  = 12.6092 ms

With copy=True, the path is approximately:

msg.to_segments()                  ~4.97 ms
ZBytes.from_segments(copy=True)    ~4.14 ms
Total                              ~9.10 ms

With copy=False, the path becomes:

msg.to_segments()                  ~4.97 ms
ZBytes.from_segments(copy=False)   ~0.007 ms
Total                              ~4.98 ms

Therefore, for this ~11.25 MiB tensor payload, the prototype removes approximately 4.1 ms of publisher-side copy overhead, cutting the measured payload construction path from ~9.10 ms to ~4.98 ms, a reduction of roughly 45%.
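
The totals and the percentage can be re-derived from the mean values reported above:

```python
to_segments_ms = 4.9673              # mean of msg.to_segments()
from_segments_copy_ms = 4.1375       # mean of from_segments(copy=True)
from_segments_zero_copy_ms = 0.0074  # mean of from_segments(copy=False)

total_copy = to_segments_ms + from_segments_copy_ms
total_zero_copy = to_segments_ms + from_segments_zero_copy_ms
saved = total_copy - total_zero_copy
reduction = saved / total_copy

print(f"{total_copy:.2f} ms -> {total_zero_copy:.2f} ms")
print(f"saved {saved:.2f} ms ({reduction:.0%})")
```

Summing the unrounded means gives about 4.97 ms for the zero-copy path; the ~4.98 ms figure comes from adding the already-rounded values 4.97 and 0.007.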

This is particularly useful for robotics simulation workloads where large camera images, segmentation masks, depth maps, or tensor payloads are published at high frequency.

Scope and Limitations

This is not a full end-to-end zero-copy path from the original NumPy / Torch tensor memory into the Zenoh transport layer.

In the Cap’n Proto case, msg.to_segments() still materializes Python bytes segments. This PR specifically removes the additional copy from:

Python bytes segments -> ZBytes / Zenoh-owned payload memory

The optimization is therefore limited to that final copy, but it remains valuable for existing Python serialization libraries that already expose immutable byte segments.

Lifetime and Safety Model

The intended safety model is:

  • copy=True
    • Copies each segment into Zenoh-owned memory.
    • The resulting ZBytes is independent of the original Python objects.
  • copy=False
    • Does not copy segment contents.
    • ZBytes retains ownership references to the Python segment objects.
    • The original Python segment memory remains alive for as long as Zenoh may still reference the payload.

For the initial prototype, I intentionally kept the supported input types conservative. The safest zero-copy case is immutable Python bytes, because the memory cannot be mutated after the ZBytes has been created.

Mutable buffers such as bytearray or writable memoryview are more subtle because the application could mutate the payload while Zenoh is still using it. I would like maintainer feedback on what input types should be accepted for copy=False.
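
The hazard with mutable buffers can be shown with standard-library types alone. In this small illustration, a `memoryview` plays the role of a borrowed (zero-copy) reference and `bytes(...)` plays the role of a copy:

```python
frozen = b"\x01\x02\x03"
scratch = bytearray(b"\x01\x02\x03")

view = memoryview(scratch)   # borrows scratch's memory, no copy
snapshot = bytes(scratch)    # independent copy of the current contents

scratch[0] = 0xFF            # application mutates the buffer after handing it off

print(view.tobytes())        # b'\xff\x02\x03'  (the borrowed view sees the mutation)
print(snapshot)              # b'\x01\x02\x03'  (the copy does not)
print(memoryview(frozen).readonly, view.readonly)  # True False
```

With `bytes` there is no equivalent of the `scratch[0] = 0xFF` line, which is why it is the safest input for a borrowed path.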

Questions for Maintainers

I would appreciate feedback on the following points before turning this prototype into a final implementation:

1. API shape

Is ZBytes.from_segments(segments, copy=False) the right API, or would maintainers prefer a more explicit name for the borrowed / retained-owner path?

For example:

ZBytes.from_segments(segments, copy=False)
ZBytes.from_segments_borrowed(segments)
ZBytes.from_segments_zero_copy(segments)

2. Supported input types for copy=False

Should the initial implementation only accept immutable bytes, or should it also accept read-only memoryview objects when they are C-contiguous and byte-addressable?

My current preference is to start conservatively with immutable / read-only inputs only.

3. Fallback behavior

If an unsupported buffer type is passed with copy=False, should the function:

  • raise an error, or
  • silently fall back to copying?

My current preference is to raise an explicit error, because silent fallback would make performance unpredictable.
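
As a sketch of what questions 2 and 3 could look like together, the following hypothetical validator accepts `bytes` and read-only, C-contiguous, byte-sized `memoryview` objects, and raises otherwise. The function names are invented for illustration and do not reflect the prototype's actual checks.

```python
def is_zero_copy_safe(segment):
    """Return True if the segment could plausibly be borrowed without copying."""
    if isinstance(segment, bytes):
        return True
    if isinstance(segment, memoryview):
        # Caveat: a read-only view does not prove the underlying buffer is
        # immutable (e.g. memoryview(bytearray(...)).toreadonly()).
        return segment.readonly and segment.c_contiguous and segment.itemsize == 1
    return False


def check_zero_copy_segment(segment):
    """Raise instead of silently falling back to a copy."""
    if not is_zero_copy_safe(segment):
        raise TypeError(
            f"copy=False does not support this {type(segment).__name__}; "
            "pass copy=True or an immutable bytes object"
        )
    return segment


print(is_zero_copy_safe(b"abc"))                         # True
print(is_zero_copy_safe(memoryview(b"abc")))             # True
print(is_zero_copy_safe(bytearray(b"abc")))              # False
print(is_zero_copy_safe(memoryview(bytearray(b"abc"))))  # False
```

The caveat in the comment is one reason to consider accepting only `bytes` at first: a read-only view restricts the holder of the view, not the owner of the memory.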

4. Lifetime guarantees

Does retaining Python object owners inside ZBytes match the expected Zenoh payload lifetime model?

The goal is to ensure that the Python segment owners remain alive until Zenoh no longer references the payload, including asynchronous publisher.put(...) use cases.
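
The intended lifetime behavior can be illustrated in plain Python. `Segment` and `RetainingPayload` are hypothetical classes, used only because plain `bytes` objects cannot be weakly referenced; the sketch shows that the owner stays alive while the payload exists, even after the caller drops its own reference.

```python
import gc
import weakref


class Segment:
    """Weak-referenceable wrapper around a bytes object (illustration only)."""

    def __init__(self, data: bytes):
        self.data = data


class RetainingPayload:
    """Hypothetical payload that holds strong references to its segment owners."""

    def __init__(self, owners):
        self._owners = tuple(owners)


owner = Segment(b"frame-bytes")
probe = weakref.ref(owner)      # lets us observe whether the owner is still alive

payload = RetainingPayload([owner])
del owner                       # caller drops its reference
gc.collect()
alive_while_payload_exists = probe() is not None

del payload                     # last strong reference goes away
gc.collect()
alive_after_payload_dropped = probe() is not None

print(alive_while_payload_exists, alive_after_payload_dropped)
```

For asynchronous use, the open question is which object plays the role of `payload` here once `publisher.put(...)` has returned, since that object determines when the segment memory may be released.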

5. Documentation expectations

Are there specific Zenoh Python documentation conventions I should follow for documenting the difference between copy=True and copy=False?

6. Test coverage

Besides basic round-trip tests, I plan to add tests for:

  • multipart bytes segments
  • copy=True vs copy=False behavior
  • empty segments
  • payload round-trip through payload.segments()
  • rejection of unsupported mutable buffers when copy=False

Please let me know if there are additional edge cases maintainers would like covered.

Summary

This prototype shows that avoiding the extra Python-segments-to-Zenoh copy can save around 4 ms for an 11.25 MiB Cap’n Proto payload.

The implementation is intended to be a conservative and practical zero-copy bridge for serialization libraries that already expose immutable multipart byte segments.

I would appreciate maintainer feedback on the API shape, accepted input types, fallback behavior, and lifetime model before polishing this into a final PR.

…rt multipart payload and potential zero-copy path between python buffer object and zenoh-python.