feat(message_bus): add QUIC, TCP-TLS, WS, WSS transports for SDK clients #3192
Replica plane stays TCP forever: VSR FIFO + view-change timing, fd-delegation, writev batching all rely on plaintext between trusted replicas.

SDK-client plane gains four transports alongside TCP:

- QUIC: shard-0 terminal (compio-quic CID demux), 1 bidi stream per peer, 0-RTT off + listener defense-in-depth reject.
- TCP-TLS: rustls 1.3, no client auth, 0-RTT off, compio-tls behind unified TransportConn::run with bounded close_grace shutdown.
- WS: compio-ws over plaintext TCP; pre-upgrade fd cross-shard handover keeps fd-delegation on plain TCP only.
- WSS: WebSocketStream over TlsStream; both handshakes run on the per-connection install task.

Shared: TransportListener / TransportConn trait family; WebSocketConfig + close_grace threaded through MessageBusConfig and applied uniformly across TCP-TLS, WS, WSS; bounded safe-shutdown (no select! over stream.shutdown); single-task pump per WS/WSS using compio-ws cancel-safe read.

Bus auth thin: both planes connect unauthenticated; server-ng gates via LOGIN_USER / LOGIN_WITH_PAT and future LOGIN_REPLICA. Ping announces replica_id only; no subprotocol, no ALPN, no MAC.

Per-connection metadata flows via IggyMessageBus::client_meta; ShardFramePayload setup variants carry ClientConnMeta end to end.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #3192 +/- ##
=============================================
- Coverage 74.08% 63.76% -10.33%
Complexity 943 943
=============================================
Files 1159 1175 +16
Lines 102033 94258 -7775
Branches 79084 71326 -7758
=============================================
- Hits 75593 60103 -15490
- Misses 23770 31300 +7530
- Partials 2670 2855 +185
atharvalade left a comment
Found these during the first round of review. I'll continue reviewing later. Overall it looks good.
    if body.len() < iggy_binary_protocol::HEADER_SIZE {
        return Err(FrameDecodeError::BadHeader);
    }
    let total_size = u32::from_le_bytes(
framing.rs has a const _: () = { assert!(offset_of!(GenericHeader, size) == 48) } guard but this function duplicates the same 48..52 magic without one. If GenericHeader layout ever shifts, TCP/QUIC get a compile error while WS/WSS silently read the wrong bytes.
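
One possible shape for the fix, as a minimal sketch rather than the PR's actual code: derive the byte range from a single constant that carries the compile-time guard, so every transport reads the same offset. The `GenericHeader` layout below is a hypothetical 256-byte stand-in, and `SIZE_OFFSET` and `read_total_size` are illustrative names that do not come from the PR.

```rust
use std::mem::offset_of;

// Hypothetical stand-in for the real GenericHeader; only the position of
// `size` matters for this sketch.
#[repr(C)]
struct GenericHeader {
    _prefix: [u8; 48],
    size: u32,
    _rest: [u8; 204],
}

// Single source of truth for the offset, shared by every transport.
const SIZE_OFFSET: usize = offset_of!(GenericHeader, size);

// Compile-time guard: a layout change fails the build for all callers.
const _: () = assert!(SIZE_OFFSET == 48);

// Reads the little-endian `size` field, or None if the body is too short.
fn read_total_size(body: &[u8]) -> Option<u32> {
    let b = body.get(SIZE_OFFSET..SIZE_OFFSET + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}
```

With this shape the WS/WSS decoder would call the same helper as TCP/QUIC instead of repeating the 48..52 literal.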
    in_tx,
    rx,
    shutdown,
    label,
run_pump drops max_message_size into .. and decode_consensus_frame hardcodes framing::MAX_MESSAGE_SIZE. TCP and QUIC paths honor the per-bus config value, so an operator who lowers max_message_size gets enforcement on TCP/QUIC but not WS/WSS.
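
A rough sketch of what threading the limit through could look like, assuming the decoder takes the configured value as a parameter instead of reading a global constant. The `HEADER_SIZE` value, the `TooLarge` variant, and the function signature are assumptions made for illustration, not the PR's definitions.

```rust
// Placeholder value; the real constant lives in iggy_binary_protocol.
const HEADER_SIZE: usize = 256;
const SIZE_OFFSET: usize = 48;

#[derive(Debug, PartialEq)]
enum FrameDecodeError {
    BadHeader,
    // Hypothetical variant for frames above the configured limit.
    TooLarge { size: usize, max: usize },
}

// Validates the header's size field against the per-bus limit passed in by
// the caller, and returns the declared total size on success.
fn decode_consensus_frame(
    body: &[u8],
    max_message_size: usize,
) -> Result<usize, FrameDecodeError> {
    if body.len() < HEADER_SIZE {
        return Err(FrameDecodeError::BadHeader);
    }
    let b = &body[SIZE_OFFSET..SIZE_OFFSET + 4];
    let total = u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize;
    if total < HEADER_SIZE {
        return Err(FrameDecodeError::BadHeader);
    }
    if total > max_message_size {
        return Err(FrameDecodeError::TooLarge { size: total, max: max_message_size });
    }
    Ok(total)
}
```

The pump would then pass its configured `max_message_size` on every call, so lowering the value takes effect on WS/WSS the same way it does on TCP/QUIC.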
    .per_client
    .entry(client)
    .or_insert_with(|| PerClient::with_capacity(self.per_client_capacity));
    if state.find(request).is_some() {
lookup treats TTL-expired Done entries as Fresh, but mark_in_flight calls find() without a TTL check. A client retrying after TTL expiry sees Fresh from lookup then gets false from mark_in_flight because the physical slot still exists. The retry is silently dropped.
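
To illustrate the mismatch and one way to close it, here is a small self-contained model in which both `lookup` and `mark_in_flight` go through the same TTL-aware helper. The types are simplified stand-ins (a `Vec` of slots and an explicit `now` parameter), and `find_live` and `complete` are hypothetical names, not the PR's API.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum Lookup {
    Fresh,
    InFlight,
    Done,
}

enum Slot {
    InFlight,
    Done { at: Instant },
}

struct PerClient {
    ttl: Duration,
    slots: Vec<(u64, Slot)>,
}

impl PerClient {
    // Single TTL-aware lookup: an expired Done slot counts as absent.
    fn find_live(&self, request: u64, now: Instant) -> Option<&Slot> {
        self.slots
            .iter()
            .find(|(id, _)| *id == request)
            .map(|(_, slot)| slot)
            .filter(|slot| match slot {
                Slot::InFlight => true,
                Slot::Done { at } => now.duration_since(*at) < self.ttl,
            })
    }

    fn lookup(&self, request: u64, now: Instant) -> Lookup {
        match self.find_live(request, now) {
            None => Lookup::Fresh,
            Some(Slot::InFlight) => Lookup::InFlight,
            Some(Slot::Done { .. }) => Lookup::Done,
        }
    }

    // Uses the same helper as lookup, so a retry after expiry is accepted.
    fn mark_in_flight(&mut self, request: u64, now: Instant) -> bool {
        if self.find_live(request, now).is_some() {
            return false;
        }
        // Evict the expired physical slot before reusing the request id.
        self.slots.retain(|(id, _)| *id != request);
        self.slots.push((request, Slot::InFlight));
        true
    }

    fn complete(&mut self, request: u64, now: Instant) {
        for (id, slot) in &mut self.slots {
            if *id == request {
                *slot = Slot::Done { at: now };
            }
        }
    }
}
```

Within the TTL the retry is still reported as `Done` and rejected by `mark_in_flight`; after the TTL both calls agree that the request is fresh.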
    let (server_out, server_in, server_shutdown, server_handle) = drive(server_conn);
    let (client_out, client_in, client_shutdown, client_handle) = drive(client_conn);

    client_out
TCP-TLS's drive_close calls tls.shutdown() which sends close_notify, but WSS just does ws.close() + drop. The peer's rustls sees an unexpected EOF on the record layer, which can trigger false-positive alerts in TLS-aware load balancers or WAFs sitting in front.
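
The ordering being asked for can be modeled without any real TLS or WebSocket library. The sketch below is synchronous and uses a mock connection purely to show the sequence (WebSocket close frame, then TLS close_notify, then drop); the `SecureWsConn` trait, `drive_close_wss`, and `MockConn` are hypothetical names and do not reflect the compio-ws or compio-tls APIs.

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, PartialEq, Clone, Copy)]
enum CloseStep {
    WsCloseFrame,
    TlsCloseNotify,
    SocketDrop,
}

// Hypothetical abstraction over a WebSocket-over-TLS connection.
trait SecureWsConn {
    fn ws_close(&mut self);
    fn tls_shutdown(&mut self);
}

// Close the application layer first, then the record layer, then let the
// socket drop, so the peer sees close_notify instead of a bare EOF.
fn drive_close_wss<C: SecureWsConn>(mut conn: C) {
    conn.ws_close();
    conn.tls_shutdown();
    // `conn` is dropped here, which closes the underlying socket.
}

// Mock that records each step so the ordering can be checked.
struct MockConn {
    log: Rc<RefCell<Vec<CloseStep>>>,
}

impl SecureWsConn for MockConn {
    fn ws_close(&mut self) {
        self.log.borrow_mut().push(CloseStep::WsCloseFrame);
    }
    fn tls_shutdown(&mut self) {
        self.log.borrow_mut().push(CloseStep::TlsCloseNotify);
    }
}

impl Drop for MockConn {
    fn drop(&mut self) {
        self.log.borrow_mut().push(CloseStep::SocketDrop);
    }
}
```

In the real async path the TLS shutdown step would presumably sit under the same bounded close_grace timeout that TCP-TLS already uses.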