Skip to content

fix: make STDIO transport sendMessage thread-safe (#304)#935

Open
nquinquenel wants to merge 1 commit intomodelcontextprotocol:mainfrom
nquinquenel:fix/304-stdio-thread-safety
Open

fix: make STDIO transport sendMessage thread-safe (#304)#935
nquinquenel wants to merge 1 commit intomodelcontextprotocol:mainfrom
nquinquenel:fix/304-stdio-thread-safety

Conversation

@nquinquenel
Copy link
Copy Markdown

@nquinquenel nquinquenel commented Apr 25, 2026

Hello, we've had this issue #304 for a while in our MCP server, and I implemented this fix following your recommendation.

As a summary, concurrent calls to sendMessage on the STDIO transports raced on the unicast Sinks.Many and sporadically failed with FAIL_NON_SERIALIZED, dropping the message with RuntimeException("Failed to enqueue message"). This happens in practice whenever tools and loggingNotification are emitted from different threads on the server, and can happen on the client under any concurrent sender.

I switched tryEmitNext to emitNext(message, Sinks.EmitFailureHandler.busyLooping(Duration.ofMillis(100))) as suggested by the maintainers in the issue, so concurrent emissions are serialized. Applied on both sides:

  • mcp-core/src/main/java/io/modelcontextprotocol/server/transport/StdioServerTransportProvider.java
  • mcp-core/src/main/java/io/modelcontextprotocol/client/transport/StdioClientTransport.java

The remaining catch (Sinks.EmissionException) path still returns a failed Mono for the non-retryable modes (FAIL_CANCELLED, FAIL_TERMINATED, busyLooping timeout), preserving the existing error semantics.

I also removed the TODO, as I believe that the only realistic failure (FAIL_NON_SERIALIZED) is now handled in-place by busyLooping. The other scenario such as FAIL_OVERFLOW is unreachable with the current sink - onBackpressureBuffer() is unbounded so there's no overflow to reschedule. Happy to revert if you'd rather keep the hint for a future bounded-buffer refactor.

Test plan

  • Added shouldHandleConcurrentSendMessage regression test that fans 500 sendMessage calls across 16 parallel rails and verifies all complete and every message lands on the output stream. The test fails on main (RuntimeException: Failed to enqueue message after ~44 emissions) and passes with the fix.
  • ./mvnw -pl mcp-test -am test -Dtest='StdioServerTransportProviderTests,StdioMcpSyncClientTests' - 52/52 pass.
  • StdioMcpAsyncClientTests pass in isolation with the fix.

…ol#304)

Concurrent calls to `sendMessage` on the STDIO transports raced on the
unicast sink and sporadically failed with `FAIL_NON_SERIALIZED`, dropping
the message with `RuntimeException("Failed to enqueue message")`.

Switch `tryEmitNext` to `emitNext` with a `busyLooping` EmitFailureHandler
so concurrent emissions are serialized as recommended by the Reactor
maintainers. Applied on both `StdioServerTransportProvider` and
`StdioClientTransport`.

Also adds a regression test that fans `sendMessage` out across 16 parallel
rails and verifies all 500 emissions complete successfully and every
message is written to the output stream.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant