WS callbacks are dispatched to a `ThreadPoolExecutor` sized `min(32, cpu_count+4)` (base_server.py:194-209). This limits WS callback scaling in two ways:
- Not configurable. Both call sites (`_fastapi.py:735`, `_quart.py:563`) invoke `get_callback_executor()` with no `max_workers`, and no constructor param exposes it. Hitting the ~20-connection cap leaves "deploy more workers" as the only escape.
- Async callbacks run one-per-thread. `ws.py:435` calls `asyncio.run()` inside the worker thread, spinning a fresh event loop per callback. A `persistent=True` callback holds its thread (~8 MB stack) for the whole session while mostly `await`-ing.
Users have to brute-force scaling by adding ever more CPU cores, even though a single core could handle many more threads than are currently allocated.
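To make the ceiling concrete, here is a minimal standalone sketch of the pattern described above. It is not the Dash code: `fake_persistent_callback`, `run_in_worker`, and `simulate` are hypothetical names, and the callback is just an `asyncio.sleep`. It only illustrates that when each async callback is driven by `asyncio.run()` inside a pool thread, concurrency is bounded by `max_workers` even though the callback spends its time awaiting.

```python
import asyncio
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def default_pool_size() -> int:
    # Mirrors the min(32, cpu_count + 4) sizing quoted above.
    return min(32, (os.cpu_count() or 1) + 4)


async def fake_persistent_callback(hold_seconds: float) -> str:
    # Hypothetical stand-in for a persistent callback: it mostly awaits.
    await asyncio.sleep(hold_seconds)
    return threading.current_thread().name


def run_in_worker(hold_seconds: float) -> str:
    # One fresh event loop per callback; the worker thread stays blocked
    # until the coroutine finishes.
    return asyncio.run(fake_persistent_callback(hold_seconds))


def simulate(n_callbacks: int, max_workers: int, hold_seconds: float = 0.05):
    # Returns (distinct threads used, wall-clock seconds).
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        names = list(pool.map(run_in_worker, [hold_seconds] * n_callbacks))
    return len(set(names)), time.perf_counter() - start


if __name__ == "__main__":
    print("pool size on this machine:", default_pool_size())
    threads, elapsed = simulate(n_callbacks=8, max_workers=2)
    print(f"8 callbacks used {threads} threads and took {elapsed:.2f}s")
```

With two workers, the eight callbacks run in roughly four batches, so wall-clock time grows with the number of callbacks divided by the pool size rather than staying near a single `hold_seconds`.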
Proposed changes (independent):
- Quick win: expose `websocket_max_workers` on the `Dash` constructor, threaded to the two call sites via `getattr(dash_app, "_websocket_max_workers", None)`. ~5 lines, backward-compatible (`None` = current default). Lets users raise the cap past 32. Memory would scale linearly with the cap: 1000 threads ≈ 8 GB of stacks.
- Durable fix: route `inspect.iscoroutine` results onto the backend's existing event loop as tasks; keep the thread pool only for sync callbacks. Takes async-callback concurrency from "threads × 8 MB" to "coroutines × KB."
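A rough sketch of what the durable fix could look like, under stated assumptions: `dispatch`, `async_cb`, `sync_cb`, and `demo` are hypothetical names, not existing Dash functions, and the real dispatcher in `ws.py` may be structured differently. The sketch checks `inspect.iscoroutinefunction` up front so async callbacks never touch the pool, and still awaits any coroutine that a sync callable happens to return.

```python
import asyncio
import functools
import inspect
import threading
from concurrent.futures import ThreadPoolExecutor


async def dispatch(callback, *args, executor):
    # Hypothetical dispatcher: async callbacks stay on the running loop,
    # sync callbacks go to the thread pool.
    loop = asyncio.get_running_loop()
    if inspect.iscoroutinefunction(callback):
        result = callback(*args)  # builds the coroutine object, does not run it
    else:
        result = await loop.run_in_executor(
            executor, functools.partial(callback, *args)
        )
    if inspect.iscoroutine(result):
        result = await result  # runs on the existing loop, no extra thread
    return result


async def async_cb(i):
    await asyncio.sleep(0.05)
    return i, threading.get_ident()


def sync_cb(i):
    return i, threading.get_ident()


async def demo(n=200):
    loop_thread = threading.get_ident()
    with ThreadPoolExecutor(max_workers=2) as executor:
        # gather() wraps each dispatch() coroutine in a task on this loop.
        results = await asyncio.gather(
            *(dispatch(async_cb, i, executor=executor) for i in range(n))
        )
        sync_result = await dispatch(sync_cb, 0, executor=executor)
    return loop_thread, results, sync_result


if __name__ == "__main__":
    loop_thread, results, sync_result = asyncio.run(demo())
    on_loop = sum(1 for _, tid in results if tid == loop_thread)
    print(f"{on_loop}/{len(results)} async callbacks ran on the loop thread")
    print("sync callback ran off the loop thread:", sync_result[1] != loop_thread)
```

In this sketch the number of concurrent async callbacks is independent of `max_workers`; the pool size only limits sync callbacks. A sync callback that blocks would still occupy a worker, so the quick-win parameter remains useful alongside it.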
Docs (https://dash.plotly.com/websocket-callbacks#connection-limits-and-scaling) accurately describe current behavior and would need updating.