
fix(agent): resolve serializeOnKey gate leak in Flux.create callbacks #1796

Merged
chickenlj merged 2 commits into agentscope-ai:main from partick33:fix/reactagent-serializeonkey-gate-leak on Jun 26, 2026

Conversation

partick33 (Contributor) commented Jun 17, 2026

Closes #1798

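Symptom

When a user implements a custom MiddlewareBase and throws an uncaught exception from a method such as onAgent, the outer Flux returned by streamEvents() enters the error state, but the independent subscription created by lifecycle.subscribe() inside Flux.create keeps running, so the serializeOnKey gate is never released. Subsequent calls on the same session block permanently.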

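Root cause

In ReActAgent.buildAgentStream() and AgentBase.createEventStream(), the Disposable returned by the inner .subscribe() call inside Flux.create is neither saved nor registered via sink.onCancel(disposable). Cancellation of the outer Flux therefore cannot propagate to the inner lifecycle Mono subscription.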

Fix

Add one line at each of the three Flux.create sites: sink.onCancel(disposable)

| File | Method |
| --- | --- |
| ReActAgent.java | buildAgentStream() |
| ReActAgent.java | executeApprovedTools() |
| AgentBase.java | createEventStream() |
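
The mechanism can be illustrated without Reactor. The following is a minimal JDK-only sketch, not the project's code: `GateLeakModel`, `OuterHandle`, and `gateFreeAfterCancel` are hypothetical names, a `Semaphore` stands in for the serializeOnKey gate, and a `Future` stands in for the Disposable returned by the inner `.subscribe()`. It only models the idea that the inner work must be tied to the outer cancellation signal.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** JDK-only model of the gate leak. All names are hypothetical, not AgentScope or Reactor API. */
final class GateLeakModel {

    /** Stand-in for the outer sink: remembers one cancel callback. */
    static final class OuterHandle {
        private Runnable onCancel = () -> {};
        void onCancel(Runnable callback) { this.onCancel = callback; }
        void cancel() { onCancel.run(); }
    }

    /**
     * Starts an inner task that holds the gate, cancels the outer handle,
     * and reports whether another caller can acquire the gate afterwards.
     */
    static boolean gateFreeAfterCancel(boolean wireCancel) {
        Semaphore gate = new Semaphore(1);            // stand-in for the per-session gate
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch holding = new CountDownLatch(1);
        OuterHandle outer = new OuterHandle();
        try {
            Future<?> inner = pool.submit(() -> {
                try {
                    gate.acquire();
                    try {
                        holding.countDown();
                        Thread.sleep(60_000);         // long-running lifecycle work
                    } finally {
                        gate.release();               // runs only when the task ends
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            if (wireCancel) {
                // Analogue of sink.onCancel(disposable): tie the inner task to the outer handle.
                outer.onCancel(() -> inner.cancel(true));
            }
            holding.await();                          // the inner task now owns the gate
            outer.cancel();                           // the outer consumer gives up
            return gate.tryAcquire(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();                       // clean up the leaked task in the unwired case
        }
    }
}
```

Calling `GateLeakModel.gateFreeAfterCancel(false)` models the behaviour before the fix: the outer cancel reaches nothing, so the gate stays held. With `true`, the cancel callback interrupts the inner task and its `finally` block releases the gate.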

Verification

  • mvn spotless:apply
  • mvn test -pl agentscope-core: 3395 tests, 0 failures
  • ✅ Added a unit test verifying that the gate is released correctly after a custom middleware throws an exception

chickenlj and others added 2 commits June 17, 2026 16:10
When a middleware throws an Error during agent event streaming,
the external Flux is cancelled but the internal lifecycle Mono
subscription inside Flux.create survives independently, holding
the per-session serializeOnKey gate open indefinitely. Subsequent
calls on the same session then block forever.

Fix: capture the Disposable from each internal .subscribe() call
and register it via sink.onCancel(disposable) so the lifecycle
subscription is disposed when the external Flux is cancelled.

Affected locations:
- ReActAgent.buildAgentStream() (line 800)
- ReActAgent.executeApprovedTools() (line 2404)
- AgentBase.createEventStream() (line 984)

Adds unit test verifying gate release after middleware error.
partick33 requested a review from a team June 17, 2026 08:14
AgentScopeJavaBot added the bug and area/core/agent labels Jun 17, 2026

AgentScopeJavaBot (Collaborator) left a comment

🤖 AI Review

This PR fixes a serializeOnKey gate leak in Flux.create callbacks. When a custom MiddlewareBase throws an uncaught exception, the outer Flux is cancelled but the inner lifecycle Mono subscription continues holding the gate, blocking subsequent calls for the same session. The fix adds sink.onCancel(disposable) at three Flux.create locations, ensuring the inner subscription is disposed when the sink is cancelled. The root cause analysis is accurate, the fix follows Reactor best practices, and the test precisely reproduces the problem scenario. AgentBase hook cleanup is also improved to use doFinally instead of onDispose.

"reasoning and modelCall enter counts must match");
}

/**


[nit] Javadoc states throws an {@link Error} (not {@code Exception}) but the test actually throws a RuntimeException. Consider updating the Javadoc to match the actual code.


chickenlj merged commit b752e68 into agentscope-ai:main Jun 26, 2026
10 checks passed
chickenlj added a commit to chickenlj/agentscope-java that referenced this pull request Jun 26, 2026
…ncellation fix

After agentscope-ai#1796 added sink.onCancel(lifecycleDisposable), cancellation now
properly propagates to inner tool executions. The slow_tool's sleep is
interrupted before it can write to the file, so expect 0 lines instead
of 1.
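
The changed expectation can be modelled with plain JDK concurrency. This is a hedged sketch rather than the actual test: `SlowToolModel` and `linesWrittenAfterCancel` are hypothetical names, an in-memory list stands in for the file, and `Future.cancel(true)` stands in for cancellation reaching the tool execution.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** JDK-only model of a slow tool that is cancelled mid-sleep. Names are hypothetical. */
final class SlowToolModel {

    /** Runs the slow tool, cancels it once it has started, and returns how many lines it wrote. */
    static int linesWrittenAfterCancel() {
        List<String> lines = new CopyOnWriteArrayList<>(); // stands in for the output file
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        try {
            Future<?> tool = pool.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(60_000);               // the tool's slow phase
                    lines.add("done");                  // the write happens only after the sleep
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // cancelled: the write is skipped
                }
            });
            started.await();                            // make sure the tool is running
            tool.cancel(true);                          // cancellation reaches the tool
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS); // wait for the worker to finish
            return lines.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In this model the interrupt always lands before the write, so `SlowToolModel.linesWrittenAfterCancel()` returns 0, matching the updated expectation of 0 lines.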

Development

Successfully merging this pull request may close these issues.

fix(agent): serializeOnKey gate leak when custom middleware throws exception

3 participants