feat(ai-aliyun-content-moderation): moderate system and tool role content by AlinsRan · Pull Request #13646 · apache/apisix

AlinsRan · 2026-07-02T08:09:24Z

Description

Extends the request-side moderation of ai-aliyun-content-moderation beyond the user role, to cover the Agent + tool-calling (MCP) threat model where tool results and a poisoned system prompt can carry indirect prompt injection into the next LLM turn.

New request_check_roles option (array, default ["user"], fully backward compatible):

user / tool follow the existing request_check_mode. last now walks the trailing consecutive block of selected-role messages, so a fresh user turn or the current round's tool results are moderated without re-checking conversation history.
system ignores request_check_mode and is moderated on every request (all system messages), because it can be poisoned by malicious ToolCall arguments overwriting the system prompt.

Protocol layer gains extract_turn_content(body, mode, roles) and extract_system_content(body) across openai-chat / anthropic-messages / openai-responses / bedrock-converse / openai-embeddings. When a configured role has no extractor on the current protocol, the request is routed through binding.on_unsupported(...) so fail_mode decides, instead of silently passing unmoderated.

With the default ["user"], extract_turn_content(body, mode, {user=true}) is equivalent to the previous extract_user_content(body, mode), so existing behavior is unchanged.

Note: tool-result moderation applies to OpenAI-compatible formats where the tool output is a distinct tool role/item; Anthropic/Bedrock nest tool results inside user messages and are not extracted (documented in the option table).

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change (unit tests for the extractors + end-to-end role moderation, t/plugin/ai-aliyun-content-moderation.t TEST 57–72)
I have updated the documentation (en/zh ai-aliyun-content-moderation.md)
I have verified backward compatibility (default ["user"] preserves prior behavior)

…tent Extend request-side moderation beyond the user role via a new request_check_roles option (array, default ["user"], backward compatible): - user/tool follow request_check_mode; "last" walks the trailing block of selected-role messages, so a fresh user turn or the current round's tool results are moderated without re-checking history. - system ignores request_check_mode and is moderated on every request (all system messages), because it can be poisoned by malicious ToolCall arguments overwriting the system prompt. Protocol layer gains extract_turn_content(body, mode, roles) and extract_system_content(body) across openai-chat/anthropic-messages/ openai-responses/bedrock-converse/openai-embeddings. A configured role with no extractor on the current protocol is routed through binding.on_unsupported so fail_mode decides, instead of silently passing unmoderated. Note: tool-result moderation applies to OpenAI-compatible formats where the tool output is a distinct tool role; Anthropic/Bedrock nest tool results in user messages and are not extracted (documented).

…ation-system-tool-roles # Conflicts: # apisix/plugins/ai-protocols/bedrock-converse.lua # apisix/plugins/ai-protocols/openai-responses.lua

dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. enhancement New feature or request labels Jul 2, 2026

Merge remote-tracking branch 'upstream/master' into feat/aliyun-moder…

019e721

…ation-system-tool-roles # Conflicts: # apisix/plugins/ai-protocols/bedrock-converse.lua # apisix/plugins/ai-protocols/openai-responses.lua

shreemaan-abhishek approved these changes Jul 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(ai-aliyun-content-moderation): moderate system and tool role content#13646

feat(ai-aliyun-content-moderation): moderate system and tool role content#13646
AlinsRan wants to merge 2 commits into
apache:masterfrom
AlinsRan:feat/aliyun-moderation-system-tool-roles

AlinsRan commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

AlinsRan commented Jul 2, 2026

Description

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants