Skip to content

refactor(ai-aws-content-moderation): moderate decoded LLM content in access phase#13647

Open
shreemaan-abhishek wants to merge 2 commits into
apache:masterfrom
shreemaan-abhishek:refactor/ai-aws-content-moderation-access
Open

refactor(ai-aws-content-moderation): moderate decoded LLM content in access phase#13647
shreemaan-abhishek wants to merge 2 commits into
apache:masterfrom
shreemaan-abhishek:refactor/ai-aws-content-moderation-access

Conversation

@shreemaan-abhishek

@shreemaan-abhishek shreemaan-abhishek commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Description

Refactor ai-aws-content-moderation to be structured like its sibling ai-aliyun-content-moderation.

Previously the plugin ran in the rewrite phase (before ai-proxy) and sent the raw HTTP request body to AWS Comprehend. As a result Comprehend scored the undecoded JSON envelope (e.g. the literal escaped string "content":"toxic" and {"model":...,"messages":[...]}), while the upstream LLM acts on the decoded prompt, so the two saw different text.

What changed

  • Move the plugin from the rewrite phase to the access phase (priority 1050 -> 1031, below ai-proxy's 1040), so it runs after ai-proxy and can reuse ctx.ai_client_protocol and ctx.picked_ai_instance.
  • Make it protocol-aware: parse the JSON body and extract the LLM-visible prompt content via the detected protocol (ai-protocols), sending only the normalized, decoded content to Comprehend. This is what the sibling ai-aliyun-content-moderation plugin does.
  • Guard on ctx.picked_ai_instance via fail_mode, so the plugin reports clearly when it is used without ai-proxy/ai-proxy-multi.
  • Return a provider-compatible deny response (via ai-protocols) instead of a raw text body, so AI clients are not broken. Adds check_request, deny_code (default 200) and deny_message options.

The AWS Comprehend decision model (moderation_categories per-category thresholds and the overall moderation_threshold) is unchanged.

Behavior change

The plugin now must be used together with ai-proxy/ai-proxy-multi, consistent with ai-aliyun-content-moderation and with how the plugin is documented. A request that does not pass through ai-proxy is governed by fail_mode (default skip, i.e. passes through unchecked; set fail_mode: error to reject). This supersedes #13528.

Which issue(s) this PR fixes:

Fixes #

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

…access phase

Restructure the plugin to match ai-aliyun-content-moderation: run in the
access phase after ai-proxy, extract the decoded LLM prompt content via the
detected client protocol instead of sending the raw request body to
Comprehend, guard on ctx.picked_ai_instance via fail_mode, and return a
provider-compatible deny response. Adds check_request, deny_code and
deny_message options.
@dosubot dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. enhancement New feature or request labels Jul 2, 2026
1030 collided with ai-rate-limiting; move to 1031 (still below ai-proxy) and
update the priority-ordered plugin list in t/admin/plugins.t.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant