refactor(ai-aws-content-moderation): moderate decoded LLM content in access phase#13647
Open
shreemaan-abhishek wants to merge 2 commits into
Open
Conversation
…access phase Restructure the plugin to match ai-aliyun-content-moderation: run in the access phase after ai-proxy, extract the decoded LLM prompt content via the detected client protocol instead of sending the raw request body to Comprehend, guard on ctx.picked_ai_instance via fail_mode, and return a provider-compatible deny response. Adds check_request, deny_code and deny_message options.
5 tasks
1030 collided with ai-rate-limiting; move to 1031 (still below ai-proxy) and update the priority-ordered plugin list in t/admin/plugins.t.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Refactor
ai-aws-content-moderationto be structured like its siblingai-aliyun-content-moderation.Previously the plugin ran in the
rewritephase (beforeai-proxy) and sent the raw HTTP request body to AWS Comprehend. As a result Comprehend scored the undecoded JSON envelope (e.g. the literal escaped string"content":"toxic"and{"model":...,"messages":[...]}), while the upstream LLM acts on the decoded prompt, so the two saw different text.What changed
rewritephase to theaccessphase (priority1050->1031, belowai-proxy's1040), so it runs afterai-proxyand can reusectx.ai_client_protocolandctx.picked_ai_instance.ai-protocols), sending only the normalized, decoded content to Comprehend. This is what the siblingai-aliyun-content-moderationplugin does.ctx.picked_ai_instanceviafail_mode, so the plugin reports clearly when it is used withoutai-proxy/ai-proxy-multi.ai-protocols) instead of a raw text body, so AI clients are not broken. Addscheck_request,deny_code(default200) anddeny_messageoptions.The AWS Comprehend decision model (
moderation_categoriesper-category thresholds and the overallmoderation_threshold) is unchanged.Behavior change
The plugin now must be used together with
ai-proxy/ai-proxy-multi, consistent withai-aliyun-content-moderationand with how the plugin is documented. A request that does not pass throughai-proxyis governed byfail_mode(defaultskip, i.e. passes through unchecked; setfail_mode: errorto reject). This supersedes #13528.Which issue(s) this PR fixes:
Fixes #
Checklist