Skip to content

OAuth client: harden SEP-2352/SEP-2350 edge cases; fix conformance comment#2936

Open
maxisbey wants to merge 2 commits into
mainfrom
bughunter/followups-2026-06-20
Open

OAuth client: harden SEP-2352/SEP-2350 edge cases; fix conformance comment#2936
maxisbey wants to merge 2 commits into
mainfrom
bughunter/followups-2026-06-20

Conversation

@maxisbey

Copy link
Copy Markdown
Contributor

Three small follow-ups to #2933 / #2931 plus a comment correction from #2911.

Motivation and Context

Review of the merged SEP-2352 and SEP-2350 implementations turned up a few narrow paths where the new guarantees don't hold:

  • SEP-2352, no-PRM path: the issuer-binding check runs before authorization-server metadata discovery and is gated on auth_server_url. When PRM discovery fails and the AS is found via the root /.well-known/oauth-authorization-server fallback, the recorded issuer is never compared, so credentials bound to a previous AS are reused. The stamping side already falls back to oauth_metadata.issuer in this case; this adds the matching comparison.
  • SEP-2352, stale metadata: when the issuer-mismatch block discards bound credentials, cached oauth_metadata from the old AS is left in place. If discovery for the new AS then fails, dynamic client registration runs against the old server's registration_endpoint while stamping the new issuer, persisting a mis-bound record.
  • SEP-2350, omitted scope: RFC 6749 §5.1 lets the AS omit scope from the token response when granted scope equals requested scope. After a process restart the reloaded token then has scope=None and client_metadata.scope has reverted to its constructor value, so the step-up union collapses to just the challenged scope. Backfilling the requested scope before persisting (and carrying the prior scope forward on refresh, per §6) keeps the stored token self-describing.

The conformance-client comment correction is doc-only: the harness picks LATEST_SPEC_VERSION vs DRAFT_PROTOCOL_VERSION per-scenario when --spec-version is omitted, not a flat LATEST_SPEC_VERSION. The value is still log-only today.

How Has This Been Tested?

Three new tests in tests/client/test_auth.py:

  • test_handle_token_response_backfills_omitted_scope_from_request
  • test_handle_refresh_response_carries_prior_scope_when_response_omits_it
  • test_issuer_binding_re_evaluated_after_asm_when_prm_discovery_failed — drives the auth-flow generator through PRM 404×2 → root ASM 200 with a mismatched issuer and asserts the next yield is a DCR request, not the stale client's authorize redirect

The existing tests/interaction/auth/test_lifecycle.py SEP-2352 migration test exercises the oauth_metadata = None clear on the PRM-success path.

Breaking Changes

None.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

The 403 step-up path also skips PRM/ASM discovery entirely, so after a restart it can fall back to /authorize and /token on the resource server's origin rather than the AS. That fix needs the discovery sequence factored into a sub-generator both the 401 and 403 branches can drive — out of scope for this PR.

AI Disclaimer

maxisbey added 2 commits June 20, 2026 21:32
When --spec-version is omitted, the conformance harness picks the version
per-scenario (LATEST_SPEC_VERSION for active scenarios, DRAFT_PROTOCOL_VERSION
for draft-only ones), not a flat LATEST_SPEC_VERSION. The variable is still
log-only today, but the stateless 2026 client path will branch on it.
Three follow-up fixes to the SEP-2352 / SEP-2350 work:

- Backfill an omitted token-response `scope` with the requested scope
  (RFC 6749 §5.1/§6) before persisting, so the SEP-2350 step-up union
  still recovers prior grants after a process restart when the
  authorization server omits `scope` because granted == requested.
  Applied to both the authorization-code and refresh response handlers.

- Clear cached `oauth_metadata` when the SEP-2352 issuer-mismatch block
  discards bound credentials, so a subsequent ASM-discovery failure for
  the new authorization server cannot leave the old server's
  registration/token endpoints in place for Step 4.

- Re-evaluate the SEP-2352 issuer-binding check against
  `oauth_metadata.issuer` after ASM discovery on the legacy no-PRM path
  (PRM discovery failed, AS found via root well-known fallback),
  mirroring the existing stamping-side fallback so stale credentials are
  still discarded when the binding can only be learned post-ASM.
@maxisbey maxisbey marked this pull request as ready for review June 20, 2026 21:36
Comment on lines 591 to 599
logger.debug("Authorization server changed; discarding bound credentials and re-registering")
self.context.client_info = None
self.context.clear_tokens()
# Any cached AS metadata is for the old server; drop it so a failed
# rediscovery cannot leak the old registration/token endpoints into Step 4.
self.context.oauth_metadata = None

asm_discovery_urls = build_oauth_authorization_server_metadata_discovery_urls(
self.context.auth_server_url, self.context.server_url

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 When the issuer changes but ASM discovery for the new AS fails in the same flow, Step 4 still falls back to urljoin(resource-origin, '/register') and stamps client_information.issuer = auth_server_url (the new AS) on credentials actually registered with the old embedded AS — persisting exactly the mis-bound record the oauth_metadata = None clear is meant to prevent, and credentials_match_issuer will then accept it on every later 401 with no auto-recovery. Consider failing the flow (or skipping the issuer stamp / the legacy /register fallback) when the issuer changed but the new AS's metadata could not be discovered.

Extended reasoning...

The residual gap. The new self.context.oauth_metadata = None (oauth2.py:594-596) correctly stops a stale discovered registration_endpoint from being reused after an issuer change. But it only changes where the fallback registration goes, not the fact that a registration performed against an endpoint never derived from the new issuer is still stamped with the new issuer. If ASM discovery for the new AS fails in the same flow, Step 4 registers at the resource origin's legacy /register and binds that record to the new issuer, persisting it via storage.set_client_info before token exchange.

Concrete walkthrough (PRM-success branch, single 401 flow):

  1. Stored client_info is bound to issuer A — the resource server's old embedded AS at https://api.example.com/. The server has migrated: PRM now advertises AS B (https://new-as.example.com/), so auth_server_url = B (oauth2.py:576).
  2. Lines 584-596: credentials_match_issuer fails → client_info/tokens discarded, oauth_metadata cleared. So far so good.
  3. Step 2 ASM discovery for B fails — handle_auth_metadata_response returns (True, None) for 404s and (False, None) for non-4xx (utils.py:223-233), so the loop exits with oauth_metadata still None; no exception. The new post-ASM re-check (lines 619-633) is gated on auth_server_url is None, so it does not run here.
  4. Step 4: bound_issuer = auth_server_url = B (line 649). With oauth_metadata is None, should_use_client_metadata_url is False and create_client_registration_request (utils.py:284-287) falls back to urljoin("https://api.example.com", "/register") — in the single-origin-migration case, the old embedded AS, which still serves a live DCR endpoint.
  5. Registration succeeds against the old AS, but client_information.issuer = B is stamped (line 674) and persisted via storage.set_client_info before the authorization/token exchange, so the mis-bound record lands in storage even if the rest of the flow fails. (The same flow then also hits /authorize and /token on the resource origin via the legacy fallbacks in _perform_authorization_code_grant / _get_token_endpoint, despite the resource explicitly advertising B.)
  6. On every later 401, once B's metadata is discoverable again, credentials_match_issuer(client_info, B, ...) is a simple string compare against the stamped issuer (utils.py:341-345) and returns True — so the SDK keeps presenting AS A's client_id to AS B indefinitely with no automatic re-registration. That sticky wedge is exactly what SEP-2352 binding was meant to auto-recover from, and the PR description's second bullet says this change prevents "persisting a mis-bound record".

Why existing safeguards don't catch it. The metadata clear stops the old AS's discovered registration_endpoint from leaking into Step 4, but the legacy origin-/register fallback reaches the same old AS in the common case where the old AS lived at the resource origin; and bound_issuer is taken from auth_server_url regardless of whether the registration endpoint was actually derived from that issuer.

On the pre-existing/scope objection. It's true that both ingredients (the origin-/register fallback and the bound_issuer = auth_server_url stamping from #2933) predate this PR, that the trigger is narrow (an AS migration plus complete well-known discovery failure for the new AS plus a still-live legacy /register at the resource origin, all in one flow), and that this PR strictly narrows the window rather than widening it. This is filed as a non-blocking nit for that reason. It still seems worth a comment because the PR touches this exact block and its stated goal for this change is to stop persisting a mis-bound record when rediscovery for the new AS fails — this is a second route to that same record that remains open.

Suggested fix. When the issuer changed (the line-591 block fired) but oauth_metadata is still None after Step 2, either fail the flow with an OAuthFlowError, or skip the legacy /register fallback / skip stamping issuer so a registration not derived from the new AS's metadata is never recorded as bound to it. Any of these keeps the auto-recovery property intact: a later 401 with working discovery would re-register cleanly instead of being wedged on the wrong client_id.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant