feat(lib): add get_model_capabilities for runtime parameter routing#3124
SinhSinhAn wants to merge 2 commits into openai:main
Conversation
Applications that support multiple OpenAI models (gpt-4.1, gpt-5.x,
o-series, etc.) currently maintain prefix checks like
`model.startswith("gpt-5")` to decide which parameters to send. These
checks break every time a new family launches, and the version-based
logic for reasoning_effort levels is reverse-engineered from
documentation rather than provided by the SDK.
Add a hand-curated capability registry under `openai.lib._models`
exposing two public symbols:
- `ModelCapabilities`: a frozen dataclass with `family`,
`supports_temperature`, `supports_reasoning`, and
`reasoning_effort_options` fields.
- `get_model_capabilities(model)`: looks up a model identifier and
returns the matching `ModelCapabilities`, or `None` if unknown.
The lookup is date-suffix-aware (gpt-5.4-mini-2026-03-17 -> gpt-5.4),
recognises size variants (-mini, -nano, -pro), and treats *-chat-latest
and *-search-preview models as non-reasoning chat variants.
Both symbols are re-exported from `openai` and `openai.lib`.
Includes 35 unit tests covering each family, edge cases (unknown
models, non-string inputs, frozen instances), and the realistic
"route temperature/effort decisions by model" use case from the
issue. helpers.md documents the new public surface.
The registry lives in `src/openai/lib/`, which CONTRIBUTING.md
designates as the area Stainless will not regenerate.
Closes openai#3073
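The description above can be grounded in a minimal self-contained sketch of the registry and lookup. Family entries and field values here are illustrative only; the real module lives in `src/openai/lib/_models.py`, and the bare `startswith` prefix check shown here is the approach as originally submitted:

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ModelCapabilities:
    family: str
    supports_temperature: bool
    supports_reasoning: bool
    reasoning_effort_options: Tuple[str, ...]


# Illustrative subset of the hand-curated registry.
_FAMILIES = (
    ModelCapabilities("gpt-4.1", True, False, ()),
    ModelCapabilities("gpt-5", False, True, ("minimal", "low", "medium", "high")),
    ModelCapabilities("gpt-5.4", False, True, ("none", "minimal", "low", "medium", "high", "xhigh")),
    ModelCapabilities("o1", False, True, ("low", "medium", "high")),
)

_DATE_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def get_model_capabilities(model: object) -> Optional[ModelCapabilities]:
    if not isinstance(model, str):
        return None
    # Date-suffix aware: "gpt-5.4-mini-2026-03-17" -> "gpt-5.4-mini".
    candidate = _DATE_SUFFIX.sub("", model)
    best: Optional[ModelCapabilities] = None
    for entry in _FAMILIES:
        if not candidate.startswith(entry.family):
            continue
        # Longest prefix wins, so "gpt-5.4" beats "gpt-5".
        if best is None or len(entry.family) > len(best.family):
            best = entry
    return best
```

Size variants fall out of the prefix match for free: `gpt-5.4-mini` matches the `gpt-5.4` entry without a dedicated rule.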
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3b79e58bf6
```python
# beats "o1".
best: Optional[ModelCapabilities] = None
for entry in _FAMILIES:
    if not candidate.startswith(entry.family):
```
Enforce family boundary in prefix matching
Using candidate.startswith(entry.family) without a separator check causes unknown model IDs to be treated as known families, which breaks the documented None fallback path. For example, gpt-5.10 currently matches gpt-5.1, and typos like o1-previewed match o1-preview, so callers may route parameters based on incorrect capabilities instead of treating the model as unknown.
```python
# Effort scales reused across families.
_EFFORT_O_SERIES: Tuple[ReasoningEffort, ...] = ("low", "medium", "high")
_EFFORT_GPT5: Tuple[ReasoningEffort, ...] = ("minimal", "low", "medium", "high")
_EFFORT_GPT5_1: Tuple[ReasoningEffort, ...] = ("none", "minimal", "low", "medium", "high")
```
Align gpt-5.1 effort options with API-supported values
The gpt-5.1 capability tuple includes "minimal", but the generated API docs in this repo (src/openai/types/shared/reasoning.py, lines 27-29) state gpt-5.1 supports none, low, medium, and high. Returning minimal here can lead consumers to render/send an unsupported effort value and trigger avoidable 400 responses.
Two P1 corrections from the automated Codex review on PR openai#3124:

1. Enforce segment boundary in prefix matching. `startswith()` was matching unknown identifiers like `gpt-5.10` as `gpt-5.1` and `o1-previewed` as `o1-preview`, causing the documented `None` fallback to be skipped. The lookup now requires the family prefix to either equal the model exactly or be followed by a `-`.
2. Align gpt-5 family effort options with the canonical SDK docs in `src/openai/types/shared/reasoning.py`:
   - gpt-5.1 (and -codex / -codex-max / -mini): `none/low/medium/high` -- removed `minimal`, which would have surfaced an unsupported value and triggered avoidable 400 responses.
   - gpt-5.2 / gpt-5.3 / gpt-5.4: `none/low/medium/high/xhigh` (`xhigh` is supported for models *after* gpt-5.1-codex-max).
   - gpt-5-pro: split into its own family with the high-only effort scale per the SDK docstring ("defaults to and only supports high").

Other improvements:

- Chat / search variant detection now matches dated snapshots like `gpt-4o-search-preview-2025-03-11` via a regex instead of a strict `endswith` check.
- New `TestBoundaryMatching` class with parametrized regression tests for prefix-collision cases (`gpt-5.10`, `gpt-5.1foo`, `o1-previewed`) and dated chat / search variants.
- New `TestGpt5ProFamily` class covering the high-only effort scale.

49 of 49 new tests pass; 154 of 154 lib tests pass overall.
Both P1 issues are addressed in 2171fb7:

P1 #1 — Family boundary in prefix matching. Replaced the bare `startswith()` check with a match that requires the family prefix to either equal the model exactly or be followed by a `-`:

```python
get_model_capabilities("gpt-5.10")      # None
get_model_capabilities("gpt-5.1foo")    # None
get_model_capabilities("o1-previewed")  # falls through to broader "o1" family, not "o1-preview"
```

P1 #2 — gpt-5.1 effort options. Re-read `src/openai/types/shared/reasoning.py` and aligned the effort tuples with the documented values.

While I was in the file, I also addressed two related issues raised by the same logic chain: chat / search variant detection now recognises dated snapshots via a regex, and gpt-5-pro is split into its own high-only family. Added a `TestGpt5ProFamily` class covering the new family.
Summary
Closes #3073.
Add a hand-curated model capability registry so applications that support multiple OpenAI models can decide which parameters to send (and which UI controls to render) without prefix-matching model strings themselves. Lives in `src/openai/lib/`, which `CONTRIBUTING.md` designates as the area Stainless will not regenerate.

Background
#3073 (filed by @pamelafox while migrating azure-search-openai-demo to the Responses API) describes the current pain: applications targeting multiple model families maintain hardcoded checks like `model.startswith("gpt-5")` to decide which parameters to send.

These break every time a new family launches, and the version-based logic for `reasoning_effort` levels is reverse-engineered from documentation rather than provided by the SDK.

What this PR adds
- `ModelCapabilities`: a frozen dataclass with `family`, `supports_temperature`, `supports_reasoning`, and `reasoning_effort_options` fields.
- `get_model_capabilities(model: str) -> ModelCapabilities | None`: looks up a model identifier and returns the matching capabilities, or `None` if the model is not registered.

Both symbols are re-exported from `openai` (matching the `pydantic_function_tool` pattern) and from `openai.lib`.

Lookup behaviour
- Longest-prefix match: `"gpt-5.4"` beats `"gpt-5"`, and `"o1-pro"` beats `"o1"`.
- Date-suffix aware: `"gpt-5.4-mini-2026-03-17"` resolves to the `gpt-5.4` family.
- Size variants `-mini`, `-nano`, and `-pro` are handled automatically by prefix matching (no special-casing).
- `*-chat-latest` and `*-search-preview` models override their family defaults to `(temperature: True, reasoning: False)`. The `family` field still reports the underlying family so callers can group variants together.
- Unknown models return `None`. Callers should treat that as "fall back to the API's own validation": send your default parameters and handle a 400 if the model rejects them.

Effort scales encoded

- o-series: `low`, `medium`, `high`
- gpt-5: `minimal`, `low`, `medium`, `high`
- gpt-5.1: `none`, `minimal`, `low`, `medium`, `high`
- gpt-5.4: `none`, `minimal`, `low`, `medium`, `high`, `xhigh`
ReasoningEffortliteral intypes/shared/reasoning_effort.py.Files changed
src/openai/lib/_models.py(new, 200 lines): the registry and lookup implementationsrc/openai/lib/__init__.py: re-export the public symbolssrc/openai/__init__.py: re-export at the top level (matchingpydantic_function_tool)tests/lib/test_model_capabilities.py(new, 35 tests): family-by-family coverage, edge cases, and a realistic "route parameters by model" usage scenariohelpers.md: documentation for the public surfaceTest plan
- `pytest tests/lib/`: 140 tests pass (35 new + 105 existing)
- `ruff check` and `ruff format --check` clean on touched files
- `pyright` clean on touched files
- `mypy` reports no new errors on touched files (12 pre-existing errors in unrelated files)
- `python -c "import openai"` works
- Spot checks: `gpt-4.1-mini` -> supports temperature, no reasoning; `gpt-5` -> rejects temperature, effort `minimal`/`low`/`medium`/`high`; `gpt-5.1` -> adds `none`; `gpt-5.4` -> adds `xhigh`

Design notes
I considered exposing a stringly-typed API like `openai.models.supports(model, "temperature")` (one of the alternatives in the issue), but the dataclass approach is more Pythonic, plays better with type checkers, and avoids adding string-keyed parameter names to the public surface that would themselves drift over time. Field access is also strictly more discoverable in IDEs.

The registry is hand-maintained (not derived from a server-side source) for the same reason `_old_api.py` and the Azure helpers are: capability metadata is documented behaviour, not OpenAPI schema. New families need a corresponding entry in `_FAMILIES`. I kept the surface narrow (4 fields) to minimise that maintenance burden; vision/audio/structured-output capabilities can be added later without breaking existing callers thanks to the dataclass shape.
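As a usage illustration of the "route parameters by model" scenario, here is a hedged sketch: the `build_params` helper is hypothetical (not part of the PR), and `ModelCapabilities` is re-declared so the snippet stands alone:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class ModelCapabilities:
    family: str
    supports_temperature: bool
    supports_reasoning: bool
    reasoning_effort_options: Tuple[str, ...]


def build_params(
    model: str,
    caps: Optional[ModelCapabilities],
    temperature: float = 0.7,
    effort: str = "high",
) -> Dict[str, Any]:
    """Assemble request parameters based on the model's capabilities."""
    params: Dict[str, Any] = {"model": model}
    if caps is None:
        # Unknown model: send defaults and let the API validate them.
        params["temperature"] = temperature
        return params
    if caps.supports_temperature:
        params["temperature"] = temperature
    if caps.supports_reasoning and effort in caps.reasoning_effort_options:
        params["reasoning"] = {"effort": effort}
    return params
```

In an application using this PR, `caps` would come from `get_model_capabilities(model)` rather than being constructed by hand.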