[bot] OpenAI and Anthropic generation parameters not captured in span metadata #83

## Summary

The OpenAI and Anthropic request taggers in `InstrumentationSemConv` only extract `model` and request routing info (`request_path`, `request_base_uri`, `request_method`) into span metadata. Generation parameters like `temperature`, `max_tokens`, `tools`, `response_format`, and provider-specific config like Anthropic's `thinking` are silently dropped.

In contrast, the Google GenAI handler in the same repo (`BraintrustApiClient.tagSpan()`) extracts `temperature`, `topP`, `topK`, `maxOutputTokens`, `tools`, `toolConfig`, `safetySettings`, `responseMimeType`, `responseSchema`, and more into metadata. The repo is therefore inconsistent: the Google GenAI handler records materially more instrumentation detail than the OpenAI and Anthropic taggers for the same class of information.

## What is missing

### OpenAI — `tagOpenAIRequest()` (lines 78–108)

Currently captures only `model` in metadata. The following request parameters are silently dropped:

| Field | Purpose |
|---|---|
| `temperature` | Sampling temperature |
| `max_tokens` / `max_completion_tokens` | Output length limit |
| `top_p` | Nucleus sampling |
| `frequency_penalty`, `presence_penalty` | Repetition control |
| `tools` | Tool/function definitions |
| `response_format` | Structured output config (JSON mode, JSON Schema) |
| `reasoning_effort` | Reasoning effort for o-series models |
| `logprobs`, `top_logprobs` | Log probability settings |
| `stop` | Stop sequences |

For the Responses API, additional fields are missing: `instructions`, `tools` (with `web_search`, `file_search`, and `code_interpreter` configs), and `reasoning` (with `effort` and `summary`).
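As a rough illustration of what capturing these fields could look like, here is a minimal sketch. It assumes the request body has already been parsed into a `Map<String, Object>`; the class `OpenAIParamSketch`, the method `extractParams`, and the allowlist are hypothetical and are not taken from `InstrumentationSemConv`. The model name is a placeholder value.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: copy allowlisted OpenAI request fields into span metadata. */
final class OpenAIParamSketch {
    // Field names come from the table above; using an allowlist is an assumption.
    static final List<String> CHAT_FIELDS = List.of(
            "temperature", "max_tokens", "max_completion_tokens", "top_p",
            "frequency_penalty", "presence_penalty", "tools", "response_format",
            "reasoning_effort", "logprobs", "top_logprobs", "stop");

    /** Returns only the allowlisted fields that are present and non-null. */
    static Map<String, Object> extractParams(Map<String, Object> requestBody) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        for (String field : CHAT_FIELDS) {
            Object value = requestBody.get(field);
            if (value != null) {
                metadata.put(field, value);
            }
        }
        return metadata;
    }

    public static void main(String[] args) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("model", "example-model"); // placeholder, handled separately today
        body.put("messages", List.of());    // not a generation parameter, so skipped
        body.put("temperature", 0.2);
        body.put("max_tokens", 256);
        System.out.println(extractParams(body)); // prints {temperature=0.2, max_tokens=256}
    }
}
```

An allowlist keeps message content and other large or sensitive fields out of metadata by default. The Responses API fields (`instructions`, `reasoning`) could be handled with a second list in the same way.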

### Anthropic — `tagAnthropicRequest()` (lines 163–205)

Currently captures only `model` in metadata. The following request parameters are silently dropped:

| Field | Purpose |
|---|---|
| `max_tokens` | Output length limit (required parameter) |
| `temperature` | Sampling temperature |
| `top_p`, `top_k` | Sampling parameters |
| `tools` | Tool definitions |
| `thinking` | Extended thinking config (`type`, `budget_tokens`) |
| `stop_sequences` | Stop sequences |
| `metadata` | User-provided metadata (e.g., `user_id`) |

The `thinking` config is particularly important: when a user enables extended thinking with a specific budget, this information is lost from the span. The Python SDK equivalent (issue braintrustdata/braintrust-sdk-python#107, now closed) captures thinking metadata.
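To make the `thinking` case concrete, the following sketch pulls `type` and `budget_tokens` out of a request body. It assumes the body is available as a `Map<String, Object>` with `thinking` as a nested map; `AnthropicThinkingSketch` and `extractThinking` are hypothetical names, not part of the SDK or of the Python implementation referenced above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: read Anthropic's nested `thinking` config for span metadata. */
final class AnthropicThinkingSketch {
    /** Returns the `type` and `budget_tokens` entries if present, else an empty map. */
    static Map<String, Object> extractThinking(Map<String, Object> requestBody) {
        Map<String, Object> out = new LinkedHashMap<>();
        Object raw = requestBody.get("thinking");
        if (!(raw instanceof Map)) {
            return out; // thinking not configured on this request
        }
        Map<?, ?> thinking = (Map<?, ?>) raw;
        for (String key : List.of("type", "budget_tokens")) {
            Object value = thinking.get(key);
            if (value != null) {
                out.put(key, value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> thinking = new LinkedHashMap<>();
        thinking.put("type", "enabled");
        thinking.put("budget_tokens", 2048);

        Map<String, Object> body = new LinkedHashMap<>();
        body.put("max_tokens", 4096);
        body.put("thinking", thinking);

        System.out.println(extractThinking(body)); // prints {type=enabled, budget_tokens=2048}
    }
}
```

Copying the two keys into a fresh map, rather than storing the caller's map directly, means later mutation of the request would not change what the span recorded.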

### Google GenAI — already captures these (for comparison)

`BraintrustApiClient.tagSpan()` (lines 64–95) extracts into metadata:
- `systemInstruction`, `tools`, `toolConfig`, `safetySettings`, `cachedContent`
- From `generationConfig`: `temperature`, `topP`, `topK`, `candidateCount`, `maxOutputTokens`, `stopSequences`, `responseMimeType`, `responseSchema`

## Impact

Users viewing traces in Braintrust can see generation parameters for Google GenAI calls but not for OpenAI or Anthropic calls. For those two providers, the model configuration behind a span cannot be recovered from the trace alone.

## Braintrust docs status

- Braintrust tracing docs at https://www.braintrust.dev/docs/guides/tracing state that LLM spans show "the model, messages, parameters, token usage, and cost": **supported** (parameters are documented as captured, but the OpenAI and Anthropic taggers in this Java SDK do not capture them)
- The Braintrust OpenAI docs mention temperature handling for GPT-5 models, which implies that temperature is a tracked parameter: **unclear**

## Upstream sources

- **OpenAI Chat Completions API**: https://platform.openai.com/docs/api-reference/chat/create — documents all request parameters including `temperature`, `max_tokens`, `tools`, `response_format`, `reasoning_effort`
- **OpenAI Responses API**: https://platform.openai.com/docs/api-reference/responses/create — documents `instructions`, `tools`, `reasoning`
- **Anthropic Messages API**: https://docs.anthropic.com/en/api/messages — documents `max_tokens`, `temperature`, `tools`, `thinking`, `top_p`, `top_k`, `stop_sequences`
- **Anthropic extended thinking**: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking — documents `thinking` parameter with `type` and `budget_tokens`

## Local files inspected

- `braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java` — lines 78–108 (`tagOpenAIRequest`: only `model` extracted), lines 163–205 (`tagAnthropicRequest`: only `model` extracted)
- `braintrust-sdk/instrumentation/genai_1_18_0/src/main/java/com/google/genai/BraintrustApiClient.java` — lines 64–95 (`tagSpan`: comprehensive parameter extraction into metadata)
- `braintrust-sdk/instrumentation/openai_2_8_0/src/test/java/dev/braintrust/instrumentation/openai/v2_8_0/BraintrustOpenAITest.java` — no test asserts generation parameters in metadata
- `braintrust-sdk/instrumentation/anthropic_2_2_0/src/test/java/dev/braintrust/instrumentation/anthropic/v2_2_0/BraintrustAnthropicTest.java` — no test asserts generation parameters in metadata
