Persistent APIError: 1007 None (Invalid Audio Format) in google-adk 1.31.1 using Vertex AI Live API #5552

@anupam-mishra

Description

Since upgrading to google-adk v1.29.0, the Multimodal Live API (gemini-live-2.5-flash-native-audio on Vertex AI) intermittently crashes with google.genai.errors.APIError: 1007 None.

The session typically establishes correctly, but the error triggers mid-conversation during active audio streaming. The error message explicitly cites an invalid audio format ("16khz s16le pcm, mono channel"), even though the client-side input remains consistent and verified at exactly those specifications. This appears to be a regression in how the ADK frames or sequences audio blobs under sustained load or network jitter; it did not occur in earlier versions. I migrated to the newer version to pick up fix 6b1600f.
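For reference, this is the check used to verify the client-side chunks before they are queued. The helper below is a hypothetical sketch, not part of the production code; it asserts a chunk can plausibly be 16 kHz s16le mono PCM (whole little-endian int16 samples, sane duration):

```python
import struct

SAMPLE_RATE = 16000
BYTES_PER_SAMPLE = 2  # s16le: one signed little-endian int16 per sample


def validate_pcm_chunk(chunk: bytes, max_ms: int = 200) -> None:
    """Raise ValueError if `chunk` cannot be 16 kHz s16le mono PCM."""
    if len(chunk) % BYTES_PER_SAMPLE != 0:
        raise ValueError(f"odd byte count {len(chunk)}: not whole s16le samples")
    n_samples = len(chunk) // BYTES_PER_SAMPLE
    duration_ms = n_samples * 1000 / SAMPLE_RATE
    if duration_ms > max_ms:
        raise ValueError(f"chunk suspiciously large: {duration_ms:.0f} ms")
    # Decoding proves the payload parses as little-endian int16 samples.
    struct.unpack(f"<{n_samples}h", chunk)
```

Every chunk sent during the failing sessions passes this check, which is why the server-side format complaint looks like a framing issue rather than a capture issue.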

Steps to Reproduce:

  • Initialize an ADK LlmAgent using the gemini-live-2.5-flash-native-audio model on Vertex AI.
  • Establish a bidirectional session using runner.run_live().
  • Engage in a multi-turn conversation, providing sustained audio input (3+ minutes).
  • Observe the connection drop with the 1007 None traceback during an active audio turn.
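To automate the "sustained audio input" step, any driver that streams well-formed 16 kHz s16le mono chunks at the websocket will do. A minimal generator (hypothetical helper, names are mine) that produces a continuous sine tone in 20 ms chunks:

```python
import math
import struct

SAMPLE_RATE = 16000  # Hz, matching the repro's "audio/pcm;rate=16000" blobs


def pcm_sine_chunk(freq_hz: float = 440.0, ms: int = 20, t0: int = 0) -> bytes:
    """One chunk of 16 kHz s16le mono sine audio, starting at sample index t0."""
    n = SAMPLE_RATE * ms // 1000
    samples = (
        int(20000 * math.sin(2 * math.pi * freq_hz * (t0 + i) / SAMPLE_RATE))
        for i in range(n)
    )
    return struct.pack(f"<{n}h", *samples)


# Three minutes of audio = 9000 consecutive 20 ms chunks, e.g.:
# for i in range(9000):
#     await ws.send_bytes(pcm_sine_chunk(t0=i * 320))
```

Streaming these chunks for 3+ minutes is enough to hit the drop in my environment, so microphone capture is not required to reproduce.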

Expected Behavior:
The WebSocket should maintain a stable bidirectional stream. The backend should consistently validate the audio packets provided by the ADK as long as the input format (PCM 16kHz) does not change.

Observed Behavior:
The connection terminates mid-stream with status 1007 (Invalid Frame Payload).

Log Snippet: APIError in live flow: 1007 None. error when processing input audio, please check if the inputaudio is in valid format: 16khz s16le pcm, mono channel.; Error

Environment Details:

ADK Library Version: 1.29.0

Google-GenAI Version: (Check via pip show google-genai)

Python Version: 3.14.4

Model: gemini-live-2.5-flash-native-audio (Vertex AI)

Deployment: Cloud Run

Minimal Reproduction Code:

```python
import asyncio
import logging
import os

from fastapi import FastAPI, WebSocket
from starlette.websockets import WebSocketState
from google.adk.agents import LlmAgent
from google.adk.agents.live_request_queue import LiveRequestQueue
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.adk.memory import InMemoryMemoryService
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

logger = logging.getLogger(__name__)

# 1. Minimal agent setup -- replace with your specific instructions/tools if necessary.
mock_agent = LlmAgent(
    name="python_tutor",
    model="gemini-live-2.5-flash-native-audio",
    instruction="You are a helpful Python tutor.",
)

app = FastAPI()
session_service = InMemorySessionService()
memory_service = InMemoryMemoryService()
runner = Runner(
    app_name="reproduction-app",
    agent=mock_agent,
    session_service=session_service,
    memory_service=memory_service,
)


@app.websocket("/ws/{user_id}/{session_id}")
async def websocket_endpoint(websocket: WebSocket, user_id: str, session_id: str):
    await websocket.accept()

    # These flags come from configuration in the full app; hardcoded for the repro.
    enable_proactivity = True
    affective_dialog = True

    # 2. Minimal RunConfig
    run_config = RunConfig(
        streaming_mode=StreamingMode.BIDI,
        response_modalities=["AUDIO"],  # Required for the PCM player
        input_audio_transcription=types.AudioTranscriptionConfig(language_codes=["en-GB"]),
        output_audio_transcription=types.AudioTranscriptionConfig(language_codes=["en-GB"]),
        session_resumption=types.SessionResumptionConfig(transparent=True),
        context_window_compression=types.ContextWindowCompressionConfig(
            trigger_tokens=100000,  # Start compression at ~78% of 128k context
            sliding_window=types.SlidingWindow(
                target_tokens=80000  # Compress to ~62% of context, preserving recent turns
            ),
        ),
        proactivity=types.ProactivityConfig(proactive_audio=True) if enable_proactivity else None,
        enable_affective_dialog=affective_dialog,
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(
                    voice_name=os.getenv("AGENT_VOICE", "Puck")
                )
            ),
            language_code=os.getenv("AGENT_LANGUAGE", "en-US"),
        ),
    )

    live_request_queue = LiveRequestQueue()

    # 3. Upstream: WebSocket -> Gemini
    async def client_to_agent_messaging():
        try:
            while True:
                message = await websocket.receive()
                if "bytes" in message:
                    audio_blob = types.Blob(mime_type="audio/pcm;rate=16000", data=message["bytes"])
                    live_request_queue.send_realtime(audio_blob)
        except Exception:
            live_request_queue.close()

    # 4. Downstream: Gemini -> WebSocket (where the 1007 None occurs)
    async def agent_to_client_messaging():
        try:
            async for event in runner.run_live(
                user_id=user_id,
                session_id=session_id,
                live_request_queue=live_request_queue,
                run_config=run_config,
            ):
                await websocket.send_text(event.model_dump_json(exclude_none=True))
        except Exception as e:
            print(f"CRASH DETECTED: {e}")

    try:
        done, pending = await asyncio.wait(
            [
                asyncio.create_task(client_to_agent_messaging()),
                asyncio.create_task(agent_to_client_messaging()),
            ],
            return_when=asyncio.FIRST_COMPLETED,
        )

        # 1. Propagate exceptions from the tasks that finished
        for task in done:
            try:
                task.result()
            except Exception as e:
                logger.error(f"Task failed with exception: {e}")
                raise

        # 2. Cancel the remaining tasks
        for task in pending:
            task.cancel()
            try:
                await task
            except asyncio.CancelledError:
                pass
    finally:
        # Final cleanup: just call close(), don't check for .is_closed()
        try:
            live_request_queue.close()
        except Exception:
            pass

        if websocket.client_state != WebSocketState.DISCONNECTED:
            try:
                await websocket.close()
            except Exception:
                pass
```
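As a stopgap I have been experimenting with a reconnect wrapper around the downstream loop, leaning on the transparent session resumption configured above. A minimal sketch (the `APIError` class here is a stand-in for google.genai.errors.APIError, and `run_live_once` is any coroutine that drives one `runner.run_live()` pass; both names are mine, not ADK API):

```python
import asyncio


class APIError(Exception):
    """Stand-in for google.genai.errors.APIError (for a self-contained sketch)."""


async def run_live_with_retry(run_live_once, max_retries: int = 3) -> None:
    """Re-enter the live loop after a transient 1007 drop, with backoff.

    Transparent session resumption in the RunConfig is assumed to restore
    conversational context on reconnect.
    """
    for attempt in range(max_retries + 1):
        try:
            await run_live_once()
            return
        except APIError:
            if attempt == max_retries:
                raise
            await asyncio.sleep(0.5 * 2**attempt)  # back off before reconnecting
```

This masks the symptom but does not address the mid-stream format rejection itself, and reconnects are audible to the end user.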

How often has this issue occurred?:

Very Frequently (60-70%+)

Labels

live [Component] This issue is related to live, voice and video chat