Hi James Morgan,
Thanks for the exceptionally clean repro — the paired request IDs and the "remove one block → 200" contrast make this easy to reason about. Here's where it lands.
Root cause
Your payload is valid, and the value you're passing for input_audio_transcription.model (gpt-4o-mini-transcribe-sweden-dev-v2) is a deployment name — which is exactly what Azure OpenAI requires. Azure deviates from OpenAI here and expects the deployment name, not a raw model ID like whisper-1. So that part is correct.
Because the identical call returns 200 the moment the input_audio_transcription block is removed, this is a service-side (500) failure in the /client_secrets mint path when that block is present — not a validation, feature-gating, or configuration problem on your side. There's no documented region-specific prerequisite or feature flag for input_audio_transcription beyond using a supported transcription model referenced by deployment name.
Immediate workaround (unblocks you now)
Configure transcription after the connection instead of at token-mint time:
- Mint the ephemeral token without input_audio_transcription (this returns 200).
- Open the Realtime WebSocket/WebRTC connection using that token.
- Send a session.update event that enables transcription:
{
  "type": "session.update",
  "session": {
    "type": "realtime",
    "input_audio_transcription": { "model": "gpt-4o-mini-transcribe-sweden-dev-v2" }
  }
}
The server replies with a session.updated event that reflects the transcription model, and input-transcription events start flowing. This post-connect path is confirmed working for this exact scenario, so it's a reliable way to proceed while the mint-time issue is addressed.
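As a rough illustration of the steps above, here is a minimal Python sketch that splits a desired session configuration into the part sent at mint time and the session.update event sent after connecting. The helper name split_session_config is hypothetical, and the event shape simply mirrors the JSON shown above. How the session object is wrapped in the /client_secrets request body, and how the event is sent over the WebSocket/WebRTC connection, are left out because they depend on your client.

```python
import copy
import json
from typing import Optional, Tuple

# Key name taken from the payload discussed in this thread.
TRANSCRIPTION_KEY = "input_audio_transcription"


def split_session_config(session: dict) -> Tuple[dict, Optional[dict]]:
    """Hypothetical helper: separate the transcription block from a session config.

    Returns (mint_session, update_event). mint_session is a copy of the input
    without the transcription block; update_event is the session.update event
    to send after the connection opens, or None if no block was present.
    """
    mint_session = copy.deepcopy(session)  # never mutate the caller's dict
    transcription = mint_session.pop(TRANSCRIPTION_KEY, None)
    if transcription is None:
        return mint_session, None
    update_event = {
        "type": "session.update",
        "session": {
            "type": "realtime",
            TRANSCRIPTION_KEY: transcription,
        },
    }
    return mint_session, update_event


if __name__ == "__main__":
    desired = {
        "type": "realtime",
        TRANSCRIPTION_KEY: {"model": "gpt-4o-mini-transcribe-sweden-dev-v2"},
    }
    mint_session, update_event = split_session_config(desired)
    print(json.dumps(mint_session))   # send this when minting the token
    print(json.dumps(update_event))   # send this once the connection is open
```

Keeping the split in one place means the transcription block can be moved back into the mint request later by bypassing the helper, without touching the rest of the client.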
Is this a bug?
Effectively yes — it's a backend fault in the /client_secrets mint path, not something you can fix by changing config:
- The payload is valid and the deployment-name usage is correct.
- Removing one optional block flips 500 → 200.
- The same transcription configuration is accepted cleanly on the post-connect session.update path, which shows the content is supported; only the mint-time handling faults.
There's precedent for transient service-side 500s on Sweden Central transcribe models that Microsoft mitigated on the backend, so it's also worth confirming whether this is persistent or intermittent right now.
To get this investigated (please share)
- Is the 500 100% reproducible right now, or intermittent?
- The exact JSON of the failing /client_secrets request (with and without the block), plus 2–3 fresh failing apim-request-ids + UTC timestamps (you've already given c0d6e5c5-6e40-4039-a4e1-cd21f985cae1).
- Does it also fail with API-key auth (to rule out anything Managed-Identity-specific)?
- Does the same call succeed in another region (e.g., East US 2) with an equivalent transcription deployment?
- Confirm you're on the GA endpoint without an api-version query param.
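If it helps with gathering the first two items, here is a small Python sketch that repeats a request several times, records the status, the apim-request-id response header, and a UTC timestamp for each attempt, and then labels the result as reproducible or intermittent. The names collect_attempts and classify are hypothetical, and the send callable is a placeholder you would supply: it should perform your own /client_secrets call and return the status code and response headers.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List, Mapping, Tuple

# Placeholder type: a function that performs one request and returns
# (HTTP status code, response headers). You provide the implementation.
SendFn = Callable[[], Tuple[int, Mapping[str, str]]]


def collect_attempts(send: SendFn, attempts: int = 5) -> List[Dict[str, object]]:
    """Call send() repeatedly and record what a support engineer would need."""
    records: List[Dict[str, object]] = []
    for _ in range(attempts):
        status, headers = send()
        # Header names are case-insensitive, so normalize before the lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        records.append({
            "utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "status": status,
            "apim_request_id": lowered.get("apim-request-id", ""),
        })
    return records


def classify(records: List[Dict[str, object]]) -> str:
    """Label a batch of attempts by how often a 5xx status occurred."""
    failures = sum(1 for record in records if int(record["status"]) >= 500)
    if failures == 0:
        return "not reproduced"
    if failures == len(records):
        return "100% reproducible"
    return "intermittent"


if __name__ == "__main__":
    # Stand-in for a real request, used here only to show the output shape.
    def fake_send() -> Tuple[int, Mapping[str, str]]:
        return 500, {"apim-request-id": "example-id"}

    results = collect_attempts(fake_send, attempts=3)
    for row in results:
        print(row)
    print(classify(results))
```

Running the same loop once with the transcription block and once without it yields the paired request IDs and timestamps asked for above.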
If the 500 persists with a valid payload and a supported transcription deployment, please send the details requested above via private message.
Please give the post-connect workaround a try and let me know if it unblocks your flow.
If this helps, consider marking it as accepted so others hitting the same 500 can find it.