Azure Voice Live STT: How to enforce a hard per-session language lock (es-ES only)?
Jurado, Jose Luis
Issue context:
We are using Azure Voice Live realtime STT in production voice sessions and need strict monolingual behavior per session.
Observed behavior:
- We send session.update with input_audio_transcription.language set to es-ES.
- We also set turn_detection.type to azure_semantic_vad_multilingual.
- Despite language=es-ES, STT still detects/transcribes other languages when speakers switch language.
Expected behavior:
- A hard language lock so STT only recognizes/transcribes Spanish (es-ES) for that session.
Current payload example:
{
  "type": "session.update",
  "session": {
    "input_audio_transcription": {
      "model": "azure-speech",
      "language": "es-ES"
    },
    "turn_detection": {
      "type": "azure_semantic_vad_multilingual",
      "threshold": 0.5,
      "prefix_padding_ms": 500,
      "silence_duration_ms": 1400,
      "barge_in": true
    }
  }
}
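For anyone trying to reproduce this, the same message can be built programmatically. This is a minimal Python sketch, assuming a hypothetical helper named build_session_update (it is not part of any Azure SDK). It only constructs and serializes the payload shown above; it does not open a connection and it does not show that the service enforces the language.

```python
import json


def build_session_update(language: str = "es-ES") -> dict:
    """Hypothetical helper: builds the session.update message from this post."""
    return {
        "type": "session.update",
        "session": {
            # Transcription settings: the locale we expect the session to stay in.
            "input_audio_transcription": {
                "model": "azure-speech",
                "language": language,
            },
            # Turn detection settings, copied unchanged from the payload above.
            "turn_detection": {
                "type": "azure_semantic_vad_multilingual",
                "threshold": 0.5,
                "prefix_padding_ms": 500,
                "silence_duration_ms": 1400,
                "barge_in": True,
            },
        },
    }


# Serialize to the JSON text that would be sent as the session.update event.
message = json.dumps(build_session_update("es-ES"))
```

Keeping the locale as a single parameter makes it easy to confirm that each session sends exactly one language value.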
Question:
- Is strict hard language lock supported today for Voice Live STT per session?
- If yes, what exact parameter(s) and API version enforce it?
- If not, what is the recommended workaround and roadmap?
Any official guidance is appreciated.
Azure Speech in Foundry Tools