gpt-realtime 2.0 on Azure OpenAI stops emitting audio mid-response (audio output truncates); identical setup works on OpenAI's direct API

Abdul Rehman 65 Reputation points
2026-06-30T08:23:58.7266667+00:00
When using the gpt-realtime 2.0 model deployed via Azure OpenAI, the model intermittently stops producing audio output mid-session — it behaves as though it simply stops talking. The same application code, prompts, and audio configuration work correctly against OpenAI's direct Realtime API; the failure only occurs against the Azure-hosted deployment. This points to a difference in the Azure hosting/serving layer rather than in our client or in the model itself.

Expected behavior:
The model streams a complete audio response for each turn (response.audio.delta events through to response.audio.done / response.done with status: "completed"), matching the behavior we observe on OpenAI's direct API.

Actual behavior:
Mid-session, audio output stops. The model "goes silent" and no further audio is produced for that turn (or the turn ends prematurely). This recurs multiple times within a single session. See the attached recording — there are repeated silent stretches of roughly 1.5–3.8 seconds where expected audio is missing.
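One way to quantify those silent stretches from a client-side log, offered as a minimal sketch rather than the code that produced the recording: it assumes each audio delta was logged as a pair of arrival time in seconds and decoded PCM byte count, and that playback starts as soon as the first chunk arrives. The helper names `chunk_seconds` and `find_underruns` and the 0.5 s threshold are hypothetical.

```python
SAMPLE_RATE_HZ = 24_000   # PCM16, 24 kHz, mono, as listed under Environment
BYTES_PER_SAMPLE = 2      # 16-bit samples, one channel


def chunk_seconds(num_bytes):
    """Playback duration of a decoded PCM16 mono chunk."""
    return num_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


def find_underruns(chunks, min_gap_s=0.5):
    """Return (gap_start_s, gap_length_s) for each point where playback runs dry.

    chunks: list of (arrival_time_s, num_bytes) in arrival order.
    min_gap_s: hypothetical threshold below which a gap is ignored.
    """
    gaps = []
    buffered_until = None  # time at which already-received audio finishes playing
    for arrival, num_bytes in chunks:
        if buffered_until is None:
            buffered_until = arrival
        elif arrival > buffered_until:
            # The buffer emptied before this chunk arrived: audible silence.
            gap = arrival - buffered_until
            if gap >= min_gap_s:
                gaps.append((buffered_until, gap))
            buffered_until = arrival
        buffered_until += chunk_seconds(num_bytes)
    return gaps


# Three one-second chunks; the third arrives two seconds after the buffer empties.
log = [(0.0, 48_000), (0.5, 48_000), (4.0, 48_000)]
print(find_underruns(log))
```

Running the same check on logs captured against both endpoints would show whether the gaps come from late-arriving deltas or from a response that ends early.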

Environment

Model / deployment: gpt-realtime 2.0 (Azure deployment name: [OnlimInternalTesting-production-manual])
Azure OpenAI resource region: [francecentral]
Audio format: PCM16, 24 kHz, mono (input and output)
Comparison baseline: OpenAI direct Realtime API, same model family, same client code
Azure OpenAI in Foundry Models

1 answer

  1. Sina Salam 30,486 Reputation points Volunteer Moderator
    2026-06-30T13:14:29.73+00:00

    Hello Abdul Rehman,

    Welcome to Microsoft Q&A, and thank you for posting your question here.

    I understand that gpt-realtime-2 on Azure OpenAI is intermittently stopping audio output mid-response, while the same application, prompt, and PCM16 24 kHz mono audio setup works correctly against OpenAI’s direct Realtime API.

    In addition to what @SRILAKSHMI C has shared:

    The most reliable approach is to run the Azure Realtime integration on the GA /openai/v1 Realtime endpoint, pass the Azure deployment name as the model value, and use WebRTC for live, user-facing audio. Then reproduce the issue while logging all Realtime events. If Azure stops emitting response.audio.delta and either (a) sends response.done with status "completed" although audio is missing, or (b) never sends response.audio.done / response.done while the connection remains open, that indicates an Azure service-side streaming defect rather than a normal client implementation issue. In that case, open a Microsoft Azure support case and include the event trace and correlation IDs.
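    The check described above can be sketched as a small classifier over the logged events for one turn. This is an illustrative sketch, not an official diagnostic: the function name `classify_turn` and its return labels are hypothetical, the event names are the ones used in this thread and are passed as parameters because they may differ between API versions, and it assumes audio output was requested for the turn and that response.done carries its status under response.status.

```python
def classify_turn(events, connection_open=True,
                  delta_type="response.audio.delta",
                  audio_done_type="response.audio.done",
                  done_type="response.done"):
    """Classify one response turn from its logged server events.

    events: parsed JSON server messages (dicts) for a single turn, in order.
    connection_open: whether the connection was still open when logging stopped.
    """
    types = [event.get("type") for event in events]
    saw_delta = delta_type in types
    saw_audio_done = audio_done_type in types
    done = next((e for e in events if e.get("type") == done_type), None)

    if done is None:
        # No terminal event: a stall if the connection is still up.
        return "stalled_no_response_done" if connection_open else "connection_closed"

    status = done.get("response", {}).get("status")
    if status != "completed":
        return f"ended_{status}"
    if not saw_delta or not saw_audio_done:
        # Reported as completed, but the audio stream never started or never finished.
        return "completed_with_missing_audio"
    return "ok"


turn = [
    {"type": "response.audio.delta"},
    {"type": "response.done", "response": {"status": "completed"}},
]
print(classify_turn(turn))
```

    A turn that ends in "ok" can still carry truncated audio, so this only catches the two patterns named above; comparing received audio duration against the transcript would need a separate check.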

    Use the resource links below for further reading and implementation steps:

    I hope this is helpful. Please do not hesitate to let me know if you have any other questions or need further steps or clarification.


    Please do not forget to close the thread by upvoting and accepting the answer if any part of it is helpful.
