Hi @YangQi
Thank you for reaching out to Microsoft Q&A.
From your description, I understand that you're using the Azure OpenAI Realtime API over WebSocket for live speech transcription and translation, and you're intermittently receiving Unicode replacement characters (U+FFFD, displayed as �) or mojibake-like text (for example, �果) in both Chinese and Japanese transcripts and translations.
Based on your findings:
The issue occurs in both:
- session.input_transcript.delta
- session.output_transcript.delta
The behavior is reproducible using both:
- Direct browser-to-Azure WebSocket connections.
- Browser → backend WebSocket bridge → Azure OpenAI.
You've verified that the decrypted WebSocket payload already contains the corrupted characters, indicating that the corruption is present before the client renders the text.
The issue is intermittent and not tied to a specific sentence or phrase.
Thank you for also confirming that you've ruled out client-side rendering by inspecting the decrypted WebSocket payload. That is a very helpful diagnostic step.
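To make that payload check repeatable, a small helper along the following lines could classify each captured frame. This is only a sketch: `classify_frame` is a hypothetical name, the `delta` field in the sample event is an assumption about the payload shape, and it presumes you can get the raw frame bytes and that events are JSON. It separates frames whose bytes are not valid UTF-8 from frames that are valid UTF-8 but already carry U+FFFD.

```python
import json

def classify_frame(raw: bytes) -> str:
    """Classify one captured WebSocket frame (hypothetical helper)."""
    try:
        text = raw.decode("utf-8")  # strict decoding: raises on invalid bytes
    except UnicodeDecodeError:
        return "invalid-utf8"
    try:
        # Re-serialise so JSON escapes such as "\ufffd" become real characters.
        text = json.dumps(json.loads(text), ensure_ascii=False)
    except ValueError:
        pass  # not JSON; fall back to checking the decoded text as-is
    return "contains-replacement-char" if "\ufffd" in text else "clean"

# Sample event; the "delta" field name is assumed for illustration only.
frame = json.dumps(
    {"type": "session.input_transcript.delta", "delta": "\ufffd果"},
    ensure_ascii=False,
).encode("utf-8")
print(classify_frame(frame))
```

If frames captured directly from the service are classified as `contains-replacement-char`, that would be consistent with the replacement character having been introduced before your own decoding step. An `invalid-utf8` result would instead point at a component that is truncating or re-encoding bytes.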
Based on the available Azure OpenAI documentation, we do not have any guidance that specifically addresses U+FFFD replacement characters or mojibake appearing within Realtime API transcription or translation events, particularly when the decrypted WebSocket payload itself already contains the corrupted text.
The available documentation primarily covers:
- General Realtime API endpoint and event behavior.
- API version compatibility.
- UTF-8 encoding considerations for Batch (.jsonl) files.
- Other unrelated text encoding scenarios.
It does not describe this specific behavior for Realtime speech transcription or translation over WebSocket.
Although the documentation does not directly address this issue, you may want to verify the following:
Confirm the Realtime API endpoint and API version
Ensure you're using the supported GA (v1) Realtime API endpoint and the appropriate API version for your scenario.
Some Realtime API documentation notes that certain preview API versions did not support all expected streaming delta events. While your issue relates to character corruption rather than missing events, it's still worth confirming that your client is using the recommended endpoint and API version.
Verify UTF-8 handling throughout the pipeline
While the documented UTF-8 guidance applies primarily to Batch (.jsonl) workflows rather than WebSocket streaming, it's still worth verifying that every component in your processing pipeline consistently uses UTF-8 encoding without intermediate character set conversions.
Based on your investigation, you've already confirmed that the decrypted WebSocket payload contains the corrupted characters, making a frontend rendering issue less likely. However, validating consistent UTF-8 handling across the browser, backend bridge, logging mechanism, and any intermediate processing remains a useful diagnostic step.
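One thing to look for while validating the pipeline is any component that decodes byte chunks independently. When a multi-byte UTF-8 character is split across two chunks, neither half decodes on its own, and this is a common cause of intermittent U+FFFD in streaming code. The sketch below illustrates the effect with Python's standard `codecs` incremental decoder. Whether your bridge or logger actually reads bytes in chunks is an assumption, and the function names and the sample text 結果 are illustrative only.

```python
import codecs

def decode_chunks_naively(chunks):
    # Decodes each chunk on its own; a character split across chunks is lost.
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)

def decode_chunks_incrementally(chunks):
    # The incremental decoder buffers an incomplete trailing sequence
    # until the remaining bytes arrive in a later chunk.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    parts = [decoder.decode(chunk) for chunk in chunks]
    parts.append(decoder.decode(b"", final=True))  # flush any leftover bytes
    return "".join(parts)

sample = "結果"                  # two characters, three bytes each in UTF-8
raw = sample.encode("utf-8")
chunks = [raw[:2], raw[2:]]      # split inside the first character

naive = decode_chunks_naively(chunks)
incremental = decode_chunks_incrementally(chunks)

print("naive contains U+FFFD:      ", "\ufffd" in naive)
print("incremental matches source: ", incremental == sample)
```

If any stage in your bridge reads from a socket or stream in fixed-size buffers and converts each buffer to text separately, switching that stage to an incremental decoder (or decoding only complete messages) is worth trying.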
Please refer to the following documentation:
Getting started with Azure OpenAI batch deployments (troubleshooting, UTF-8-BOM): https://learn.microsoft.com/azure/ai-foundry/openai/how-to/batch?wt.mc_id=knowledgesearch_inproduct_azure-cxp-community-insider#troubleshooting
Transparency note for Azure OpenAI (speech-to-text/translation limitations): https://learn.microsoft.com/azure/foundry/responsible-ai/openai/transparency-note?wt.mc_id=knowledgesearch_inproduct_azure-cxp-community-insider#limitations
I hope this helps. Please let me know if you have any further questions.
If this answers your question, please click Accept Answer and select Yes for "Was this answer helpful?".
Thank you!