We are currently investigating a spike in Azure Foundry usage observed during May 2026

Partho Sarkar 5 Reputation points

Hi,

We are investigating an unexpected usage spike on an Azure Foundry / Azure OpenAI endpoint during the May 2026 billing period, and we are trying to determine its root cause.

We have already reviewed the built-in Azure metrics, such as request count, token usage, latency, and response codes. These confirm the spike but do not provide enough request-level detail for root cause analysis.

We need to retrieve historical request-level logs or backend telemetry for an Azure OpenAI / Foundry endpoint for a period when diagnostic logging was not enabled. Where available and permitted, we would like to know whether the following can be retrieved through Azure Support or any other mechanism:

Request timestamps
Model deployment used
Application or caller identity metadata
Source region, network origin, or IP-related metadata
Request path and operation metadata
Request status, errors, retries, throttling, or abnormal usage patterns
Token usage at request level
Request/response payload details, if available and compliant with Azure data privacy policies

We understand that payload-level logs may be restricted due to privacy, compliance, retention, and diagnostic logging limitations. Any guidance on what historical data may still be available, and on the correct support path to request it, would be appreciated.

Thanks in advance.

Anshika Varshney 14,085 Reputation points Microsoft External Staff Moderator

2026-06-17T00:47:20.1133333+00:00
Hello @Partho Sarkar

Thank you for your question.

If diagnostic logging was not enabled during the period when the usage spike occurred, Azure OpenAI/Azure AI Foundry does not provide a way to retroactively retrieve detailed request-level logs such as caller identity, source IP, request payloads, or per-request token consumption through the Azure portal.

To help investigate the spike, please review the following:

Azure Monitor metrics (Requests, Tokens, Latency, Errors, Throttling) to identify when the increase began and whether it correlates with any operational anomalies.

Azure Cost Analysis and billing reports to determine which resources, deployments, or regions contributed most to the increase.

Azure AI Foundry deployments and quota configuration to verify whether any deployment, model version, routing, or quota changes were made around the time the spike started.

Activity Logs to check for control-plane changes such as deployment creation, updates, or key rotation.
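To make the "identify when the increase began" step concrete, here is a minimal sketch that scans daily token totals exported from Azure Monitor metrics and reports the first day that clearly exceeds the recent baseline. The function name `find_spike_onset`, the 3x threshold, the 7-day baseline window, and the sample figures are all assumptions for illustration, not values from this thread.

```python
from statistics import median

def find_spike_onset(daily_totals, factor=3.0, baseline_days=7):
    """Return the first date whose total exceeds `factor` times the median
    of the preceding `baseline_days` days, or None if no day qualifies.

    `daily_totals` maps ISO date strings (YYYY-MM-DD) to token counts.
    """
    dates = sorted(daily_totals)  # ISO date strings sort chronologically
    for i in range(baseline_days, len(dates)):
        window = [daily_totals[d] for d in dates[i - baseline_days:i]]
        baseline = median(window)
        if baseline > 0 and daily_totals[dates[i]] > factor * baseline:
            return dates[i]
    return None

# Hypothetical daily totals, as if exported from the metrics blade.
sample = {f"2026-05-{day:02d}": 100_000 for day in range(1, 11)}
sample["2026-05-11"] = 450_000
sample["2026-05-12"] = 500_000

onset = find_spike_onset(sample)
print("Spike appears to begin on:", onset)
```

Once a start date is known, the Activity Log and deployment history can be checked for changes in a narrow window around that date.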

If the available telemetry is insufficient for root-cause analysis, the recommended next step is to open an Azure Support request and provide:

Subscription ID

Resource name

Region

Exact timeframe (UTC)

Details of the observed spike

The support team can review whether any additional backend telemetry is available for investigation, subject to Microsoft's privacy, security, and retention policies. However, the public documentation does not guarantee the availability of historical request-level data when diagnostic logging was not enabled.

For future investigations, consider enabling Diagnostic Settings and sending logs to Log Analytics. This provides much richer visibility into usage patterns and operational events.
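As a rough illustration of what a diagnostic setting contains, the sketch below builds a request body in the general shape used by Azure Monitor diagnostic settings (a Log Analytics workspace ID plus lists of log and metric categories). The helper name `build_diagnostic_setting`, the placeholder workspace ID, and the log category names are assumptions; the categories actually offered should be checked on the resource's Diagnostic settings blade before relying on them.

```python
import json

def build_diagnostic_setting(workspace_resource_id, log_categories):
    """Build a diagnostic-setting body that routes the given log categories
    and all metrics to a Log Analytics workspace."""
    return {
        "properties": {
            "workspaceId": workspace_resource_id,
            "logs": [{"category": c, "enabled": True} for c in log_categories],
            "metrics": [{"category": "AllMetrics", "enabled": True}],
        }
    }

# Placeholder workspace ID; replace the bracketed parts with real values.
workspace_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"
)

# Assumed category names; verify against the categories listed for the resource.
payload = build_diagnostic_setting(workspace_id, ["Audit", "RequestResponse", "Trace"])
print(json.dumps(payload, indent=2))
```

The same body can be reviewed by the team before it is applied through the portal, CLI, or an infrastructure-as-code template.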

I hope this helps clarify the available options. Do let me know if you have any further queries.

Thank you!
Anshika Varshney 14,085 Reputation points Microsoft External Staff Moderator

2026-06-17T20:09:07.8433333+00:00

Hello @Partho Sarkar

Did you get a chance to review the response?

Thank you!
Partho Sarkar 5 Reputation points

2026-06-17T22:41:30.91+00:00

Hi @Anshika Varshney ,

Thanks for the prompt and detailed response. This is helpful.

We have already investigated Azure Monitor, Azure Cost Analysis and Azure AI Foundry. Based on that, we now have details regarding the specific endpoints, deployed models and timelines associated with the usage spikes.

At this stage, we are looking for deeper request-level details, especially request/response output or any backend telemetry that may help us further investigate the root cause.

Diagnostic logging was not enabled during this period, which is why we are reaching out to understand whether any additional backend-level telemetry may still be available through Microsoft/Azure Support, subject to applicable privacy, security and retention policies. Diagnostic logging has now been enabled for the resource and logs are being sent to Log Analytics for future analysis.

Please let us know if there is any alternate support path or escalation mechanism to request Microsoft/Azure Support to check for any retained backend telemetry for the affected period.

Thanks again!
Anshika Varshney 14,085 Reputation points Microsoft External Staff Moderator

2026-06-18T21:34:51.98+00:00
Hello @Partho Sarkar

Thank you for the update and for sharing the additional details.

Since diagnostic logging was not enabled during the period when the usage spike occurred, the request/response payloads and detailed inference logs would generally not be available within your subscription for retrospective analysis. Azure Monitor and Azure AI Foundry diagnostic logs can only capture data from the time logging is enabled onward.

Troubleshooting steps for ongoing monitoring

Since diagnostic logging has now been enabled, we recommend the following:

Verify that all Azure AI Foundry diagnostic categories are being sent to Log Analytics.

Monitor token consumption, request counts, and model-specific usage trends.

Correlate activity with application logs, API Management logs, App Service logs, or other client-side telemetry sources.

Review authentication activity and service principals accessing the resource to identify unexpected usage patterns.

Configure Azure Monitor alerts for unusual spikes in token usage, requests, or costs.

Regularly review Cost Analysis and Cost Management reports to identify anomalous spending early.

If multiple applications share the same Foundry resource, consider separating workloads across deployments or resources to improve attribution.
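For the attribution step, a minimal sketch of how exported log rows could be grouped by caller and deployment to see who drives token consumption. The field names `caller`, `deployment`, and `total_tokens` are hypothetical; the actual column names depend on the log table and on what the applications or gateway record.

```python
from collections import defaultdict

def tokens_by_caller(rows):
    """Sum total tokens per (caller, deployment) pair, largest first."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["caller"], row["deployment"])] += row["total_tokens"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical rows, as if exported from Log Analytics or application logs.
rows = [
    {"caller": "app-a", "deployment": "chat-prod", "total_tokens": 1_200},
    {"caller": "app-b", "deployment": "chat-prod", "total_tokens": 300},
    {"caller": "app-a", "deployment": "chat-prod", "total_tokens": 2_500},
    {"caller": "batch-job", "deployment": "embed-prod", "total_tokens": 900},
]

ranking = tokens_by_caller(rows)
for (caller, deployment), tokens in ranking:
    print(f"{caller:10s} {deployment:12s} {tokens}")
```

A ranking like this only works if each workload is identifiable, which is one reason to separate workloads across deployments or resources.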

Based on the information provided, pursuing a Microsoft Support investigation through your Azure point of contact remains the most appropriate path to determine whether any relevant backend telemetry is still available for the affected timeframe.

Please keep us posted on the outcome, and we'll be happy to assist further if additional information becomes available.

Thank you!
Anshika Varshney 14,085 Reputation points Microsoft External Staff Moderator

2026-06-19T22:01:09.4933333+00:00

Hello @Partho Sarkar

Did you get a chance to review the response?

Thank you!
Anshika Varshney 14,085 Reputation points Microsoft External Staff Moderator

2026-06-22T18:51:21.43+00:00

Hello @Partho Sarkar

We haven’t heard from you on the last response and were just checking back to see if you have a resolution yet. If you do, please share it with the community, as it can be helpful to others. Otherwise, please respond with more details and we will try to help.

Thank you!
Partho Sarkar 5 Reputation points

2026-06-22T20:39:04.7233333+00:00

Hi @Anshika Varshney ,

I just shared my response as a comment on Alex's answer below. Sharing it again here.

The preventive steps you mentioned make sense, and I have already shared them with our team to ensure we have the right diagnostic logging, monitoring and tracing in place going forward.

For this specific case, we are trying to retrieve as much historical detail or backend telemetry as possible for the affected May 2026 period, even if full request/response payloads are not available. The challenge we are facing now is that we are currently unable to raise the Azure Support request. It keeps redirecting us and does not allow us to complete the ticket creation. At best, I was able to post it here in Microsoft Q&A.

It would be helpful if you could guide us on how to proceed or escalate this so that a support case can be created.

Thank you !!!
Anshika Varshney 14,085 Reputation points Microsoft External Staff Moderator

2026-06-23T21:12:35.8266667+00:00

Hello @Partho Sarkar

Thank you for bringing this to our attention. I have escalated this issue to our internal team, who are currently reviewing the details. The team is actively working on the investigation, and we will update you as soon as we have more information.

Thank you for your patience.

Answer 1

Hi Partho Sarkar, and thanks for sharing your issue here on the Q&A portal.

You won’t be able to reconstruct full request-level history if diagnostic logging wasn’t enabled during May. Azure metrics can confirm the spike, but they’re aggregated. They usually won’t give you the full per-request trail with caller identity, source IP, request path, deployment, status, and token usage after the fact. Azure OpenAI / Foundry metrics are mainly for aggregated usage, latency, tokens, errors, and safety signals: https://learn.microsoft.com/en-us/azure/foundry/openai/monitor-openai-reference. For request-level logging, the safer design is to enable diagnostic settings before the incident and send logs to Log Analytics, Storage, Event Hub, or another SIEM. If it wasn’t enabled at the time, Azure Support may be able to check limited backend/service telemetry, but I wouldn’t expect them to provide full historical request logs or payloads. Payload-level data is especially unlikely because of privacy and retention rules.

I’d open a support case under Azure OpenAI / Azure AI Foundry billing or usage investigation and ask specifically whether backend telemetry exists for the May 2026 billing window for that resource ID and deployment. I’d include the resource ID, region, endpoint name, deployment names, subscription ID, billing period, UTC time range, and screenshots of the usage spike. For future RCA, I’d put the endpoint behind API Management or an app gateway layer and log caller/app identity, request timestamp, deployment name, token usage, status code, retry count, and client IP where policy allows it. https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-llm-logs
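To show what such a logging layer might record per call, here is a minimal sketch of a usage record serialized as one JSON log line. This is not API Management policy syntax; the `UsageRecord` class, its field names, and the `log_client_ip` switch are hypothetical and only mirror the fields listed above.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional

@dataclass
class UsageRecord:
    timestamp_utc: str
    caller_id: str
    deployment: str
    prompt_tokens: int
    completion_tokens: int
    status_code: int
    retry_count: int = 0
    client_ip: Optional[str] = None

def to_log_line(record, log_client_ip=False):
    """Serialize a record as one JSON line, dropping the client IP
    unless policy explicitly allows logging it."""
    data = asdict(record)
    if not log_client_ip:
        data.pop("client_ip")
    data["total_tokens"] = record.prompt_tokens + record.completion_tokens
    return json.dumps(data, sort_keys=True)

# Hypothetical call, as the gateway layer might record it.
record = UsageRecord(
    timestamp_utc=datetime.now(timezone.utc).isoformat(),
    caller_id="app-a",
    deployment="chat-prod",
    prompt_tokens=850,
    completion_tokens=150,
    status_code=200,
    retry_count=1,
    client_ip="203.0.113.10",
)

line = to_log_line(record)
print(line)
```

With one line like this per request, a later spike can be broken down by caller, deployment, status code, and retries without needing the payload itself.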

From what you described, the built-in metrics are probably the best self-service historical data you have right now. Anything deeper would need Microsoft Support, and even then it may be limited.

Regards,

Alex

If my answer was helpful, please mark it, and additional thanks if you follow me on the Q&A portal.

Partho Sarkar 5 Reputation points

2026-06-22T18:20:12.1066667+00:00

Hi Alex,

This is helpful.

The preventive steps you mentioned make sense, and I have already shared them with our team to ensure we have the right diagnostic logging, monitoring and tracing in place going forward.

For this specific case, we are trying to retrieve as much historical detail or backend telemetry as possible for the affected May 2026 period, even if full request/response payloads are not available. The challenge we are facing now is that we are currently unable to raise the Azure Support request. It keeps redirecting us and does not allow us to complete the ticket creation. At best, I was able to post it here in Microsoft Q&A.

It would be helpful if @Anshika Varshney or you could guide us on how to proceed or escalate this so that a support case can be created.

We are also in touch with our Microsoft Azure point of contact for the same.

Thank you !!!