Unable to deploy GPT-4o / GPT-4.1 models in any US region — replacements also marked as deprecating

Wang Wang 5 Reputation points
2026-06-29T14:31:40.8166667+00:00

Hi,

We are no longer able to deploy gpt-4o or gpt-4.1 models in any capacity in US regions. It also appears that the nano/mini variants are deprecated as well.

We are receiving the following error in Azure OpenAI Foundry:

ServiceModelDeprecating: The model 'Format:OpenAI,Name:gpt-4o,Version:2024-11-20' is in deprecating state and cannot be used for new deployments.

These models were previously announced as retiring in October, but they appear to have entered the deprecating state today instead.

What is especially confusing is that Microsoft documentation suggests moving to models such as gpt-4.1-mini, but those models also seem to be unavailable for new deployments and return the same type of error:

ServiceModelDeprecating: The model 'Format:OpenAI,Name:gpt-4.1-mini,Version:2025-04-14' is in deprecating state and cannot be used for new deployments.

This creates a difficult situation, because the recommended replacement models in the documentation also appear to be deprecated: https://learn.microsoft.com/en-us/azure/foundry/openai/concepts/model-retirement-schedule

Our business relies heavily on these models, and currently there do not appear to be equivalent alternatives available in US regions at a comparable price point.

For example, gpt-5 does not appear to be available under Standard deployment in the US regions we checked, unlike GPT-4o previously, which results in a significant cost increase — roughly 20x higher for our use case.

Is anyone able to clarify:

  • Whether this is expected behavior across all US regions
  • Which currently supported models are the intended replacements for gpt-4o and gpt-4.1-mini
  • Whether there is any Standard deployment alternative in US regions at a similar pricing level
  • Whether the documentation will be updated to reflect actual availability

Any guidance would be greatly appreciated, as this is affecting production planning for our business.

Thanks.

Azure OpenAI in Foundry Models

2 answers

  1. Wang Wang 5 Reputation points
    2026-06-30T10:10:23.9133333+00:00

    @Sangeetha Kesavan

    Hi Sangeetha,

    Thank you for the detailed reply. However, there appears to be a contradiction I'd like to clarify.

    Your response states that gpt-4o (2024-11-20) and gpt-4o-mini (2024-07-18) are in the "Deprecating" state, which is why new deployments are blocked.

    But the current official retirement schedule clearly lists both of these specific versions with a Lifecycle status of "GA", not "Deprecated" or "Deprecating":

    • gpt-4o (2024-11-20) → GA, retires 2026-10-01
    • gpt-4o-mini (2024-07-18) → GA, retires 2026-10-01

    (Screenshot from https://learn.microsoft.com/en-us/azure/foundry/openai/concepts/model-retirement-schedule attached)

    Per Microsoft's own definition, a GA model should be available for new Standard deployments until it transitions to Deprecated. These versions are not scheduled for retirement until October 2026, yet the system is returning a ServiceModelDeprecating error today.

    Could you please clarify:

    1. Why are GA-status models being blocked from new deployments well ahead of their documented retirement date?
    2. Is this a backend/configuration issue, since it directly contradicts the published lifecycle status?
    3. If this is intentional, when will the documentation be corrected to reflect the actual state?

    We need this clarified because our deployment planning relies on the documented GA status. Thank you.


  2. Sangeetha Kesavan 20 Reputation points
    2026-06-29T17:03:45.79+00:00

    Hello Wang Wang,

    Greetings! Thanks for raising this question in the Q&A forum.

    The ServiceModelDeprecating error you are seeing is expected behavior, but the timing is earlier than the documentation previously indicated. Here is what is happening and what your options are.

    When a model version enters the "Deprecating" state in Microsoft Foundry, new deployments of that version are blocked, but existing deployments continue to serve inference until the actual retirement date. The current retirement schedule lists the models you mentioned as follows:

    • gpt-4o (2024-11-20): GA, retires October 1, 2026, replacement listed as gpt-5.1
    • gpt-4.1-mini (2025-04-14): GA, retires October 14, 2026
    • gpt-4.1 (2025-04-14): GA, retires October 14, 2026

    The fact that new deployments are being blocked today, ahead of those retirement dates, indicates these model versions have been moved into the deprecating stage. Legacy and Deprecated transitions follow the published timeline and are visible in real time via the Models API. This is by design — the deprecating stage gates new deployments while allowing existing ones to continue running.
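As a rough illustration of the Models API check mentioned above, the sketch below filters a model listing for versions that would reject new deployments. The payload shape, the `lifecycleStatus` field name, and the status strings are assumptions made for illustration, and `SAMPLE_MODELS` is hand-written sample data rather than real API output; compare them with the actual response for your subscription.

```python
# Illustrative payload only: the field names and status strings below are
# assumptions, not a verified copy of the Models API response.
SAMPLE_MODELS = [
    {"model": {"name": "gpt-4o", "version": "2024-11-20",
               "lifecycleStatus": "Deprecating"}},
    {"model": {"name": "gpt-5-mini", "version": "2025-08-07",
               "lifecycleStatus": "GenerallyAvailable"}},
]

# Statuses assumed to block new deployments.
BLOCKED_STATUSES = {"Deprecating", "Deprecated"}


def blocked_for_new_deployments(models):
    """Return (name, version) pairs whose lifecycle status blocks new deployments."""
    return [
        (entry["model"]["name"], entry["model"]["version"])
        for entry in models
        if entry["model"].get("lifecycleStatus") in BLOCKED_STATUSES
    ]


print(blocked_for_new_deployments(SAMPLE_MODELS))
```

Running a check like this before a rollout surfaces a blocked version earlier than a failed deployment would.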

    To answer your specific questions directly:

    Is this expected across all US regions? Yes. The deprecating state is applied globally at the model version level, not per region. If gpt-4o (2024-11-20) and gpt-4.1-mini (2025-04-14) are in the deprecating state, no new Standard deployments of those versions will succeed in any region.

    What are the intended replacements? Per the current retirement schedule, the documented replacement for gpt-4o is gpt-5.1. For gpt-4.1-mini, there is no separate replacement listed in the schedule, which means the intent is migration to the gpt-5 model family more broadly.

    Is there a Standard deployment alternative at a comparable price point? Yes, and this is important. The gpt-5 family includes cost tiers that are comparable to or cheaper than gpt-4o and gpt-4.1-mini. The most cost-efficient option, GPT-5-nano, costs $0.05 per million input tokens and $0.40 per million output tokens, while the flagship GPT-5 model runs $1.25/$10.00 per million tokens. For comparison, gpt-4o was priced at $2.50 input and $10.00 output. So gpt-5-mini and gpt-5-nano are substantially cheaper than gpt-4o, not more expensive. The cost concern you raised about gpt-5 likely applies to the flagship gpt-5 model, not to the mini and nano variants. Please verify current rates on the Azure pricing page as they are updated frequently: https://azure.microsoft.com/en-us/pricing/details/azure-openai/
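To make the price comparison concrete, here is a small sketch that estimates cost from the per-million-token rates quoted above. The workload of 100M input and 20M output tokens is a hypothetical figure, and the rates should be checked against the pricing page before being used for planning.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the figures quoted in
# this answer. Verify them against the Azure pricing page before relying on them.
PRICES_PER_MILLION = {
    "gpt-4o": (Decimal("2.50"), Decimal("10.00")),
    "gpt-5": (Decimal("1.25"), Decimal("10.00")),
    "gpt-5-nano": (Decimal("0.05"), Decimal("0.40")),
}


def estimated_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost for the given token volumes."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)


# Hypothetical monthly workload: 100M input tokens and 20M output tokens.
for name in PRICES_PER_MILLION:
    print(name, estimated_cost(name, 100_000_000, 20_000_000))
```

Using `Decimal` rather than floats keeps the currency arithmetic exact.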

    To proceed, follow these steps:

    Deploy gpt-5-mini or gpt-5-nano as replacements for gpt-4.1-mini — Navigate to your Azure AI Foundry resource, go to Deployments, and select Deploy model. Search for gpt-5-mini (version 2025-08-07) or gpt-5-nano (version 2025-08-07) and create a Standard deployment. gpt-5-mini and gpt-5-nano are both listed as GA with retirement dates of 2027-02-06. These are the cost-tier equivalents of the mini/nano model class and are available on Standard SKU.

    Deploy gpt-5.1 as the replacement for gpt-4o — Per the official retirement schedule, gpt-5.1 (version 2025-11-13) is the documented replacement for gpt-4o across US regions. When gpt-4o versions 2024-05-13 and 2024-08-06 retired on March 31, 2026, they were auto-upgraded to gpt-5.1 on the Standard SKU, which was added to all eight US regions that previously had those gpt-4o versions including centralus, eastus, eastus2, northcentralus, southcentralus, westus, and westus3. Try deploying gpt-5.1 in those same regions first.

    Try Global Standard if regional Standard shows no availability — If a specific model does not appear in your regional Standard deployment list, switch the deployment type to Global Standard. Global Standard has the broadest model availability and will route requests across Azure's global pool. This resolves most "model not available in this region" situations without requiring a region change.

    Check your quota tier for newer models — Some newer GPT-5 series models require a quota tier upgrade before they appear as deployable. In Azure AI Foundry, go to Quotas and check whether gpt-5-mini, gpt-5-nano, or gpt-5.1 have allocated quota in your subscription. If not, submit a quota increase request through the Quotas blade for the relevant model and region.

    Validate against the region availability matrix — For the definitive current list of which models are available in which regions and under which deployment type, consult the region availability reference: https://learn.microsoft.com/en-us/azure/foundry/openai/concepts/models

    Raise a support ticket if gpt-5.1 Standard is not available in your subscription — If gpt-5.1 does not appear in your deployment options despite being the documented auto-upgrade replacement for gpt-4o, open a support request under Issue type: Technical, Service: Azure OpenAI, and describe that the documented replacement model for deprecated gpt-4o is not available for new Standard deployments in your subscription and region. Include your subscription ID, region, and the specific model versions you attempted to deploy.
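For scripted deployments, the Standard-then-Global-Standard order described above could be sketched as follows. This only builds request bodies and makes no network call; the body layout, the SKU names `Standard` and `GlobalStandard`, and the capacity value are assumptions to verify against the deployment API reference, and `deployment_body` and `candidate_bodies` are hypothetical helper names.

```python
def deployment_body(model_name, model_version, sku_name="Standard", capacity=1):
    """Build a deployment request body (assumed layout, not verified here)."""
    return {
        "sku": {"name": sku_name, "capacity": capacity},
        "properties": {
            "model": {
                "format": "OpenAI",
                "name": model_name,
                "version": model_version,
            }
        },
    }


def candidate_bodies(model_name, model_version, capacity=1):
    """Return bodies in fallback order: regional Standard, then Global Standard."""
    return [
        deployment_body(model_name, model_version, sku_name, capacity)
        for sku_name in ("Standard", "GlobalStandard")
    ]


# Model name and version taken from the steps above.
for body in candidate_bodies("gpt-5-mini", "2025-08-07"):
    print(body["sku"]["name"], body["properties"]["model"])
```

A caller would submit the first body and move on to the second only if the regional Standard attempt is rejected for availability reasons.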

    If this answer helps, please accept it so that others with similar questions can find it.

    Best Regards,

    Sangeetha K
