Subject: GA-status models (gpt-4o 2024-11-20 / gpt-4o-mini 2024-07-18) are blocked for new deployments — this directly contradicts Microsoft's own documentation

Wang Wang 0 Reputation points
2026-06-30T10:37:55.37+00:00

We are unable to create new deployments of gpt-4o (2024-11-20) or gpt-4o-mini (2024-07-18) in any capacity, and the error returned directly contradicts Microsoft's own published lifecycle documentation. We need clarification, because this is impacting a production workload and our migration planning.

Here is the contradiction laid out precisely, with references to the official docs.


1. The blocked models are documented as GA — not Deprecated.

Per the official Model retirement schedule (https://learn.microsoft.com/en-us/azure/foundry/openai/concepts/model-retirement-schedule?view=foundry-classic), the exact versions we are trying to deploy are listed as:

Model        Version     Lifecycle  Retirement date
gpt-4o       2024-11-20  GA         2026-10-01
gpt-4o-mini  2024-07-18  GA         2026-10-01

These are GA, not "Deprecated" and not "Legacy." Their retirement date is 2026-10-01, about three months away.


2. Microsoft's lifecycle policy states GA models can be deployed by new customers.

Per the Model lifecycle and support policy (https://learn.microsoft.com/en-us/azure/foundry/openai/concepts/model-retirements?view=foundry-classic), the lifecycle stage table explicitly states:

  • Generally Available (GA): "Can create new deployments? Yes"
  • Only the Deprecated stage blocks new customers, and that stage begins "at 12 months from launch."

By Microsoft's own rules, a GA model must be deployable by new customers until it transitions to Deprecated. These versions have not transitioned — the schedule still lists them as GA.


3. The error we receive maps to the Deprecated stage — not GA.

The lifecycle documentation contains this API-to-portal terminology mapping:

API lifecycleStatus: "Deprecating" = portal/docs stage "Deprecated" = "Still serves inference. Blocked for new customers."

The error returned is:

ServiceModelDeprecating: The model 'Format:OpenAI,Name:gpt-4o,Version:2024-11-20' is in deprecating state and cannot be used for new deployments.

This means the backend is reporting these versions as Deprecated (blocked for new customers) — while the public retirement schedule simultaneously lists them as GA.

Both states cannot be true at once. Either:

  • (a) the backend is incorrectly flagging GA models as Deprecated — a bug; or
  • (b) the documentation is wrong and has not been updated to reflect the real state.

Either way, customers are making production decisions based on a GA status that does not reflect reality.
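
The mismatch described in point 3 can be checked mechanically. The following is a minimal Python sketch, not a Microsoft tool. It parses the error text quoted above, applies the single API-to-portal mapping quoted from the lifecycle documentation, and compares the result with the schedule rows quoted in point 1. The names `parse_blocked_model`, `find_contradiction`, `API_TO_PORTAL_STAGE`, and `PUBLISHED_STAGE` are hypothetical, and the regular expression assumes the error message always has the shape shown in this post.

```python
import re

# The only mapping quoted in this post; no other lifecycleStatus values are assumed.
API_TO_PORTAL_STAGE = {"Deprecating": "Deprecated"}

# Schedule rows quoted in point 1: (model name, version) -> published stage.
PUBLISHED_STAGE = {
    ("gpt-4o", "2024-11-20"): "GA",
    ("gpt-4o-mini", "2024-07-18"): "GA",
}

# Assumption: the error text always follows the format quoted in this post.
ERROR_PATTERN = re.compile(
    r"ServiceModel(?P<status>\w+): The model "
    r"'Format:(?P<format>[^,]+),Name:(?P<name>[^,]+),Version:(?P<version>[^']+)'"
)


def parse_blocked_model(message):
    """Extract status, format, name and version from the error text, or None."""
    match = ERROR_PATTERN.search(message)
    return match.groupdict() if match else None


def find_contradiction(message):
    """Describe the mismatch if the backend stage differs from the published one."""
    parsed = parse_blocked_model(message)
    if parsed is None:
        return None
    backend_stage = API_TO_PORTAL_STAGE.get(parsed["status"])
    published = PUBLISHED_STAGE.get((parsed["name"], parsed["version"]))
    if backend_stage and published and backend_stage != published:
        return (
            f"{parsed['name']} {parsed['version']}: "
            f"schedule says {published}, backend says {backend_stage}"
        )
    return None
```

Passing the gpt-4o error string from this post to `find_contradiction` yields a message that the schedule says GA while the backend says Deprecated. A model that is absent from `PUBLISHED_STAGE` yields `None`, because the sketch has no published stage to compare against.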


4. If this is regional "new customer" limiting, it needs to be stated transparently.

The lifecycle policy includes the clause: "Microsoft can limit new customers in specific regions to maintain service quality for existing customers."

If that clause is being applied here, it is not reflected anywhere in the GA status shown in the retirement schedule. Showing a model as "GA" while silently blocking new deployments in a region is not acceptable transparency for a production service.


5. The documented replacement chain is itself broken.

The retirement schedule lists the replacement for gpt-4o-mini as gpt-4.1-mini — but gpt-4.1-mini is also marked Deprecated (retires 2026-10-14) and is also blocked with the same ServiceModelDeprecating error:

ServiceModelDeprecating: The model 'Format:OpenAI,Name:gpt-4.1-mini,Version:2025-04-14' is in deprecating state and cannot be used for new deployments.

Recommending a deprecated, non-deployable model as the official replacement for another model makes it impossible to plan a migration based on the schedule.
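
To make the broken chain concrete, here is a small Python sketch that follows documented replacements until it reaches a model that is not blocked. It is an illustration under stated assumptions: `first_deployable`, `REPLACEMENT`, and `BLOCKED` are hypothetical names, the only replacement edge included is the one quoted above, the blocked set contains only the models whose errors are reported in this post, and models are keyed by name only (versions are ignored for brevity).

```python
# Replacement edge quoted in this post; no other edges are assumed.
REPLACEMENT = {"gpt-4o-mini": "gpt-4.1-mini"}

# Models reported in this post as returning ServiceModelDeprecating.
BLOCKED = {"gpt-4o", "gpt-4o-mini", "gpt-4.1-mini"}


def first_deployable(model, replacement, blocked):
    """Walk the replacement chain and return the first model that is not blocked.

    Returns None when the chain ends (or loops) before reaching such a model.
    """
    seen = set()
    current = model
    while current is not None and current not in seen:
        if current not in blocked:
            return current
        seen.add(current)
        current = replacement.get(current)
    return None


result = first_deployable("gpt-4o-mini", REPLACEMENT, BLOCKED)
```

With the data above, `result` is `None`: the chain gpt-4o-mini to gpt-4.1-mini ends on a blocked model, which is exactly why the schedule cannot be used for migration planning as published.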


What we need clarified:

  1. Are gpt-4o (2024-11-20) and gpt-4o-mini (2024-07-18) being treated as Deprecated (existing-customers-only) at the backend, despite being published as GA?
  2. If yes — under what clause, and why is the public schedule not updated to reflect it?
  3. If this is "new customer limiting in specific regions," please confirm it in writing and state which regions are affected.
  4. Is there any path for a subscription to gain new-deployment access to these GA-listed versions, or has access been revoked ahead of the documented retirement date?
  5. When will the schedule be corrected so it no longer shows misleading GA status for versions that cannot actually be deployed?
  6. Given that the listed replacement (gpt-4.1-mini) is itself deprecated and blocked, what is the actual deployable replacement at a comparable price point?

This is affecting a live production workload. We would appreciate this being escalated to the relevant product team if the GA-vs-Deprecated discrepancy cannot be resolved directly.

Thank you.

Foundry Models

1 answer

  1. Sina Salam 30,486 Reputation points Volunteer Moderator
    2026-06-30T15:28:56.6066667+00:00

    Hello Wang Wang,

    Welcome to Microsoft Q&A, and thank you for posting your question here.

    I understand that you are unable to create new Azure OpenAI deployments for gpt-4o 2024-11-20 and gpt-4o-mini 2024-07-18, even though the public Microsoft model retirement schedule still lists them as GA.

    The ServiceModelDeprecating error comes from Azure OpenAI backend lifecycle enforcement. It means the selected model version is being treated as blocked for new deployments in that subscription, region, or deployment type. It is not caused by the SDK, the portal, or your code. You are also right to challenge the contradiction: the lifecycle policy says new deployments can be created for GA models, while Deprecated models are blocked for new customers and existing deployments continue to work. The observed behavior is therefore a service-side lifecycle/availability mismatch, and the practical way forward is migration plus escalation to Microsoft Support. References: https://learn.microsoft.com/en-us/azure/foundry/openai/concepts/model-retirements and https://learn.microsoft.com/en-us/azure/foundry/openai/concepts/model-retirement-schedule.

    I recommend the following:

    • Do not delete existing working deployments of gpt-4o 2024-11-20 or gpt-4o-mini 2024-07-18.
    • Do not continue retrying the same blocked model/version, because ServiceModelDeprecating cannot be bypassed client-side.
    • Validate live model availability in the target Azure region and subscription.
    • Migrate gpt-4o 2024-11-20 workloads to gpt-5.1 where available, because Microsoft lists gpt-5.1 as the replacement for gpt-4o.
    • Do not rely on gpt-4.1-mini as the replacement for gpt-4o-mini if it is blocked, because the current official schedule also lists gpt-4.1-mini 2025-04-14 as Deprecated.
    • Use a currently deployable GA mini-class replacement, such as gpt-5-mini or gpt-5-nano, after confirming live availability in the required region and deployment type.
    • Open an Azure Support case from the Azure portal, or use https://learn.microsoft.com/en-us/azure/azure-portal/supportability/priority-community-support, to confirm whether the root cause is backend metadata, a regional restriction, subscription-level eligibility, or a documentation mismatch.
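
Two of the recommendations above (stop retrying the blocked version, and validate live availability before choosing a replacement) can be sketched in Python. This is a hedged illustration, not an official client: `is_permanent_block` and `deployable_candidates` are hypothetical helpers, and the listing format (a list of dicts with `name` and `lifecycleStatus` keys) is an assumption about what a model-listing call for your region and subscription returns. The sample entries use placeholder names and made-up statuses.

```python
# Error codes that should not be retried; only the code from this thread is listed.
NON_RETRYABLE_CODES = {"ServiceModelDeprecating"}

# Assumed status values that indicate a model is blocked for new deployments.
BLOCKED_STATUSES = {"Deprecating", "Deprecated"}


def is_permanent_block(error_code):
    """True when retrying the same model/version cannot succeed client-side."""
    return error_code in NON_RETRYABLE_CODES


def deployable_candidates(listing, candidates):
    """Return candidate names whose listing entry is not flagged as blocked.

    `listing` is assumed to be a list of dicts with "name" and
    "lifecycleStatus" keys; candidates missing from the listing are skipped.
    """
    available = []
    for entry in listing:
        name = entry.get("name")
        status = entry.get("lifecycleStatus")
        if name in candidates and status not in BLOCKED_STATUSES:
            available.append(name)
    return available


# Placeholder data for illustration only; not real model statuses.
sample_listing = [
    {"name": "model-a", "lifecycleStatus": "Deprecating"},
    {"name": "model-b", "lifecycleStatus": "GenerallyAvailable"},
]
usable = deployable_candidates(sample_listing, {"model-a", "model-b", "model-c"})
```

In practice you would replace `sample_listing` with the live listing for the target region and pass the replacement names you are considering as `candidates`, then deploy only from the returned list.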

    For further reading, see the official resources linked above.

    I hope this is helpful. Please accept the answer if it resolves the issue so the thread can help others facing the same.
