Billing for Deepseek v4 Pro on Azure AI foundry is 4.5x to what's present in Deepseek docs.

Anirudh Reddy Rachamalla 0 Reputation points
2026-06-24T13:52:40.46+00:00

Model deployment: DeepSeek-V4-Pro

Billing period checked: 2026-06-15 through 2026-06-21

Observed effective rates (from Cost Management Query API, INR converted at approximately INR 85 per USD): -
Input: INR 166.5 per 1M tokens ~= USD 1.96 per 1M - Output: INR 333.0 per 1M tokens ~= USD 3.92 per 1M

DeepSeek's published rates for the same model (deepseek-v4-pro, https://api-docs.deepseek.com): -
Input (cache miss): USD 0.435 per 1M - Input (cache hit): USD 0.003625 per 1M - Output: USD 0.87 per 1M

The billed rate is approximately 4.5x DeepSeek's published rate on both input and output. Can anyone please confirm if this is expected? I was expecting it to be 1:1

Volume context: approximately 116M tokens per week, equating to roughly INR 100K per month at current rates.

Foundry Models
Foundry Models

A catalog of AI models in Microsoft Foundry that you can discover, compare, and deploy using Azure’s built‑in tools for evaluation, fine‑tuning, and inference


2 answers

Sort by: Most helpful
  1. SRILAKSHMI C 19,550 Reputation points Microsoft External Staff Moderator
    2026-06-25T08:24:32.0366667+00:00

    Hello @Anirudh Reddy Rachamalla

    Thank you for reaching out to Microsoft Q&A and for providing the detailed billing analysis.

    I understand your concern regarding the effective token pricing you're observing for DeepSeek-V4-Pro in Azure AI Foundry. Based on your calculations, the effective rates appear to be approximately:

    • Input Tokens: ~USD 1.96 per 1M tokens

    Output Tokens: ~USD 3.92 per 1M tokens

    compared to the pricing published directly by DeepSeek:

    Input (Cache Miss): USD 0.435 per 1M tokens

    Input (Cache Hit): USD 0.003625 per 1M tokens

    Output: USD 0.87 per 1M tokens

    This results in an effective cost approximately 4–5 times higher than the rates published on DeepSeek's website.

    Understanding the pricing difference

    One important consideration is that the pricing published by DeepSeek applies to requests made directly against DeepSeek's own platform and API endpoints.

    When DeepSeek models are consumed through Azure AI Foundry, the service is delivered as a Microsoft-managed Azure offering and is billed according to Azure AI Foundry pricing meters rather than the model provider's direct pricing structure.

    Azure-hosted model offerings may include platform capabilities such as:

    Azure-hosted infrastructure and compute resources

    Enterprise security and compliance controls

    Microsoft Entra ID authentication and RBAC integration

    Azure networking and governance capabilities

    Monitoring, diagnostics, and cost management integration

    Azure support and service availability commitments

    Because of these additional platform services, pricing for third-party models hosted through Azure AI Foundry may differ from the pricing published by the original model provider.

    At this time, we cannot definitively confirm that the approximately 4.5x difference you are observing is expected for your specific deployment.

    The publicly available documentation does not provide:

    A documented pricing multiplier between DeepSeek's native API pricing and Azure-hosted DeepSeek pricing.

    Details regarding any Azure-specific markup or cost adjustments.

    Information on whether token accounting differs from DeepSeek's direct platform.

    Clarification on cache-hit versus cache-miss billing behavior within Azure AI Foundry.

    For this reason, while a pricing difference between Azure AI Foundry and DeepSeek's direct API pricing is generally expected, we cannot validate the precise multiplier based solely on the information currently available.

    Additional factors that may affect the comparison

    When comparing Azure charges against DeepSeek's published pricing, it is important to verify that both calculations are based on equivalent metrics, including:

    Cache-hit versus cache-miss token usage

    Input versus output token accounting

    Deployment type (Serverless, Managed, Provisioned, etc.)

    Azure region hosting the deployment

    Currency conversion methodology

    Azure billing agreement or negotiated pricing

    Cost Management aggregation and reporting intervals

    For example, DeepSeek publishes separate rates for cached and non-cached requests. If Azure AI Foundry does not expose the same cache-based billing model for this deployment type, a direct comparison may not be equivalent.

    Please refer this

    Azure AI Foundry Pricing: https://azure.microsoft.com/pricing/details/ai-foundry/

    DeepSeek in Azure AI Foundry: https://learn.microsoft.com/azure/ai-foundry/how-to/deploy-models-deepseek

    I Hope this helps. Do let me know if you have any further queries.


    If this answers your query, please do click Accept Answer and Yes for was this answer helpful.

    Thank you!

    Was this answer helpful?

    0 comments No comments

  2. Alex Burlachenko 23,170 Reputation points MVP Volunteer Moderator
    2026-06-24T14:28:51.7766667+00:00

    Anirudh Reddy Rachamalla hi, thx for sharing urs issue here at Q&A portal,

    Yep, that looks expected for Azure pricing. Azure AI Foundry pricing doesn’t have to match DeepSeek’s direct API price 1:1. It’s a separate marketplace/provider route, with Azure hosting, billing, networking, compliance, support, regional/data-zone setup, etc baked in. The numbers u calculated are actually very close to Microsoft’s public Foundry price for DeepSeek V4 Pro. Azure shows DeepSeek-V4 Pro around $1.74 input / $3.48 output per 1M tokens for Global, and Fireworks-hosted DeepSeek V4 Pro Data Zone around $1.925 input / $3.828 output per 1M tokens. So ur effective $1.96 / $3.92 lines up w/ Azure, not w/ DeepSeek direct API pricing. So the mismatch is prob not a billing bug. It’s more likely that u are comparing DeepSeek direct API rates vs Azure AI Foundry rates.

    Worth checking the exact meter name in Cost Management too. If it says Fireworks / Data Zone / Foundry model meter, then those Azure rates are the ones that apply. Cache-hit pricing from DeepSeek’s own API docs may not carry over unless Azure explicitly exposes the same cached-input meter for that model/deployment.

    If cost is the main issue, best move is compare Global vs Data Zone pricing, check if a cheaper DeepSeek SKU like V4 Flash works for ur workload, or talk to Azure sales/support for committed-use / enterprise pricing. But I wouldn’t expect Azure to bill DeepSeek direct API prices automatically.

    rgds,

    Alex

    &

    If my answer was helpful pls mark it and additional thx if u follow me at Q&A portal
    

    Was this answer helpful?

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.