A catalog of AI models in Microsoft Foundry that you can discover, compare, and deploy using Azure’s built‑in tools for evaluation, fine‑tuning, and inference
Hello @Anirudh Reddy Rachamalla
Thank you for reaching out to Microsoft Q&A and for providing the detailed billing analysis.
I understand your concern regarding the effective token pricing you're observing for DeepSeek-V4-Pro in Azure AI Foundry. Based on your calculations, the effective rates appear to be approximately:
- Input Tokens: ~USD 1.96 per 1M tokens
Output Tokens: ~USD 3.92 per 1M tokens
compared to the pricing published directly by DeepSeek:
Input (Cache Miss): USD 0.435 per 1M tokens
Input (Cache Hit): USD 0.003625 per 1M tokens
Output: USD 0.87 per 1M tokens
This results in an effective cost approximately 4–5 times higher than the rates published on DeepSeek's website.
Understanding the pricing difference
One important consideration is that the pricing published by DeepSeek applies to requests made directly against DeepSeek's own platform and API endpoints.
When DeepSeek models are consumed through Azure AI Foundry, the service is delivered as a Microsoft-managed Azure offering and is billed according to Azure AI Foundry pricing meters rather than the model provider's direct pricing structure.
Azure-hosted model offerings may include platform capabilities such as:
Azure-hosted infrastructure and compute resources
Enterprise security and compliance controls
Microsoft Entra ID authentication and RBAC integration
Azure networking and governance capabilities
Monitoring, diagnostics, and cost management integration
Azure support and service availability commitments
Because of these additional platform services, pricing for third-party models hosted through Azure AI Foundry may differ from the pricing published by the original model provider.
At this time, we cannot definitively confirm that the approximately 4.5x difference you are observing is expected for your specific deployment.
The publicly available documentation does not provide:
A documented pricing multiplier between DeepSeek's native API pricing and Azure-hosted DeepSeek pricing.
Details regarding any Azure-specific markup or cost adjustments.
Information on whether token accounting differs from DeepSeek's direct platform.
Clarification on cache-hit versus cache-miss billing behavior within Azure AI Foundry.
For this reason, while a pricing difference between Azure AI Foundry and DeepSeek's direct API pricing is generally expected, we cannot validate the precise multiplier based solely on the information currently available.
Additional factors that may affect the comparison
When comparing Azure charges against DeepSeek's published pricing, it is important to verify that both calculations are based on equivalent metrics, including:
Cache-hit versus cache-miss token usage
Input versus output token accounting
Deployment type (Serverless, Managed, Provisioned, etc.)
Azure region hosting the deployment
Currency conversion methodology
Azure billing agreement or negotiated pricing
Cost Management aggregation and reporting intervals
For example, DeepSeek publishes separate rates for cached and non-cached requests. If Azure AI Foundry does not expose the same cache-based billing model for this deployment type, a direct comparison may not be equivalent.
Please refer this
Azure AI Foundry Pricing: https://azure.microsoft.com/pricing/details/ai-foundry/
DeepSeek in Azure AI Foundry: https://learn.microsoft.com/azure/ai-foundry/how-to/deploy-models-deepseek
I Hope this helps. Do let me know if you have any further queries.
If this answers your query, please do click Accept Answer and Yes for was this answer helpful.
Thank you!