CosmosDB vCore M10 - Node storage 42GB with only ~300MB actual MongoDB data, growing daily — cannot inspect internals

Dipesh Khaiju 40 Reputation points
2026-06-08T04:40:39.85+00:00

Environment:

  • Service: Azure Cosmos DB for MongoDB (vCore)
  • Tier: M10, currently on 64GB storage allocation
  • Region: South East Asia
  • Single shard cluster

Problem:

Azure Metrics → Storage Used → Split by ServerName shows ~42GB used on the primary node, growing approximately 500 MB per day with no significant changes to application data.

Actual MongoDB data verified via db.stats() and per-collection stats in MongoDB Compass:

Database 1 (main application DB): ~300MB across 14 collections

Total logical MongoDB data: ~300MB

Azure node-level storage reported: ~42GB

Unexplained gap: ~41.7GB and growing daily
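
For reference, the gap and the overhead multiple follow directly from the two figures above. A minimal sketch in plain JavaScript (runnable in Node.js or pasted into mongosh); the helper name storageGap is illustrative, and decimal units (1 GB = 1000 MB) are an assumption:

```javascript
// Illustrative helper, not part of any Azure or MongoDB API.
// Assumes decimal units: 1 GB = 1000 MB.
function storageGap(nodeUsedGb, logicalMb) {
  const nodeUsedMb = nodeUsedGb * 1000;
  return {
    gapGb: (nodeUsedMb - logicalMb) / 1000, // storage not explained by logical data
    ratio: nodeUsedMb / logicalMb,          // node usage as a multiple of logical data
  };
}

// Figures from this post: ~42 GB at node level, ~300 MB logical data.
const gapResult = storageGap(42, 300);
console.log(gapResult.gapGb.toFixed(1) + " GB unexplained, " + gapResult.ratio.toFixed(0) + "x logical size");
```

With these inputs the gap is about 41.7 GB and the node-level figure is about 140 times the logical data size.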

Collection breakdown (anonymized):

  • Largest collection: ~106MB, ~11,600 docs
  • Second largest: ~58MB, ~1,300 docs
  • Third largest: ~40MB, ~1,300 docs
  • Remaining 11: ~100MB combined
  • No GridFS collections (fs.files / fs.chunks are empty)
  • No logging collections
  • No binary/file data stored in documents

History:

  • Storage was 30GB when we first noticed, grew to 38GB, now 42GB.
  • Hit OutOfDiskSpace error (code 14031) despite only ~300MB MongoDB data at the time
  • Upgraded storage to 64GB as emergency fix — cluster recovered immediately
  • Storage has continued growing since: 30GB → 38GB → 42GB over approximately 2-3 weeks
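
To put the growth rate in context, here is a rough runway estimate. It is a sketch only: it assumes growth stays linear at the ~500 MB/day observed so far, uses decimal units, and ignores any threshold below 100% at which the service might start rejecting writes. The function name daysUntilFull is illustrative.

```javascript
// Illustrative helper: days until the allocation is exhausted,
// assuming constant (linear) growth. Decimal units: 1 GB = 1000 MB.
function daysUntilFull(allocatedGb, usedGb, growthMbPerDay) {
  if (growthMbPerDay <= 0) return Infinity; // no growth means no exhaustion
  const remainingMb = (allocatedGb - usedGb) * 1000;
  return Math.max(0, remainingMb / growthMbPerDay);
}

// Figures from this post: 64 GB allocated, ~42 GB used, ~500 MB/day growth.
console.log(daysUntilFull(64, 42, 500) + " days of headroom at the current rate");
```

At the current rate this works out to roughly 44 days before the 64GB allocation is full.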

Diagnostics attempted:

  • db.stats() and per-collection stats → confirms ~300MB real data
  • db.compact() on all collections → no effect on Azure storage
  • Checked all databases including local → empty
  • Checked oplog → empty
  • Audited application code → no accidental writes, no file storage in MongoDB, no logging to DB
  • Dropped unused empty databases → no effect on storage counter

Commands blocked on vCore (expected but limits diagnosis):

  • db.serverStatus() → MongoServerError: CommandNotSupported
  • Cannot inspect WiredTiger cache, journal stats, or checkpoint internals

Questions:

  1. What is consuming ~41.7GB at the node level when MongoDB logical data is only ~300MB?
  2. Microsoft docs confirm vCore storage includes "database files, temporary files, transaction logs, and database server logs" — is this normal for these components to consume 140x the actual data size?
  3. Is this WiredTiger journal/checkpoint accumulation that failed to clean up? If so, can it be cleared from the backend?
  4. Will this continue growing indefinitely or will it stabilize at some point?
  5. Since storage can only be increased (not decreased) on vCore, is migrating to a fresh cluster the only way to reclaim this space? Or can Azure engineering intervene on the existing node?
  6. Could the same issue recur on a fresh cluster given the same workload?

User's image

User's image

Azure Cosmos DB

Answer accepted by question author

Pilladi Padma Sai Manisha 10,770 Reputation points Microsoft External Staff Moderator
2026-06-10T05:37:22.8133333+00:00

Hi Dipesh,

Thank you for providing the detailed investigation and screenshots.

Based on the information shared, there is a noticeable difference between the logical MongoDB data size (~300 MB reported through db.stats() and collection statistics) and the node-level storage consumption (~42 GB) reported by Azure Metrics.

In Azure Cosmos DB for MongoDB (vCore), the Storage Used metric reflects overall storage consumption at the node level and is not limited to collection data alone. As documented, storage usage can include database files, temporary files, transaction logs, server logs, and other service-managed components in addition to the logical data stored in collections.

Because Cosmos DB for MongoDB (vCore) is a managed service, low-level storage engine diagnostics such as serverStatus() are not exposed. As a result, it is not possible from the customer side to determine the exact breakdown of the additional storage consumption.

The key observation here is that storage usage continues to grow over time despite the logical data size remaining relatively small and after maintenance operations such as compact(). This behavior warrants further investigation to determine whether the growth is related to internal storage allocation, retained service-managed files, or another backend condition.

To help narrow this down, could you please share the following details?

  • Is the storage growth observed only on the primary node, or across all nodes/replicas?
  • Approximately when did the OutOfDiskSpace (14031) error first occur?
  • Are there any recent scaling operations, failovers, maintenance events, or workload changes that coincide with the start of the growth trend?
  • What are the current values for Storage Percent and IOPS during the period of growth?

If the growth continues and cannot be correlated with application data growth, I would recommend opening a support request so the engineering team can review the backend storage allocation and identify the specific components contributing to the node-level disk usage.

I look forward to your update.

Answer accepted by question author

Amira Bedhiafi 43,036 Reputation points MVP Volunteer Moderator
2026-06-08T20:45:58.0333333+00:00

Hello Dipesh!

Thank you for posting on MS Learn Q&A.

Some difference between MongoDB logical data and Azure node-level storage is expected, because vCore storage holds more than collection data. The provisioned storage is used for database files, temporary files, transaction logs, and database server logs. However, a jump from around 300–350 MB of visible MongoDB data to 42 GB of node storage, growing by about 500 MB/day, is too large to treat as normal overhead without backend investigation.

https://learn.microsoft.com/en-us/azure/documentdb/compute-storage

The key point is that db.stats(), collection stats, and listDatabases.totalSize only prove that the visible MongoDB databases and collections are small. In your screenshot, totalSize: 370783023 is only around 354 MB, which suggests that the 42 GB lies outside the visible collection data and index footprint, or that it reflects a node-level metric or accounting issue.
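
As a quick check of that conversion, here is a small sketch in plain JavaScript (Node.js or mongosh). The helper name bytesToUnits is illustrative; the only input taken from the thread is the totalSize value from the screenshot.

```javascript
// Illustrative helper: express a byte count in binary (MiB) and decimal (MB) units.
function bytesToUnits(bytes) {
  return {
    mib: bytes / (1024 * 1024), // binary mebibytes
    mb: bytes / 1e6,            // decimal megabytes
  };
}

// totalSize reported by listDatabases in the screenshot.
const totalSizeUnits = bytesToUnits(370783023);
console.log(totalSizeUnits.mib.toFixed(1) + " MiB, " + totalSizeUnits.mb.toFixed(1) + " MB");
```

The value is about 353.6 MiB (about 370.8 MB in decimal units), so either way it is well under 1% of the 42 GB reported at node level.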

The fact that db.compact() did not change the Azure metric is also significant. The compact command targets collection and index storage, releasing obsolete blocks back to the OS when possible. It does not clean platform server logs, transaction logs, temporary files, or service-managed internal files.

Compaction effectiveness also depends on how much releasable space exists and where it sits in the data files.

https://www.mongodb.com/docs/manual/reference/command/compact/
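
For reference, compact is issued per collection through runCommand rather than as a database-level helper. The following is a minimal sketch, assuming a mongosh session connected to the cluster and assuming the service accepts the command; compactAllCollections is a hypothetical helper name, and the shape of the reply may differ on vCore.

```javascript
// Hypothetical helper for mongosh: run the compact command on every
// collection of the given database handle and collect the replies.
// Assumes `db` exposes getCollectionNames() and runCommand(), as the
// mongosh database object does.
function compactAllCollections(db) {
  const results = {};
  for (const name of db.getCollectionNames()) {
    try {
      results[name] = db.runCommand({ compact: name });
    } catch (err) {
      // Record the failure instead of aborting, e.g. if the command is rejected.
      results[name] = { ok: 0, error: err.message };
    }
  }
  return results;
}

// In mongosh, after switching to the target database:
//   printjson(compactAllCollections(db));
```

Even when every collection reports success, this only touches collection and index storage, so an unchanged node-level metric is consistent with the space being held elsewhere.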

I recommend opening an Azure support ticket.
