Which Azure Metrics Should Be Monitored for Azure Functions and Azure SQL Database to Prepare for Increased Traffic?

Dewa Tsukasa 100 Reputation points
2026-06-29T02:11:22.8733333+00:00

Hello,

We are operating a system built with Azure Functions and Azure SQL Database, and we expect traffic to increase in the near future.

I would like to understand which metrics should be monitored in order to determine whether resource limits are being approached as traffic increases.

Currently, we are monitoring the following metrics.

Azure Functions

  • Requests
  • Average memory working set
  • Connections

Azure SQL Database

  • SQL instance CPU percent
  • SQL instance memory percent
  • SQL Server process core percent
  • SQL Server process memory percent

I understand that each resource has service limits and scaling characteristics.

For reference:

My current assumption is that monitoring the metrics above should allow us to identify whether increased traffic is approaching resource allocation limits.

However, I would appreciate advice on whether there are additional metrics that should also be monitored to proactively detect bottlenecks or capacity issues.

Our current resource specifications are:

Azure Functions

  • Plan Type: Consumption (Windows)
  • Runtime Stack: .NET 8.0

Azure SQL Database

  • Pricing Tier: General Purpose – Serverless
  • Compute: Standard-series (Gen5), 4 vCores

Are there any recommended metrics, alerts, or monitoring practices for this configuration?

Thank you in advance for your guidance.

Azure Monitor

An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.


Answer accepted by question author

Bharath Y P 10,165 Reputation points Microsoft External Staff Moderator
2026-06-29T13:23:33.44+00:00

Hello Dewa,

You are currently running Azure Functions on the Consumption plan (.NET 8) and Azure SQL Database on General Purpose – Serverless (4 vCores), and you have already configured baseline monitoring (requests, memory, CPU, connections). Your goal is to ensure that, as traffic increases, you can:

  • Detect approaching resource limits early
  • Identify performance bottlenecks before failures
  • Monitor scaling behavior and saturation signals

The metrics that actually warn you that you are approaching a limit are mostly utilization-percentage-versus-limit and scaling signals, which your current list is missing. I'd add:

Azure Functions (depends on your plan — Consumption/Flex/Premium scale differently):

  • Instance count (Flex: Automatic Scaling Instance Count) vs your plan's max scale-out — the real ceiling.
  • Connections vs the per-instance SNAT/connection limit (outbound SNAT port exhaustion is a classic scale failure).
  • Http5xx / FunctionErrors, Response Time, CPU Percentage, Thread pool queue length.
  • Enable Application Insights for per-function duration/dependencies/exceptions, and use Dashboards with Grafana under Monitoring. Ref: https://learn.microsoft.com/azure/azure-functions/monitor-functions-reference#metrics

Azure SQL Database (which metrics apply depends on whether the database uses the DTU or the vCore purchasing model). Your CPU/memory metrics miss the IO/log/worker/session ceilings that usually hit first:

  • DTU percentage (the maximum of CPU, Data IO, and Log IO; the best single signal, but only on the DTU model, so on your vCore database watch those three individually), Data IO percentage, Log IO percentage (write-heavy bottleneck), Workers percentage, Sessions percentage (very relevant as Functions scale out and open connections), Data space used percentage.
  • Alert when CPU, DTU, or Log IO exceeds 80%, and when Deadlocks > 1.
  • For capacity-fit decisions, query sys.resource_stats / sys.dm_db_resource_stats (avg/max CPU, IO, log, workers, sessions vs tier) and use Query Performance Insight / Database Watcher. Ref: https://learn.microsoft.com/azure/azure-sql/database/monitoring-metrics-alerts
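As a rough illustration of the capacity-fit check, the sketch below summarizes rows that have already been fetched from sys.dm_db_resource_stats and reports which dimension peaked closest to its limit. The column names follow that view; the sample rows are hypothetical values, not real measurements, and fetching the rows (for example through a database driver) is left out.

```python
from typing import Iterable, Mapping

# Percentage columns exposed by sys.dm_db_resource_stats.
DIMENSIONS = (
    "avg_cpu_percent",
    "avg_data_io_percent",
    "avg_log_write_percent",
    "max_worker_percent",
    "max_session_percent",
)


def peak_utilization(rows: Iterable[Mapping[str, float]]) -> dict[str, float]:
    """Return the highest observed value for each resource dimension."""
    peaks = {name: 0.0 for name in DIMENSIONS}
    for row in rows:
        for name in DIMENSIONS:
            peaks[name] = max(peaks[name], float(row.get(name, 0.0)))
    return peaks


def binding_dimension(peaks: Mapping[str, float]) -> str:
    """Name the dimension with the highest peak (the likely first bottleneck)."""
    return max(peaks, key=lambda name: peaks[name])


# Hypothetical sample rows for illustration only.
sample_rows = [
    {"avg_cpu_percent": 35.0, "avg_data_io_percent": 20.0,
     "avg_log_write_percent": 62.0, "max_worker_percent": 12.0,
     "max_session_percent": 3.0},
    {"avg_cpu_percent": 48.0, "avg_data_io_percent": 25.0,
     "avg_log_write_percent": 88.0, "max_worker_percent": 30.0,
     "max_session_percent": 5.0},
]

peaks = peak_utilization(sample_rows)
print(binding_dimension(peaks), peaks)
```

In this made-up sample, log writes are the binding dimension even though CPU looks comfortable, which is the pattern the bullets above warn about.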

Best practice before the traffic increase: set alerts at ~75–80% of each limit for lead time, then run a load test while watching the Functions instance count versus its maximum, connections, and the SQL CPU (or DTU), Log IO, Workers, and Sessions percentages.

Your current monitoring configuration provides a strong baseline, but to proactively detect scaling and capacity issues, it is critical to:

  • Add latency, error rate, and scaling metrics for Azure Functions
  • Extend SQL monitoring to include IO, log, sessions, and worker limits
  • Enable Application Insights for end-to-end observability

References:

  • https://learn.microsoft.com/azure/azure-functions/monitor-functions
  • https://learn.microsoft.com/azure/azure-functions/monitor-functions-reference#metrics
  • https://learn.microsoft.com/azure/azure-sql/database/monitoring-metrics-alerts?view=azuresql
  • https://learn.microsoft.com/azure/azure-sql/database/serverless-tier-overview?view=azuresql#monitor

Hope this helps! Thanks.


1 person found this answer helpful.

Answer accepted by question author

Megha Ramakrishnan 325 Reputation points
2026-06-29T03:32:19.1466667+00:00

Hi @Dewa Tsukasa,

Below is a list of recommended metrics and alerts you could configure for Azure Functions and Azure SQL Database.

  1. Azure Functions (Consumption Plan, .NET 8.0)

In a Consumption plan, Azure manages CPU and memory allocation by automatically adding or removing instances. Your bottleneck will rarely be an individual instance running out of CPU; instead, it will be the speed of scaling, connection exhaustion, or execution time boundaries.

Metrics to Monitor:

  1. Function Execution Count & Function Execution Units: The Consumption plan caps how far a function app can scale out (200 instances per app on Windows), so rising execution counts show how quickly you are moving toward that ceiling. Monitoring execution units (measured in MB-milliseconds) helps track resource consumption velocity.
  2. Function Execution Time / Duration: The Consumption plan has a strict maximum timeout (default 5 minutes, can be extended to 10). If increased database load causes functions to slow down, they may hit this hard limit and terminate abruptly.
  3. HTTP 429 (Too Many Requests) / HTTP 503 (Service Unavailable): If traffic spikes faster than the Azure Functions scale controller can allocate new workers, Azure will temporarily throttle incoming requests.
  4. Dependency Duration (via Application Insights): Essential for tracking how long your .NET 8 code spends waiting on Azure SQL queries. An increase here directly impacts function concurrency and execution time.

Alerts to Configure:

  1. Function Execution Time (Average/Max): Set a warning alert if the 95th percentile approaches 4 minutes.
  2. HTTP Server Errors (5xx) / Throttling (429): Alert immediately if these metrics spike, as it indicates the scale controller cannot keep pace with incoming volume.
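The duration alert above can be sanity-checked offline against exported execution times. This is a minimal sketch, assuming durations in seconds, the default 5-minute timeout, and the 4-minute warning level suggested above; the function names and the sample durations are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence (pct in 0-100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Nearest-rank method: smallest rank covering pct percent of samples.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def timeout_headroom(durations_s, timeout_s=300.0, warn_s=240.0):
    """Compare the p95 duration with the warning level and the hard timeout."""
    p95 = percentile(durations_s, 95)
    return {
        "p95_s": p95,
        "headroom_s": timeout_s - p95,
        "warn": p95 >= warn_s,
    }


# Hypothetical execution durations in seconds.
durations = [12, 15, 18, 20, 22, 25, 30, 45, 200, 250]
result = timeout_headroom(durations)
print(result)
```

With only ten samples, the nearest-rank p95 is simply the slowest execution, so a real check would want a larger window of data.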
  2. Azure SQL Database (Serverless, General Purpose, 4 vCores)

Serverless databases scale vCores automatically based on workload demand. The main risks here are reaching the maximum configured vCores (4 vCores), I/O limitations inherent to the General Purpose tier, and connection limits.

Metrics to Monitor:

  1. App CPU Percentage (app_cpu_percent): This metric reflects the percentage of compute used relative to the maximum vCores configured (4 in your case). If it reaches 100%, the database cannot scale any higher, and queries will queue.
  2. Data IO Percentage & Log IO Percentage: The General Purpose tier uses Azure Premium remote storage. As traffic grows, you are much more likely to hit data I/O throughput limits (IOPS) or transaction log write limits (log_write_percent) before you hit CPU limits. High I/O latency will slow down your functions significantly.
  3. Sessions Percent / Workers Percent: Every Azure SQL tier has a hard cap on concurrent worker threads and sessions. For a 4 vCore General Purpose database, I estimate the worker limit at roughly 180–200 concurrent workers, but please confirm the exact figure in the documented vCore resource limits for your configuration. If your Azure Functions scale out to dozens of instances simultaneously, they can easily exhaust the database worker pool, resulting in connection timeouts.
  4. Blocked by Firewall / Connection Failures: Tracks failed attempts to reach the database, which helps distinguish between application-level timeouts and database-level capacity rejections.
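To make the worker-exhaustion risk concrete, here is a back-of-the-envelope sketch. It assumes, as a simplification, that each open connection running a query occupies one worker (parallel queries can use more), and it uses 200 workers, the upper end of the estimate quoted above, purely as a placeholder. The function name and the per-instance connection count are hypothetical.

```python
def max_safe_instances(worker_limit, connections_per_instance, alert_percent=80):
    """Largest instance count whose worst-case concurrent connections
    stay at or below alert_percent of the database worker limit."""
    if worker_limit <= 0 or connections_per_instance <= 0:
        raise ValueError("limits must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    worker_budget = worker_limit * alert_percent // 100
    return worker_budget // connections_per_instance


# Placeholder inputs: 200 workers, 10 concurrent connections per instance.
safe = max_safe_instances(worker_limit=200, connections_per_instance=10)
print(safe)  # 16
```

Under these assumed numbers, a scale-out beyond 16 instances could push Workers Percent past 80%, which is why the metric matters well before the Functions side shows any strain.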

Alerts to Configure:

  1. App CPU Percentage > 80%: Indicates you are consistently hovering near your maximum 4 vCore allocation during peak traffic.
  2. Data/Log IO Percentage > 80%: A leading indicator that storage throughput is bottlenecking your application performance.
  3. Workers Percentage > 80%: Signifies that the database is running out of threads to execute concurrent queries, a common issue when serverless functions scale out aggressively.
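Because these alerts are about a metric staying near its limit rather than a single spike, one way to reason about them is as a sustained-breach check. The sketch below is an illustration only; the function name, the window of three consecutive samples, and the sample values are assumptions, not Azure Monitor settings.

```python
def sustained_breach(samples, threshold=80.0, min_consecutive=3):
    """Return True if at least min_consecutive back-to-back samples
    exceed the threshold."""
    run = 0
    for value in samples:
        # Extend the current run on a breach, otherwise reset it.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical per-minute App CPU Percentage samples.
spiky = [70, 85, 90, 60, 82]
sustained = [70, 85, 90, 95, 60]

print(sustained_breach(spiky))      # False
print(sustained_breach(sustained))  # True
```

Requiring several consecutive samples trades a little detection delay for fewer false alarms from short bursts.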

Please 'Upvote' (thumbs-up) and 'Accept' as answer if the reply was helpful. This will benefit other community members who face the same issue.

Thank you!


1 person found this answer helpful.
