Welcome to Continuous Profiling
Applications often behave unpredictably in production due to factors like unexpected user requests, configuration changes, security constraints, traffic spikes, and edge cases. These variations can cause excessive CPU and memory usage, increasing costs and degrading user experience. While external metrics such as CPU load, memory consumption, request volume, and latency provide some visibility, they do not reveal what is happening inside the code.
Continuous Profiling bridges this gap by offering real-time insights into code execution, helping identify resource-heavy and time-consuming operations. By continuously analyzing performance in any environment—especially in production, where real-world conditions are challenging to replicate—profiling enables developers to detect inefficiencies, optimize code, reduce infrastructure expenses, and enhance application performance.
What is Continuous Profiling?
Continuous Profiling is a method for analyzing how software behaves over time by collecting data on its execution. This includes measuring function execution times, memory and CPU usage, and other system resource consumption, along with relevant metadata.
Leveraging OpenTelemetry’s profile signal, Continuous Profiling enables teams to correlate resource inefficiencies and performance issues across services and pods rather than within a single one, offering granular insight into application behavior.
Why use it?
Traditional profiling tools are often used during development to optimize performance, but they come with challenges that make them impractical in production:
- High resource consumption and performance impact due to instrumentation
- The need for service restarts, causing disruptions
- Limited visibility into third-party libraries
In contrast, Continuous Profiling runs in the background with minimal overhead, delivering real-time insights without requiring issue replication in test environments. This helps SREs, DevOps teams, and developers better understand how code impacts performance and infrastructure costs.
With Continuous Profiling, teams can:
- Capture code-level performance data automatically, without manual instrumentation
- Detect inefficient code paths by storing, querying, and analyzing profiling data over time
- Pinpoint performance bottlenecks down to the exact function or line of code causing inefficiencies
- Gain deep visibility into application runtime behavior
Continuous Profiling with Coralogix
The Coralogix Continuous Profiling agent is a lightweight, always-on tool that delivers deep visibility into code execution with minimal system overhead. It identifies which functions consume the most CPU time, enabling developers to enhance performance and lower infrastructure costs.
The feature is integrated into Coralogix’s Application Performance Monitoring (APM) and complements Infrastructure Explorer, Distributed Tracing, and Real User Monitoring, among other capabilities.
Supported runtimes and languages
Coralogix Continuous Profiling works across various programming languages, including:
- C/C++
- Rust
- Zig
- Go
- Java
- Python
- Ruby
- PHP
- Node.js / V8
- Perl
- .NET
By providing continuous, low-overhead profiling, this approach enables teams to optimize performance proactively without disrupting services.
Beyond tracing
The key distinction between tracing and profiling is that tracing identifies which service-level requests are slow, while profiling reveals why they are slow at the code level. Tracing helps track request latency across services, showing how requests flow through different components. Profiling goes deeper, providing visibility into resource usage at the code level.
Tracing captures when a method starts and stops, offering insights into request execution time. However, it doesn’t measure how much CPU or memory a request consumes. This is where continuous profiling adds value—it periodically samples system resource usage at runtime, capturing 20 stack traces per second (20 Hz) and reporting collected data every 5 seconds. This allows teams to analyze performance continuously without replicating issues in test environments.
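To make the sampling model concrete, here is a minimal, illustrative sketch in Python of how a sampling profiler works in principle: it snapshots stack traces at a fixed rate and periodically flushes the aggregated counts. The constants and function names are hypothetical, and this is not how the Coralogix agent is implemented; it only models the 20 Hz sampling and 5-second reporting rhythm described above.

```python
import collections
import sys
import threading
import time
import traceback

SAMPLE_HZ = 20          # 20 stack snapshots per second, as described above
REPORT_INTERVAL = 5.0   # flush aggregated samples every 5 seconds

def sampling_profiler(stop_event: threading.Event) -> None:
    """Toy in-process sampler: illustrates the concept, not the Coralogix agent."""
    samples = collections.Counter()
    next_report = time.monotonic() + REPORT_INTERVAL
    while not stop_event.is_set():
        # Snapshot the stack of every application thread.
        for thread_id, frame in sys._current_frames().items():
            if thread_id == threading.get_ident():
                continue  # skip the profiler's own thread
            # Collapse each stack into root-to-leaf "a;b;c" form (flame-graph style).
            frames = [f.f_code.co_name for f, _ in traceback.walk_stack(frame)]
            samples[";".join(reversed(frames))] += 1
        if time.monotonic() >= next_report:
            # Report the hottest stacks seen in the last interval, then reset.
            for stack, count in samples.most_common(3):
                print(f"{count:4d} samples  {stack}")
            samples.clear()
            next_report = time.monotonic() + REPORT_INTERVAL
        time.sleep(1.0 / SAMPLE_HZ)

# Usage: run the sampler in a daemon thread alongside the application code.
stop = threading.Event()
threading.Thread(target=sampling_profiler, args=(stop,), daemon=True).start()
```

Because the profiler only samples, stacks that consume more CPU time appear in more snapshots, so the aggregated counts approximate where CPU time is actually spent without instrumenting every function call.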
Example: Debugging slow responses
Imagine a service called book-catalog-api, responsible for retrieving book details from a database. Users report slow response times when searching for books or fetching metadata. To optimize performance, the development team uses Coralogix’s APM and Distributed Tracing to investigate.
Tracing might reveal that a request takes longer than expected due to multiple database queries. However, if a trace shows a delay without obvious bottlenecks—such as missing spans or unexplained latency—profiling provides deeper insights. By analyzing stack traces and resource usage, continuous profiling can uncover issues like CPU-intensive operations, memory inefficiencies, or inefficient loops in the code.
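As an illustration, consider a hypothetical handler inside book-catalog-api (the function below is invented for this example). A trace would show a single span covering the whole call with no obvious bottleneck, while a CPU profile would attribute most of the sampled stack traces to the nested loop:

```python
# Hypothetical code from book-catalog-api, invented for illustration only.
def search_books(catalog: list[dict], query: str) -> list[dict]:
    """Naive title search: one opaque span in a trace, a hot frame in a CPU profile."""
    results = []
    tokens = [t.lower() for t in query.split()]
    for book in catalog:                  # linear scan over the entire catalog...
        title = book["title"].lower()     # ...re-lowercasing every title per request
        for token in tokens:              # ...for every query token
            if token in title:
                results.append(book)
                break
    return results
```

A CPU profile would surface search_books and its string operations as hot frames, pointing the team toward pre-normalizing or indexing titles rather than toward the database.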
By combining tracing and profiling, teams gain both high-level visibility into request flows and granular insights into resource consumption, enabling more effective performance optimizations.
| | APM | Continuous Profiling |
|---|---|---|
| Scope of visibility | Tracks service-level interactions and third-party API calls | Provides in-depth analysis of all code execution, including methods |
| Measurement focus | Monitors request patterns, error rates, and latency | Evaluates CPU resource usage |
| Visualization (flame/icicle graphs) | Displays time spent on execution paths across multiple services, highlighting latency and errors | Shows a detailed breakdown of resource consumption per minute, categorized by method |
Visualizing profiles to monitor CPU consumption
Within the APM Service Catalog drill-down, users can seamlessly view the profiles for a particular service. The Profiles UI provides a detailed view of resource consumption, highlighting which methods contribute most to performance issues. Different profile types are available depending on the runtime and programming language. Initially, the focus is on CPU time, the time the CPU spends actively executing a method’s code rather than waiting on I/O or other resources.
Resources
This guide provides a comprehensive walkthrough of Continuous Profiling.
Here's what you'll find: