Service Catalog

Service Catalog offers a centralized, data-rich resource for managing and optimizing your system's services. It provides a holistic view of service health, enabling precise decision-making and faster issue resolution, ultimately improving the performance and reliability of your entire system.

Overview

Service Catalog provides a complete list of services you have in your system, displaying the health status of each service. The catalog shows service type, number of requests received by the service, and error rate and latency for received requests.

Use the search bar to search by service name or any other parameter.
Select the timeframe for which you want to view your services.
Use dimensions to filter your services. A dimension adds a new label to a metric, so you can filter the services shown according to the tags you define.
Toggle between the Health view and Profiles view to identify service health and gain profiling insights.

Prerequisites

Coralogix Application Performance Monitoring (APM) installed and configured.

Access Service Catalog

In your Coralogix toolbar, navigate to APM. Select the Service Catalog tab.
Select the timeframe for which you want to view information.
Select a service to view the service drilldown.

Filter services using dimensions

Creating a dimension involves adding a new label to a metric, allowing you to filter the services shown using the tags you define.

Select Edit on the upper toolbar of the Service Catalog tab.
Enter a filter name and select a span tag from the dropdown menu to pair the data source with the dimension, as shown below.
To add additional filters, select Add Dimension and repeat Step 2.
Select Save.
Once you create one or more dimensions, the dimensions bar appears above the Service Catalog. To filter the services using a specific dimension, select one from the dimensions bar at the top.

Note

When you open a specific service with dimensions selected, the service drilldown remains filtered by the selected dimension.

Limitations

Dimensions create metrics from spans and are therefore considered part of your quota. Use a maximum of 5 dimensions, each of which can filter up to 10k labels (cardinality).
Only team admins have permission to create dimensions.
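
The limits above can be checked against sample span data before you create dimensions. The following is a minimal sketch, assuming spans are available as plain dictionaries with a hypothetical `tags` mapping; the function name and data shape are illustrative assumptions and not part of any Coralogix API.

```python
MAX_DIMENSIONS = 5         # maximum number of dimensions, as stated above
MAX_CARDINALITY = 10_000   # maximum distinct label values per dimension, as stated above

def check_dimensions(spans, tag_names):
    """Return a list of problems; an empty list means the tags fit the limits.

    `spans` is assumed to be a list of dicts shaped like {"tags": {...}}.
    """
    problems = []
    if len(tag_names) > MAX_DIMENSIONS:
        problems.append(
            f"{len(tag_names)} dimensions requested, maximum is {MAX_DIMENSIONS}"
        )
    for tag in tag_names:
        # Collect the distinct values this tag takes across all spans.
        distinct = {s["tags"][tag] for s in spans if tag in s.get("tags", {})}
        if len(distinct) > MAX_CARDINALITY:
            problems.append(
                f"tag '{tag}' has {len(distinct)} distinct values, "
                f"maximum is {MAX_CARDINALITY}"
            )
    return problems
```

In this sketch, a low-cardinality tag such as a region passes, while a tag such as a per-user identifier is likely to be flagged.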

Service drilldown

The Service drilldown displays more detailed information about the specific service selected.

The drilldown view displays the specific service you’re examining, along with all the information shown on the main service catalog page. It also provides visualizations and additional context that update dynamically based on the tab you’re viewing.

The service drilldown includes the following tabs:

Overview

The Overview tab gives a summary of the service.

Overview widgets

The widgets in this tab give you a broad overview of the service for the timeframe selected in the top bar. The Overview widgets include:

SLO: An overview of the current SLOs, how many are okay, how many are breached, and how many are not available.
Incidents: Displays triggered alert events within the service.
Average latency: Shows the average latency for the current service.
Requests per minute: Shows average requests per minute for the selected time frame.
Error rate: Displays the percentage of errors in relation to the total number of requests.
Transactions: Shows a graph with the number of successes and errors for the transactions.
Errors: Shows a graph with the number of errors in the top five transactions or versions.
Apdex score: Displays the Apdex (Application Performance Index) score over the selected timeframe. The Apdex score is a standardized metric used to measure and quantify user satisfaction with the response time of software applications.
Top 5 time consuming transactions: Shows the five transactions with the highest time consumption, either over time or in total.
Latency: Shows a graph with the service’s P99, P75, P50, and average latency.

Note
Latency percentiles are calculated using the histogram_quantile() function, which is commonly utilized in systems like Prometheus to compute quantiles (e.g., the 95th percentile) from histogram data. In Coralogix APM, with Event2Metrics, a predefined set of buckets (all in microseconds) is used for this calculation. These buckets include: 1, 2.5, 5, 7.5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10,000, 25,000, 50,000, 75,000, 100,000, 250,000, 500,000, 750,000, 1,000,000, 2,500,000, 5,000,000, 7,500,000, and 10,000,000.
Map: Shows a mini version of the service map.
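
To illustrate the latency percentile calculation described in the note above, the following sketch estimates a quantile from bucketed durations using the linear interpolation that Prometheus documents for histogram_quantile(). It is a simplified, standalone approximation rather than the Coralogix implementation; the helper names and the sample data are assumptions.

```python
import math

# Bucket upper bounds in microseconds, matching the list in the note above:
# 1, 2.5, 5, 7.5 repeated per decade up to 7,500,000, then 10,000,000.
BUCKETS_US = [m * 10**e for e in range(7) for m in (1, 2.5, 5, 7.5)] + [10_000_000]

def cumulative_counts(durations_us, bounds=BUCKETS_US):
    """Count observations at or below each upper bound, plus a final +Inf bucket."""
    counts = [sum(1 for d in durations_us if d <= b) for b in bounds]
    counts.append(len(durations_us))  # +Inf bucket holds every observation
    return counts

def histogram_quantile(q, cumulative, bounds=BUCKETS_US):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative bucket counts."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = cumulative[-1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    i = next(idx for idx, c in enumerate(cumulative) if c >= rank)
    if i == len(bounds):
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return bounds[-1]
    lower = bounds[i - 1] if i > 0 else 0.0
    below = cumulative[i - 1] if i > 0 else 0
    in_bucket = cumulative[i] - below
    if in_bucket == 0:
        return lower
    # Assume observations are spread evenly inside the bucket.
    return lower + (bounds[i] - lower) * (rank - below) / in_bucket

# Hypothetical sample: 50 requests of 100 µs and 50 requests of 1000 µs.
sample = cumulative_counts([100] * 50 + [1000] * 50)
p50 = histogram_quantile(0.50, sample)
p99 = histogram_quantile(0.99, sample)
```

Because values are interpolated inside a bucket, the result is an estimate whose precision depends on the bucket width. In the sample above, the P99 estimate lands between 750 and 1000 µs even though every slow request took exactly 1000 µs.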

Manage widgets

To manage a widget, select the ellipsis. You can:

View Query to see the queries underlying it.
Create Alert to monitor service performance and notify you when there are changes.

Group metrics per service version

You can monitor your service health by displaying metrics for each version of your service. Use this data to track changes resulting from version updates or from multiple service versions running in parallel. Visualize the changes in the Coralogix UI, and continue with further investigations, such as displaying related traces. Version data is available for the Transactions, Errors, Apdex, and Latency widgets. For details, see the Group by Service Version documentation.
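
As a rough illustration of what grouping by version means, the sketch below computes the error rate and Apdex score per service version from a list of request records. The record fields (`version`, `duration_ms`, `error`) and the 500 ms threshold are hypothetical. The Apdex formula is the standard one, (satisfied + tolerating / 2) / total, where tolerating requests take between T and 4T; the threshold Coralogix applies is not stated here.

```python
from collections import defaultdict

def apdex(durations_ms, threshold_ms):
    """Standard Apdex: (satisfied + tolerating / 2) / total, or None if empty."""
    if not durations_ms:
        return None
    satisfied = sum(1 for d in durations_ms if d <= threshold_ms)
    tolerating = sum(1 for d in durations_ms if threshold_ms < d <= 4 * threshold_ms)
    return (satisfied + tolerating / 2) / len(durations_ms)

def metrics_by_version(requests, threshold_ms=500):
    """Group request records by version and summarize each group."""
    grouped = defaultdict(list)
    for r in requests:
        grouped[r["version"]].append(r)
    summary = {}
    for version, items in grouped.items():
        errors = sum(1 for r in items if r["error"])
        summary[version] = {
            "requests": len(items),
            "error_rate_pct": 100 * errors / len(items),
            "apdex": apdex([r["duration_ms"] for r in items], threshold_ms),
        }
    return summary

# Hypothetical records for two versions running in parallel.
records = [
    {"version": "v1", "duration_ms": 100, "error": False},
    {"version": "v1", "duration_ms": 200, "error": False},
    {"version": "v1", "duration_ms": 600, "error": False},
    {"version": "v1", "duration_ms": 3000, "error": True},
    {"version": "v2", "duration_ms": 100, "error": False},
    {"version": "v2", "duration_ms": 100, "error": False},
]
by_version = metrics_by_version(records)
```

Comparing the two entries in `by_version` side by side is the kind of check the per-version widgets make possible after a rollout.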

Transactions

The Transactions tab allows you to rapidly investigate the impact radius of different services in your system over time.

Use it to:

Investigate the performance of each transaction by breaking it down into its constituent operations.
Gain a granular understanding of how each segment, a collection of related operations, affects the performance of the entire transaction over time.
Rapidly identify and troubleshoot the segments causing performance issues over time.

Operations

The Operations tab presents incoming, outgoing, and internal requests for your service. Select which request type you would like to view in the dropdown menu in the upper right-hand corner.

At the top of the page, three charts display the service operations for each of the following:

Time Consumption
Throughput
Error Rate

Incoming requests

View the requests the service receives, in the form of server and consumer spans.

For each operation, view the operation type, method, time consumed, percentage of errors caused by the operation, and the operation's share of the total number of operations. These are all shown for the selected timeframe and dimensions.

View a deeper drilldown of each operation by clicking on an operation row or a series.

The deep drilldown shows the time when the operation occurred, the operation type, the service for which the operation was taken, the duration of the operation, and how many errors it generated. It also shows the Throughput, Error Rate, and Latency graphs for that specific operation.
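
The per-operation columns described above can be thought of as simple aggregations over spans. The sketch below is one possible reading, assuming hypothetical span records with `operation`, `duration_ms`, and `error` fields, and interpreting the error percentage as the share of an operation's own requests that failed. It is not the query Coralogix runs.

```python
from collections import defaultdict

def summarize_operations(spans):
    """Build one row per operation, sorted by total time consumed."""
    total = len(spans)
    groups = defaultdict(list)
    for span in spans:
        groups[span["operation"]].append(span)
    rows = []
    for operation, items in groups.items():
        failed = sum(1 for s in items if s["error"])
        rows.append({
            "operation": operation,
            "time_consumed_ms": sum(s["duration_ms"] for s in items),
            "error_pct": 100 * failed / len(items),   # failures within this operation
            "share_pct": 100 * len(items) / total,    # share of all operations
        })
    rows.sort(key=lambda row: row["time_consumed_ms"], reverse=True)
    return rows

# Hypothetical incoming spans for one service.
rows = summarize_operations([
    {"operation": "GET /cart", "duration_ms": 10, "error": False},
    {"operation": "GET /cart", "duration_ms": 20, "error": True},
    {"operation": "GET /cart", "duration_ms": 30, "error": False},
    {"operation": "POST /pay", "duration_ms": 100, "error": False},
])
```

Sorting by time consumed surfaces operations that are slow in aggregate, which can differ from the operations that are called most often.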

Outgoing requests

View operations that the service requested from other services, in the form of client and producer spans.

For each operation, you can view the operation type, method, P95 latency, percentage of total requests, percentage of errors caused by the operation, and the time consumed. These are all shown for the timeframe selected in the top bar.

You can see a deeper drilldown of each operation by clicking on an operation row.

Internal requests

View operations internal to the service with internal spans.

For each internal operation, you can view the operation type, method, P95 latency, percentage of total requests, percentage of errors caused by the operation, and the time consumed by the operation. These are all shown for the timeframe selected in the top bar.

You can see a deeper drilldown of each operation by clicking on an operation row.

Dependencies

The Dependencies tab allows you to monitor and analyze how your instrumented services interact with databases, external APIs, third-party libraries, and other microservices. By mapping these relationships, you gain actionable insights into how each dependency affects your application's performance.

API Errors

API Error Tracking simplifies debugging of backend services by assembling thousands of similar API errors into a single group.
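
How Coralogix groups errors is not described here. As a generic illustration of the idea, the sketch below collapses similar errors by building a fingerprint from the method, route, status code, and a normalized message in which variable parts (UUIDs and numbers) are replaced with placeholders. All field names and normalization rules are assumptions.

```python
import re
from collections import Counter

# Order matters: UUIDs are replaced before plain numbers.
_NORMALIZERS = [
    (re.compile(
        r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
        r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
    ), "<uuid>"),
    (re.compile(r"\d+"), "<num>"),
]

def fingerprint(error):
    """Return a hashable key that is identical for similar errors."""
    message = error["message"]
    for pattern, placeholder in _NORMALIZERS:
        message = pattern.sub(placeholder, message)
    return (error["method"], error["route"], error["status"], message)

def group_errors(errors):
    """Count errors per fingerprint."""
    return Counter(fingerprint(e) for e in errors)

# Hypothetical API errors: the first two differ only in the user ID.
groups = group_errors([
    {"method": "GET", "route": "/users/{id}", "status": 404,
     "message": "user 123 not found"},
    {"method": "GET", "route": "/users/{id}", "status": 404,
     "message": "user 456 not found"},
    {"method": "POST", "route": "/orders", "status": 500,
     "message": "order 3f2b8c1e-1234-4abc-9def-0123456789ab missing"},
])
```

With this approach, thousands of errors that differ only in identifiers end up under a single key, so each distinct failure mode can be reviewed once.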

Resources

The Resources tab provides a unified, searchable inventory of the service’s Kubernetes and cloud resources, grouped by type by default. It reflects the same layout and experience as the Infrastructure Explorer main view. You can view and filter infrastructure resources such as Deployments, Pods, Replica Sets, and Services in context, along with metadata like cloud, account, and environment. From here, select Go to Infrastructure Explorer to open the full infrastructure view for deeper analysis.

Logs

The Logs tab presents all related logs for the selected service. The following actions are enabled:

Open a new tab with the logs in your Coralogix Explore Screen by selecting Open Log Query.
Set up Correlation Mapping to allow your system to identify the fields in a log that are related to the service. The feature does this by mapping a single key to one or more replacement keys in the service’s logs.
Navigate to Setup Correlation on the right-hand side of the Logs tab.
Select the replacement logs key from the dropdown menu.
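
Conceptually, Correlation Mapping resolves which service a log belongs to by trying a key and then its replacement keys. The sketch below illustrates that idea only. It assumes flat log dictionaries, and the key names (`service.name`, `app`, `k8s.deployment`) are hypothetical examples rather than Coralogix defaults.

```python
def resolve_service(log, key, replacement_keys):
    """Return the first non-empty value found under the key or its replacements."""
    for candidate in (key, *replacement_keys):
        value = log.get(candidate)
        if value:
            return value
    return None

def logs_for_service(logs, service, key, replacement_keys):
    """Keep only the logs that resolve to the given service."""
    return [
        log for log in logs
        if resolve_service(log, key, replacement_keys) == service
    ]

# Hypothetical logs that name the service under different keys.
logs = [
    {"service.name": "checkout", "msg": "order placed"},
    {"app": "checkout", "msg": "payment retried"},
    {"k8s.deployment": "cart", "msg": "item added"},
    {"msg": "no service information"},
]
related = logs_for_service(logs, "checkout", "service.name", ["app", "k8s.deployment"])
```

Here the second log is matched only because `app` is listed as a replacement key, which is the situation the mapping is meant to handle.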


SLO

A Service Level Objective (SLO) is a measurable target that defines the acceptable performance or reliability of a service, ensuring it meets user expectations. By tracking key metrics such as error rates and latency against predefined thresholds, SLOs enable teams to maintain high service quality while optimizing resource usage.

The SLOs view in Coralogix UI offers a comprehensive overview of each SLO's status, target, and remaining error budget, displayed through visual indicators and detailed metrics. This enables teams to proactively address potential issues, prioritize engineering efforts, and ensure alignment between reliability and business objectives.
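
To make the relationship between target, status, and remaining error budget concrete, here is a minimal sketch for a request-based SLO. It uses the common definition in which the error budget is the fraction of requests allowed to fail under the target. The function name and return shape are assumptions, and the status labels mirror the ones shown in the SLO widget (okay, breached, not available).

```python
def slo_summary(target_pct, total_requests, bad_requests):
    """Summarize a request-based SLO.

    remaining_budget is a fraction: 1.0 means untouched, 0.0 means fully
    spent, and negative values mean the budget is overspent.
    """
    if not 0 < target_pct < 100:
        raise ValueError("target_pct must be between 0 and 100 (exclusive)")
    if total_requests == 0:
        # No data in the window, so the SLO cannot be evaluated.
        return {"status": "not available", "remaining_budget": None}
    allowed_bad = total_requests * (100 - target_pct) / 100
    remaining = 1 - bad_requests / allowed_bad
    status = "breached" if remaining < 0 else "okay"
    return {"status": status, "remaining_budget": remaining}

# Hypothetical 99.9% target over one million requests: 1,000 failures allowed.
healthy = slo_summary(99.9, 1_000_000, 250)     # about 75% of the budget left
overspent = slo_summary(99.9, 1_000_000, 1500)  # budget exceeded
no_data = slo_summary(99.9, 0, 0)
```

Tracking the remaining fraction rather than the raw error rate shows how much room is left before the target is missed.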

Whether for incident management, capacity planning, or enhancing user experience, SLOs are a crucial tool for optimizing service health and reliability in Coralogix APM.

Map

The Map tab displays the service map centered on a selected service:

Services that send requests to the selected service are shown on the left.
Services that receive requests from the selected service are displayed on the right.

Latency between services is shown on the connecting lines, with line thickness indicating severity—the thicker the line, the higher the latency. If multiple services have an error rate above 0%, the one with the highest error rate is highlighted with a red outline. Hovering over a service displays a tooltip showing its throughput, error rate, and average duration relative to the central service. Selecting a service opens a context menu with options to view the Service Overview, its errors, traces, or related logs.
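
The layout rules above can be sketched as a small function that splits a service's neighbors into callers and callees, scales line width by latency, and picks the neighbor with the highest non-zero error rate. The edge fields and the linear width scale are assumptions made for illustration; the actual rendering logic is not documented here.

```python
def build_service_map(edges, center, min_width=1.0, max_width=5.0):
    """Arrange direct neighbors of `center` as the Map tab describes."""
    callers = [e for e in edges if e["callee"] == center]  # shown on the left
    callees = [e for e in edges if e["caller"] == center]  # shown on the right
    neighbors = callers + callees
    peak = max((e["latency_ms"] for e in neighbors), default=0)

    def width(edge):
        # Thicker line for higher latency, scaled linearly to the peak.
        if peak == 0:
            return min_width
        return min_width + (max_width - min_width) * edge["latency_ms"] / peak

    def other_side(edge):
        return edge["caller"] if edge["callee"] == center else edge["callee"]

    failing = [e for e in neighbors if e["error_rate"] > 0]
    worst = max(failing, key=lambda e: e["error_rate"], default=None)
    return {
        "left": [(e["caller"], width(e)) for e in callers],
        "right": [(e["callee"], width(e)) for e in callees],
        "highlighted": other_side(worst) if worst else None,
    }

# Hypothetical edges; the last one does not touch the central service.
service_map = build_service_map(
    [
        {"caller": "frontend", "callee": "checkout", "latency_ms": 20, "error_rate": 0.0},
        {"caller": "checkout", "callee": "payment", "latency_ms": 100, "error_rate": 0.02},
        {"caller": "checkout", "callee": "cart", "latency_ms": 50, "error_rate": 0.05},
        {"caller": "cart", "callee": "redis", "latency_ms": 5, "error_rate": 0.0},
    ],
    center="checkout",
)
```

In this example, `payment` gets the thickest line because it has the highest latency, while `cart` is highlighted because it has the highest error rate.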

Note

The Map view adjusts to screen size. On large screens, each service shows its throughput, error rate, and SLO status beside the service name (all relative to the central service except SLO status, which is per-service). On smaller screens, only the service name appears, and additional details are moved into a tooltip.

Viewing traces from the Map

Selecting a service in the Map and choosing View Traces retrieves spans based on the relationship between the selected service and the currently focused service.

Assume checkoutservice is currently in focus in APM. Selecting a dependency service and choosing View Traces from the action menu retrieves the relevant spans by querying archived data using Coralogix’s relational query capabilities.

In practice, span selection follows the rules presented in the example below.

This approach enables consistent and accurate trace exploration directly from the dependency map, including historical context when available.

Profiles

Profiles allow you to view and explore services even if they do not emit spans or metrics. Profiling data alone is enough to enable a service profile, ensuring that every service in your environment is visible and observable.

Service retention

Coralogix presents services in the Service Catalog based on their metrics. If we do not receive metrics for a service for 30 days, the service is removed from the Catalog. This practice keeps the Catalog current and uncluttered, simplifying navigation and minimizing unnecessary resource usage. To adjust the 30-day retention period, use our Service Retention gRPC API. If we receive metrics for a service after its removal, it reappears in the Service Catalog and its historical metrics become available again.
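
The retention rule can be expressed as a simple time comparison. The sketch below is only a model of the behavior described above, assuming a hypothetical mapping from service name to the timestamp of its most recent metric. It does not call the Service Retention gRPC API.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION = timedelta(days=30)  # default period stated above

def visible_services(last_metric_at, now, retention=DEFAULT_RETENTION):
    """Return the services whose latest metric falls inside the retention period."""
    return sorted(
        name for name, seen in last_metric_at.items()
        if now - seen <= retention
    )

now = datetime(2024, 3, 1, tzinfo=timezone.utc)
last_seen = {
    "checkout": datetime(2024, 2, 20, tzinfo=timezone.utc),  # 10 days ago
    "legacy-billing": datetime(2024, 1, 1, tzinfo=timezone.utc),  # 60 days ago
}
before = visible_services(last_seen, now)

# A removed service reappears as soon as new metrics arrive.
last_seen["legacy-billing"] = now
after = visible_services(last_seen, now)
```

Passing a different `retention` value models the effect of adjusting the retention period.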

Additional resources


Documentation: Application Performance Monitoring (APM), Apdex Score, Service Retention gRPC API

Support

Need help?

Our world-class customer success team is available 24/7 to walk you through your setup and answer any questions that may come up.

Feel free to reach out to us via our in-app chat or by email.
