Service Catalog

Last Updated: Nov. 30, 2023

The Service Catalog offers a centralized, data-rich resource for managing and optimizing the services within your system. It provides a holistic view of service health, enabling better decision-making and faster issue resolution, ultimately improving the performance and reliability of your entire system.

Service Catalog Overview

The Service Catalog provides a full list of services that you have in your system, displaying the health of each service. The catalog displays service type, number of requests received by the service, and error rate and latency for received requests.

  • Using the search bar, search by service or any other parameter.
  • Select the timeframe for which you want to view your services.
  • Use dimensions to filter your services. A dimension adds a new label to a metric, so you can filter the services shown according to tags you define.

Prerequisites

Access the Service Catalog

STEP 1. In your Coralogix toolbar, navigate to APM. Click on the Service Catalog tab.

STEP 2. Select the timeframe for which you want to view information.

STEP 3. Select a service to view the service drill-down.

Filter Services using Dimensions

Creating a dimension involves adding a new label to a metric, allowing you to filter the services shown according to tags you define.

STEP 1. Click { } Add Dimension on the upper right-hand corner of the Service Catalog tab.

STEP 2. Enter a filter name and select a span tag from the dropdown menu to pair the data source with the dimension.

STEP 3. To add additional filters, click { } ADD DIMENSION and repeat STEP 2.

STEP 4. Click ADD DIMENSIONS.

STEP 5. Once you have created one or more dimensions, the Dimensions toolbar will appear above the Service Catalog.

To filter the services using a specific dimension, choose from the dimensions bar at the top and select the results you wish to see.
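
As a rough illustration of what a dimension does conceptually, the sketch below pairs a filter name with a span tag and keeps only the services whose spans carry the selected value. The `Span` type, the dimension name `env`, the tag `deployment.environment`, and the function name are hypothetical and are not part of any Coralogix API.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical, minimal span: a service name plus its tags."""
    service: str
    tags: dict = field(default_factory=dict)


def services_matching(spans, span_tag, selected_value):
    """Return the sorted services with at least one span whose tag equals the selected value."""
    return sorted({s.service for s in spans if s.tags.get(span_tag) == selected_value})


spans = [
    Span("checkout", {"deployment.environment": "prod"}),
    Span("checkout", {"deployment.environment": "staging"}),
    Span("search", {"deployment.environment": "staging"}),
    Span("billing", {}),  # no tag, so it never matches a selected value
]

# A dimension named "env" (hypothetical) backed by the span tag "deployment.environment".
dimension = {"name": "env", "span_tag": "deployment.environment"}
prod_services = services_matching(spans, dimension["span_tag"], "prod")
print(prod_services)
```

Selecting a different value for the same dimension, such as `staging`, would change only the third argument.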

Notes:

  • When you go into a specific service with dimensions selected, the service drill-down will remain filtered by the selected dimension.

Limitations

  • Dimensions create metrics from spans and are therefore considered part of your quota. Use a maximum of 5 dimensions, each of which can filter up to 10k labels (cardinality).
  • Only team admins have permissions to create dimensions.
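
Before creating dimensions, it can help to estimate whether the span tags you plan to use stay within these limits. The following is a minimal sketch of such a check, assuming you can export the observed tag values yourself; the function name and input shape are hypothetical, and only the two limits (5 dimensions, 10k labels each) come from the text above.

```python
MAX_DIMENSIONS = 5                  # "Use a maximum of 5 dimensions"
MAX_LABELS_PER_DIMENSION = 10_000   # "up to 10k labels (cardinality)"


def check_dimension_limits(dimension_values):
    """Return a list of limit violations.

    dimension_values maps a dimension name to an iterable of the label
    values observed for it (duplicates are allowed and ignored).
    """
    problems = []
    if len(dimension_values) > MAX_DIMENSIONS:
        problems.append(
            f"{len(dimension_values)} dimensions defined, maximum is {MAX_DIMENSIONS}"
        )
    for name, values in dimension_values.items():
        cardinality = len(set(values))
        if cardinality > MAX_LABELS_PER_DIMENSION:
            problems.append(
                f"dimension '{name}' has {cardinality} distinct labels, "
                f"maximum is {MAX_LABELS_PER_DIMENSION}"
            )
    return problems


planned = {
    "env": ["prod", "staging", "prod"],   # 2 distinct labels
    "user_id": range(12_000),             # 12,000 distinct labels, too many
}
problems = check_dimension_limits(planned)
for problem in problems:
    print(problem)
```

High-cardinality tags such as user or request IDs are the usual cause of a violation, so low-cardinality tags such as environment or region tend to be safer choices.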

Service Drill-Down

The Service Drill-Down displays more detailed information about the specific service selected.

The drill-down identifies the service you are viewing and repeats the details shown for it on the main Service Catalog page. Each tab adds its own visualizations and details.

The service drill-down includes the following tabs:

  • Overview
  • Requests
  • SLI
  • Resources
  • Logs
  • Internal
  • Dependencies
  • Map

Overview Tab

The Overview tab gives a summary of the service.

The widgets shown in this tab give you a broad overview of the service, for the timeframe selected in the top bar.

The Overview widgets include:

  • SLO. An overview of the current SLOs: how many are okay, how many are breached, and how many are not available.
  • Average Latency. Shows the average latency for the current service.
  • Throughput. Shows the throughput for the current service.
  • Error Percentage. Displays the percentage of errors in relation to the total number of requests.
  • Requests and Errors. Shows a graph with the number of requests and errors for the service.
  • Error Percentage for Top 5 Operations. Displays the percentage of errors in relation to the total number of requests for the top 5 operations.
  • Apdex Score. Displays the Apdex (Application Performance Index) score over the selected timeframe. The Apdex score is a standardized metric used to measure and quantify user satisfaction with the response time of software applications. For more information about Apdex including how to define the threshold, view our Apdex Score tutorial.
  • Highest Consumption. Shows the five operations or dependencies with the highest consumption.
  • Latency. Shows a graph with the P99, P75, P50 and Average latency for the service.
  • Map. Shows a mini version of the service map.
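
To make the Apdex Score widget more concrete, here is a minimal sketch of the standard Apdex formula: requests at or below a threshold T count as satisfied, requests between T and 4T count as tolerating (half weight), and the rest count as frustrated. The sample latencies and the 500 ms threshold are made up for illustration, and this is the general Apdex definition rather than a description of how Coralogix computes the score internally.

```python
def apdex(latencies_ms, threshold_ms):
    """Standard Apdex: (satisfied + tolerating / 2) / total samples."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    satisfied = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    tolerating = sum(
        1 for latency in latencies_ms if threshold_ms < latency <= 4 * threshold_ms
    )
    return (satisfied + tolerating / 2) / len(latencies_ms)


# Hypothetical request latencies in milliseconds.
samples = [120, 300, 450, 800, 2500]
score = apdex(samples, threshold_ms=500)
print(score)  # 3 satisfied, 1 tolerating, 1 frustrated -> (3 + 0.5) / 5 = 0.7
```

A score of 1.0 means every request was at or below the threshold, and a score of 0.0 means every request took longer than four times the threshold.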

Requests Tab

The Requests tab presents the requests received by the service, in the form of server and consumer spans.

At the top of the page three charts are shown. By default, they show the top ten services for each of the following:

  • Time Consumption
  • Throughput
  • Error Rate

For each operation, view the operation type, method, time consumed, percentage of errors caused by the operation, and the operation's share of the total number of operations. These are all shown for the timeframe and dimensions selected.

View a deeper drill-down of each operation by clicking on an operation row or a series.

The deep drill-down shows the time when the operation occurred, the operation type, the service that performed the operation, the duration of the operation, and how many errors it generated. It also shows the Throughput, Error Rate and Latency graphs for that specific operation.

SLI Tab

The SLI tab provides a view of the service level indicators (SLI) for the service.

For any service, the SLI presents the error percentage and whether that percentage is within the acceptable range of errors for the service. If the percentage of errors is higher than allowed, an SLI breach occurs.
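
The breach condition described above can be sketched as a simple comparison between the observed error percentage and the allowed maximum. The function names and the sample numbers below are hypothetical; the sketch only illustrates the arithmetic, not how Coralogix evaluates SLIs internally.

```python
def error_percentage(errors, total_requests):
    """Percentage of requests that ended in an error (0.0 when there is no traffic)."""
    if total_requests == 0:
        return 0.0
    return 100.0 * errors / total_requests


def is_sli_breached(errors, total_requests, max_error_pct):
    """A breach occurs when the error percentage is higher than allowed."""
    return error_percentage(errors, total_requests) > max_error_pct


# Hypothetical service: at most 5% of requests may fail.
healthy = is_sli_breached(errors=30, total_requests=1000, max_error_pct=5.0)
breached = is_sli_breached(errors=80, total_requests=1000, max_error_pct=5.0)
print(healthy, breached)
```

In this example, 30 errors out of 1,000 requests is 3% and stays within the allowed range, while 80 errors is 8% and counts as a breach.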

Notes:

  • Only team admins can add new SLIs.
  • New SLIs take a minimum of seven days for their computation window to complete.
  • Before it is complete, the SLI will show incomplete data.

To begin tracking SLIs, add a new SLI to the system.

STEP 1. From the SLI tab of a Service Catalog drill-down, or from the main Service Catalog tab, click + ADD NEW SLI.

STEP 2. Select the service or services to apply the SLI to, from the Service dropdown.

STEP 3. Select the SLI type: Error or Latency.

STEP 4. Enter a name and optional description for the SLI.

STEP 5. Select the filters and threshold for your SLI.

STEP 6. Select the SLO percentage and the period for which the SLI is valid. For example, for an Error SLI, 90% for 7 days means that the SLI is valid as long as at least 90% of requests over each seven-day period are error-free, that is, the error rate is no higher than 10%.

STEP 7. Click ADD NEW.

Resources Tab

The Resources tab presents resources used by the service.

The resources in this tab present CPU utilization, memory used (bytes), and network usage (bytes) for the timeframe selected in the top bar.

Logs Tab

The Logs tab presents all related logs for the selected service.

On the right-hand side of the Logs tab, click OPEN LOG QUERY to open a new tab with the logs open in your Coralogix Explore Screen.

Set up Correlation Mapping to allow your system to identify the fields in a log that are related to the service. The feature does this by mapping a single key to one or more replacement keys in the service’s logs.

STEP 1. Click Setup Correlation on the right-hand side of the Logs tab.

STEP 2. Select the replacement logs key from the dropdown menu.

STEP 3. Click UPDATE CORRELATIONS.
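
One way to picture correlation mapping is as a lookup that treats several log keys as stand-ins for the service identity. The sketch below assumes, purely for illustration, that a log belongs to a service when any of the configured replacement keys holds the service name; the key names `app` and `k8s_deployment` and the function name are hypothetical, and the actual matching logic in Coralogix may differ.

```python
def logs_for_service(logs, service_name, replacement_keys):
    """Keep the logs in which any replacement key holds the service name."""
    return [
        log
        for log in logs
        if any(log.get(key) == service_name for key in replacement_keys)
    ]


# Hypothetical log records, each a flat dictionary of fields.
logs = [
    {"app": "checkout", "message": "order placed"},
    {"k8s_deployment": "checkout", "message": "pod restarted"},
    {"app": "search", "message": "index refreshed"},
    {"message": "no service fields at all"},
]

related = logs_for_service(logs, "checkout", replacement_keys=["app", "k8s_deployment"])
for log in related:
    print(log["message"])
```

Mapping more than one replacement key is useful when different log sources name the same service under different fields.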

Internal Tab

Similar to the Requests tab, the Internal tab shows operations performed. However, unlike the Requests tab, the Internal tab shows only those operations that were internal to the service.

At the top of the page three charts are shown. By default, they show the top ten services for each of the following:

  • Time Consumption
  • Throughput
  • Error Rate

For each operation, you can view the operation type, method, P95 latency, percentage of total requests, percentage of errors caused by the operation, and the time consumed by the operation. These are all shown for the timeframe selected in the top bar.
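
To clarify how these per-operation columns relate to the underlying spans, here is a minimal sketch that aggregates a list of spans into P95 latency, share of total requests, error percentage, and time consumed. The input tuples and operation names are hypothetical, the percentile uses the simple nearest-rank method, and "percentage of errors" is read here as the share of that operation's own requests that failed; Coralogix may define or compute these values differently.

```python
import math
from collections import defaultdict


def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize_operations(spans):
    """spans: iterable of (operation, duration_ms, is_error) tuples."""
    by_operation = defaultdict(list)
    for operation, duration_ms, is_error in spans:
        by_operation[operation].append((duration_ms, is_error))

    total = sum(len(rows) for rows in by_operation.values())
    summary = {}
    for operation, rows in by_operation.items():
        durations = sorted(duration for duration, _ in rows)
        errors = sum(1 for _, is_error in rows if is_error)
        summary[operation] = {
            "p95_ms": nearest_rank_percentile(durations, 95),
            "pct_of_requests": 100 * len(rows) / total,
            "error_pct": 100 * errors / len(rows),
            "time_consumed_ms": sum(durations),
        }
    return summary


# Hypothetical spans for one service.
spans = [
    ("GET /cart", 100, False),
    ("GET /cart", 200, True),
    ("GET /cart", 300, False),
    ("POST /order", 400, False),
]
summary = summarize_operations(spans)
for operation, stats in summary.items():
    print(operation, stats)
```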

You can see a deeper drill-down of each operation by clicking on an operation row.

Dependencies Tab

Similar to the Requests tab, the Dependencies tab shows operations performed. However, unlike the Requests tab, the Dependencies tab shows only those operations that the service requested from other services, in the form of producer and client spans.
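
The split between the Requests, Internal, and Dependencies tabs can be summarized as a mapping from span kind to tab. The sketch below assumes the span kinds follow OpenTelemetry naming (SERVER, CONSUMER, CLIENT, PRODUCER, INTERNAL); the mapping of INTERNAL spans to the Internal tab is inferred from that tab's description, and the function name is hypothetical.

```python
# Span kind -> drill-down tab, based on the tab descriptions in this document.
TAB_BY_SPAN_KIND = {
    "SERVER": "Requests",        # requests received by the service
    "CONSUMER": "Requests",
    "CLIENT": "Dependencies",    # operations requested from other services
    "PRODUCER": "Dependencies",
    "INTERNAL": "Internal",      # operations internal to the service (inferred)
}


def tab_for_span_kind(kind):
    """Return the tab in which a span of the given kind would be listed."""
    return TAB_BY_SPAN_KIND[kind.upper()]


for kind in ("server", "producer", "internal"):
    print(kind, "->", tab_for_span_kind(kind))
```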

At the top of the page three charts are shown. By default, they show the top ten services for each of the following:

  • Time Consumption
  • Throughput
  • Error Rate

For each operation, you can view the operation type, method, P95 latency, percentage of total requests, percentage of errors caused by the operation, and the time consumed by the operation. These are all shown for the timeframe selected in the top bar.

You can see a deeper drill-down of each operation by clicking on an operation row.

Map Tab

The Map tab displays the service map centered on the selected service.

Services that send requests to the service are shown to the left of the selected service, and services to which the service sends requests are shown to the right. The latency for each service is listed on the connecting line between the services.
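
The left and right placement described above corresponds to the callers and callees of the selected service in the call graph. As a minimal sketch, assuming the graph is available as a list of hypothetical `(caller, callee)` pairs, the two sides of the map could be derived like this:

```python
def map_neighbors(edges, service):
    """Return (callers, callees) for a service.

    callers send requests to the service (shown on the left of the map);
    callees receive requests from the service (shown on the right).
    """
    callers = sorted({src for src, dst in edges if dst == service})
    callees = sorted({dst for src, dst in edges if src == service})
    return callers, callees


# Hypothetical call graph as (caller, callee) pairs.
edges = [
    ("frontend", "checkout"),
    ("checkout", "payments"),
    ("checkout", "inventory"),
    ("search", "inventory"),
]

left, right = map_neighbors(edges, "checkout")
print("left of checkout:", left)
print("right of checkout:", right)
```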

Hovering over a service shows a tooltip with the throughput, error rate, and average duration for that service.

Additionally, right-clicking on a service brings up a context menu with the options to view the Service Overview for that service, view errors for that service, or view related logs for that service.

Additional Resources

Documentation

  • Application Performance Monitoring (APM)
  • Apdex Score

Support

Need help?

Our world-class customer success team is available 24/7 to walk you through your setup and answer any questions that may come up.

Feel free to reach out to us via our in-app chat or by sending us an email at [email protected].