What is Istio Service Mesh, and Do I Need It?

Joanna Wallace
October 18, 2022

Development teams build modern applications using microservice architectures. Individual services are built and maintained by separate teams, and then these services are combined using container-based orchestrators to comprise a complete product offering. Microservices are a standard development method because they allow teams to iterate releases, providing ongoing new customer-facing features and bug fixes without needing to redeploy an entire platform or app.

Kubernetes is an open-source tool for managing the services that make up a platform. Services are deployed in containers that Kubernetes (K8s) orchestrates. K8s can detect problems in a service and autoscale to meet the application’s demands. Kubernetes includes some built-in observability features, but many more are available when it is paired with Prometheus or Istio. This article discusses the purpose of the Istio service mesh to show why Kubernetes monitoring setups should include it.

What is a Service Mesh?

A single application could contain hundreds of services and require tens of thousands of instances. Instances tend to change dynamically. Kubernetes manages instances and services, but they are still isolated from each other unless a complex service-to-service communication system is implemented. This communication is critical to managing the performance of any application.

Services can be built to communicate directly, but that leaves developers to implement the communication logic by hand in every service. As applications grow, so does the complexity of these communications. A service mesh can replace hard-coded interactions, allowing microservices to communicate without embedding that logic in the microservices themselves.

What are the Features of a Service Mesh?

Security

Service meshes can encrypt communications automatically before sending data across services, minimizing the risk of the data being intercepted by nefarious actors. Security policies, like authorization and authentication, can also be distributed through the mesh. Managing such security policies centrally rather than implementing them in each service helps reduce complex connection requirements in microservice platforms.

Observability

Service meshes like Istio give insight into service health and behavior. Istio proxy sidecars gather and aggregate metrics, logs, and distributed trace data from service interactions and can export that data to third-party tools for analysis and visualization. Observability data helps teams understand why services go down and indicates where fixes can be applied to reduce the overall downtime of platforms.

Reliability

Istio improves the reliability of platforms through features that come from its deployment as sidecar proxies. Istio can load balance traffic across available resources and route it onto the fastest available path, which helps ensure that no single resource is overwhelmed with traffic and causes an outage. Istio also offers circuit-breaking capabilities, so if a service is down, Istio can detect that and pause communications until the service has recovered. The delayed requests can be handled or dropped when normal operations resume.
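
The circuit-breaking idea described above can be illustrated with a small sketch. This is not Istio's implementation (Istio applies circuit breaking in the proxy through configuration, not in application code); it is a toy model, and the class name, thresholds, and injected clock are all assumptions made for illustration. In this sketch, requests made while the circuit is open are rejected immediately rather than queued.

```python
import time


class CircuitBreaker:
    """Toy circuit breaker: opens after repeated failures, retries after a cooldown."""

    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial request
        self.clock = clock                # injectable so the sketch can be tested
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Circuit is open: fail fast without touching the service.
                raise RuntimeError("circuit open: request rejected")
            # Cooldown elapsed: allow one trial request (half-open state).
            # A single failure now reopens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # a success closes the circuit fully
        return result
```

Because the breaker sits between the caller and the service, a struggling service gets time to recover instead of receiving more traffic, which is the same reasoning behind doing this in a sidecar.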

How do Service Meshes Work?

On a platform built with microservices, many different applications run at once and need to communicate with each other. This communication needs to be secured in transit, and it is also helpful to record it for later analysis.

A service mesh provides observability, security, and reliability capabilities to applications at the platform layer. Typically, the service mesh is built as a scalable set of network proxies deployed alongside application code as sidecars. These proxies handle communication between microservices and can also handle traffic entering or leaving the mesh.

What is Istio Service Mesh, and how is it unique?

Istio is an open-source service mesh originally developed by teams at Google, IBM, and Lyft. It was initially aimed at applications with highly complex architectures that operate at high volumes. Now, Istio is used by large and small applications alike to provide observability and security features. Istio was developed with a focus on Kubernetes support while remaining platform agnostic. It is also independent of the programming language used to build services.

The Istio service mesh is attached to each software service through a sidecar proxy, a separate piece of software that runs alongside the service. When services communicate, they send messages through these proxies. The proxy authenticates and authorizes messages and encrypts and decrypts them for further security. Since Istio works at the network layer, it is well positioned to provide traffic management, application monitoring, and observability by adding trace data to messages.
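
To make the sidecar pattern more concrete, here is a minimal sketch in Python. It is a conceptual model only, not how Istio's proxy is built: the `Sidecar` class, its fields, and the allow-list check are hypothetical names chosen for illustration. The point it tries to show is that the service handler contains no security or telemetry code; the proxy in front of it does that work.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple


@dataclass
class Sidecar:
    """Hypothetical stand-in for a sidecar proxy placed in front of one service."""

    service_name: str
    handler: Callable[[str], str]  # the service's own business logic
    allowed_callers: Set[str]      # assumed centrally distributed policy
    request_count: int = 0         # simple telemetry kept by the proxy
    denied_count: int = 0

    def receive(self, caller: str, payload: str) -> Tuple[int, str]:
        """Check the caller, record telemetry, then forward to the service."""
        self.request_count += 1
        if caller not in self.allowed_callers:
            self.denied_count += 1
            return 403, "denied"
        return 200, self.handler(payload)


# The service itself knows nothing about authorization or metrics.
orders = Sidecar(
    service_name="orders",
    handler=lambda customer: f"order created for {customer}",
    allowed_callers={"checkout"},
)
print(orders.receive("checkout", "alice"))  # (200, 'order created for alice')
print(orders.receive("unknown", "bob"))     # (403, 'denied')
```

Changing who may call the service means changing the policy handed to the proxy, not redeploying the service, which is the design benefit the sidecar approach is after.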

Istio and Observability

The industry’s shift to favor microservices over monoliths has many benefits, but microservices are not as easily monitored as monoliths. Microservices are challenging to troubleshoot because the relevant information is spread across different services, linked only by the messages sent between them. Service meshes like Istio can bolster your platform’s observability by injecting pertinent data into these messages so DevOps teams can track data as it flows across services. Istio provides metrics, traces, and logs that can be used to assess the health of the services running inside the mesh.

Metrics

Istio generates metrics that give insight into the behavior of both the services and the mesh itself. Some metrics can be automatically exported to Prometheus for use in a complete observability platform. Metrics are separated into three categories:

Proxy-level metrics are generated inside each sidecar proxy, providing traffic details and metrics about the sidecar itself.
Service-level metrics detail service communications and cover latency, traffic, saturation, and errors.
Control plane metrics provide self-monitoring information for Istio.
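
As a rough illustration of what service-level metrics summarize, the sketch below derives traffic, error rate, and latency per destination service from a list of request records. The record layout and the `summarize` function are assumptions for this example; Istio's sidecars expose this kind of information as labeled metrics (for example `istio_requests_total`) rather than as Python tuples, and saturation is left out here because it cannot be derived from request records alone.

```python
from collections import defaultdict
from statistics import median


def summarize(requests):
    """Summarize (destination, status_code, duration_ms) records per destination."""
    by_service = defaultdict(list)
    for destination, status, duration_ms in requests:
        by_service[destination].append((status, duration_ms))

    summary = {}
    for destination, rows in by_service.items():
        # Count server-side errors (5xx) as failures for the error rate.
        errors = sum(1 for status, _ in rows if status >= 500)
        summary[destination] = {
            "traffic": len(rows),
            "error_rate": errors / len(rows),
            "median_latency_ms": median(duration for _, duration in rows),
        }
    return summary


# Hypothetical sample data, not real Istio output.
sample = [
    ("orders", 200, 12.0),
    ("orders", 503, 40.0),
    ("orders", 200, 20.0),
    ("cart", 200, 5.0),
]
for service, stats in summarize(sample).items():
    print(service, stats)
```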

Traces

Traces help monitor data flow by appending information to messages that can later be visualized. Traces are used to track data as it flows through a platform to show bottlenecks, blockages, and outages in a platform.

Istio can help monitor traffic through a platform by appending distributed trace data to messages flowing through the mesh. Trace spans are automatically generated in the sidecar; the application only needs to pass some trace context along with its messages. Istio supports tracing backends like Jaeger.
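
The contextual data mentioned above is typically a small set of HTTP headers that a service copies from each incoming request onto the outgoing requests it makes, so the spans generated by different sidecars can be joined into one trace. The sketch below shows that copying step. The function name is hypothetical, and the header list follows the common B3 and W3C Trace Context conventions; the exact set a deployment needs depends on its tracing backend, so treat the list as an assumption to verify against the Istio documentation.

```python
# Header names commonly used for trace context (B3 and W3C Trace Context).
# Assumption: the exact set required depends on the tracing backend in use.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "traceparent",
    "tracestate",
)


def forward_trace_headers(incoming_headers):
    """Return only the trace-context headers, ready to attach to an outgoing call."""
    # HTTP header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


incoming = {
    "X-Request-Id": "abc",
    "X-B3-TraceId": "463ac35c9f6413ad",
    "Content-Type": "application/json",
}
print(forward_trace_headers(incoming))
# {'x-request-id': 'abc', 'x-b3-traceid': '463ac35c9f6413ad'}
```

A service would merge the returned dictionary into the headers of any downstream request it sends while handling the incoming one.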

Access logs

Access logs show the behavior of each workload instance. They record the service traffic passing through a workload instance, which makes troubleshooting easier when a specific instance needs to be inspected. The log output is configurable, so users can control the log format and how much or how little is logged.

…The Catch

All the features Istio provides are critical for a stable and scalable platform. Istio was hailed early for promising all these features, and many developers jumped on board to implement it. However, Istio had to become quite complex to provide them. It is not trivial to set up Istio, and it is tough to ensure it will run properly in production. Istio is open-source, but it still carries a cost: the time of the architects and engineers needed to set it up and keep it running.

Summary

Service meshes like Istio enhance distributed platforms’ security, reliability, and observability. Istio service mesh is a communications hub with routing, load balancing, and circuit-breaking capabilities. Istio can add distributed trace data to messages and collect metrics and logs from traffic as it passes through the mesh. Security policies can be centralized in the service mesh to simplify implementation.

Istio offers a rich feature set that enhances control over platforms, so they are more secure, can notify DevOps teams about outages, and can even recover from them. Istio is still a relatively young open-source project but has become well-supported in the open-source community. Pairing Istio’s captured data with external tools like Coralogix’s full-stack observability platform can ensure you have the insights needed to focus on features instead of maintenance on your project.