From Elasticsearch to Coralogix: BharatPe’s “Buy vs Build” Case

16 million transactions/day
500 microservices on AWS
150 active engineering users
40TB+ of daily data ingestion with Coralogix

About BharatPe 

BharatPe is a leading Indian fintech company focused on empowering small and medium businesses across the country. With a registered network of over 13 million merchants across 450+ cities, the company is one of the leading players in UPI offline transactions, processing 370 million+ UPI transactions per month. Its annualized Transaction Processed Value exceeds Rs. 1.7 Lac Crores. BharatPe offers a range of payment products, including a unified QR code compatible with various digital wallets and banking apps, a range of POS devices, and sound boxes. BharatPe also facilitates credit access for offline merchants, supporting millions of merchants and enhancing India’s digital financial landscape.

The Challenge

Before adopting Coralogix, BharatPe relied on an in-house Elasticsearch setup to manage its observability needs. The company’s rapid growth, handling 16 million transactions a day and serving over 13 million merchants, exposed significant limitations in that setup. As log volumes increased, the Elasticsearch clusters struggled to keep up, leading to several critical issues:

Dedicated Team Requirement: Managing the log clusters demanded a dedicated team, which was not feasible for a fast-growing startup with a lean DevOps approach.

Inefficient Incident Investigation: Developers struggled to find relevant insights during infrastructure incidents. The process of checking logs and services was time-consuming, often resulting in timed-out queries and partial data availability.

Root Cause Analysis: The time taken to investigate incidents and pinpoint root causes was excessive, making it difficult to maintain smooth operations.

BharatPe needed a more scalable, efficient solution that could handle the company’s growing log volumes without requiring substantial engineering overhead. The case for migrating from an in-house solution to a robust, off-the-shelf platform like Coralogix became clear, especially given the extra cost and effort involved in expanding the company’s existing Elasticsearch clusters on the AWS cloud. BharatPe’s journey towards better observability began when it was introduced to Coralogix by tech consultancy Onnivation.

The Solution

Migrating to Coralogix was a straightforward process for BharatPe. Utilizing Fluent Bit and Helm Charts, they simply updated the endpoints to Coralogix, completing the migration with minimal deployment work. This seamless transition enabled BharatPe to leverage Coralogix’s advanced features without delay.

Coralogix offers a comprehensive observability platform for BharatPe that includes:

Log Management: Efficiently handles large volumes of logs with advanced querying capabilities.

Metrics and Traces: Provides detailed insights into system performance and application behavior.

TCO Optimizer: Monitors more data at lower costs by prioritizing and indexing only high-priority logs, while querying medium and low-priority logs from an S3 archive as needed.

Kubernetes Dashboard: Facilitates easy monitoring of Kubernetes environments. This is very useful, given that all of BharatPe’s deployments are on Kubernetes.

Consolidated Incident View: Offers a unified view of incidents, simplifying root cause analysis and issue resolution.

Currently, BharatPe ingests 40TB+ of logs, metrics, and traces into Coralogix every day. The platform’s ability to handle such high volumes of data without compromising on performance has been a game-changer for BharatPe’s growth story.

“Be it migrating from Elasticsearch or monitoring large volumes of data every day, Coralogix has made observability an effortless process for the team.”

Ravi Ranjan Kumar, Senior Engineering Manager, BharatPe

Results and Benefits

TCO Optimization

One of BharatPe’s favorite Coralogix features is the TCO Optimizer, which reduces observability costs while maintaining full visibility into the company’s telemetry data. Coralogix analyzes data in-stream, so real-time alerts and rapid querying do not depend on indexing every log in hot storage. BharatPe uses this architecture and the flexible TCO setup to manage data ingestion more efficiently. By assigning priority levels, the team indexes only high-priority logs in hot storage for frequent searches. Medium-priority logs (for alerts and dashboards) and low-priority logs (compliance data) are queried from an S3 archive as needed, without indexing. This approach allows BharatPe to monitor more data at a lower cost.
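
The priority-based routing described above can be pictured with a small sketch. This is an illustration only, not Coralogix’s implementation or BharatPe’s actual policy: the `LogRecord` fields, the `audit-trail` service name, the severity rules, and the `hot-index` and `s3-archive` labels are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    HIGH = "high"      # indexed in hot storage for frequent searches
    MEDIUM = "medium"  # alerts and dashboards, queried from the archive
    LOW = "low"        # compliance data, queried from the archive


@dataclass(frozen=True)
class LogRecord:
    service: str
    severity: str
    message: str


# Hypothetical policy: services whose logs are kept for compliance only.
COMPLIANCE_SERVICES = {"audit-trail"}


def assign_priority(record: LogRecord) -> Priority:
    """Assign a priority level using illustrative, made-up rules."""
    if record.service in COMPLIANCE_SERVICES:
        return Priority.LOW
    if record.severity in {"error", "critical"}:
        return Priority.HIGH
    return Priority.MEDIUM


def storage_target(priority: Priority) -> str:
    """Only high-priority records are indexed; the rest go to the archive."""
    return "hot-index" if priority is Priority.HIGH else "s3-archive"


def route(records):
    """Group records by the storage target their priority maps to."""
    routed = {"hot-index": [], "s3-archive": []}
    for record in records:
        routed[storage_target(assign_priority(record))].append(record)
    return routed
```

In this sketch, an error from a payments service would land in the indexed tier, while an informational log or anything from the compliance service would be sent to the archive and queried only when needed.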

Consolidated Incident View

Coralogix’s consolidated view of incidents has significantly improved BharatPe’s incident management process. This feature allows the company to map and identify affected applications or services quickly, facilitating faster resolution of issues. The ability to see a snapshot view of incidents enables BharatPe to address problems proactively, reducing downtime and improving overall system reliability.

Kubernetes Dashboard

Coralogix’s Kubernetes Dashboard has been instrumental in enhancing BharatPe’s observability capabilities. With over 500 microservices running on AWS, the ability to monitor the Kubernetes environments efficiently is crucial. This feature provides detailed insights into the performance of the company’s Kubernetes clusters, helping the company maintain optimal operations and quickly address any issues that arise.

Logging and APM Integration

BharatPe is currently migrating its Application Performance Monitoring (APM) to Coralogix, aiming to have a single platform for both logs and metrics. This integration allows the company to act on incidents faster and debug issues more efficiently. Having both logging and APM on the same platform enables seamless transitions between logs and metrics, facilitating quicker correlation and resolution of issues.

“Coralogix’s unique features like TCO, effortless migration, and integration of logs-metrics capabilities have been game-changers for BharatPe. Coralogix has enabled our lean team to scale efficiently, cut down costs, and focus on growth without the constraints of managing an in-house observability setup.”

Pankaj Goel, CTO, BharatPe

Summary

BharatPe’s decision to migrate from an in-house Elasticsearch setup to Coralogix exemplifies the benefits of choosing a scalable, comprehensive observability solution over building and maintaining an in-house system. With Coralogix, BharatPe has streamlined its observability processes, reduced the time and resources required for incident management, and optimized costs. This transition has empowered BharatPe to focus on scaling its fintech business and serving its merchants and customers more efficiently, illustrating the tangible benefits of adopting a robust observability platform like Coralogix.