How Kotak811 Revolutionized Digital Banking Observability with Coralogix
Kotak811, the digital-first engine of Kotak Mahindra Bank, is a banking platform serving over 23 million users across India. Since its launch in 2017, Kotak811 has transformed into the bank’s primary growth driver, now accounting for 70% of all new customer acquisitions. The platform is widely recognized for offering a paperless, mobile-first experience, providing everything from instant zero-balance accounts to seamless UPI payments and investment tools.
As a high-growth platform, Kotak811 regularly manages millions of concurrent transactions, leading to a massive surge in telemetry data. Relying on traditional monitoring tools during such rapid expansion created a “data drowning” effect, where spotting critical issues amidst the noise became nearly impossible. Identifying the root cause of a failed onboarding or a slow payment was often a slow, manual process that risked impacting the user experience.
In this blog, we explain how a partnership with Coralogix revolutionized Kotak811's observability strategy. By moving to a modern, full-stack approach, the team eliminated technical bottlenecks and gained real-time insights.
Technical Implementation
The data ingestion and archival architecture of the Kotak811 observability platform works as follows:

- Kubernetes and Compute Ingestion: Amazon EKS and EC2 instances use the OpenTelemetry (OTel) framework, deployed via Helm charts, to standardize and stream telemetry data.
- Serverless Log Shipping: AWS native logs from CloudWatch and CloudTrail are delivered to an Amazon S3 bucket within the VPC. The coralogix-aws-shipper Lambda function is then triggered to ingest these logs securely.
- Secure Private Connectivity: All telemetry signals are transmitted through an AWS PrivateLink endpoint, ensuring that sensitive financial data remains within the AWS network and never traverses the public internet.
- Advanced Processing and Archival: Once in Coralogix, data is processed by the Streama© engine for real-time alerting and enriched with metadata.
- Bidirectional Data Flow: Metadata-enriched data is sent back to a dedicated Amazon S3 bucket for cost-effective, long-term archival while remaining fully searchable via DataPrime.
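To make the serverless log-shipping step more concrete, here is a minimal Python sketch of what an S3-triggered function typically has to do before forwarding anything: read the bucket and key out of the S3 event notification and split the object body into log lines. This is not the coralogix-aws-shipper source code. The function names `objects_from_s3_event` and `decode_log_lines` are hypothetical, and the sketch deliberately leaves out the download from S3 and the delivery to Coralogix.

```python
import gzip
from urllib.parse import unquote_plus


def objects_from_s3_event(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for every object named in an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # S3 URL-encodes object keys in notifications (spaces arrive as '+').
            pairs.append((bucket, unquote_plus(key)))
    return pairs


def decode_log_lines(body: bytes) -> list[str]:
    """Split an object body into non-empty log lines, handling gzip if present."""
    if body[:2] == b"\x1f\x8b":  # gzip magic number
        body = gzip.decompress(body)
    return [line for line in body.decode("utf-8").splitlines() if line.strip()]
```

In a real deployment the body would be fetched from S3 and the resulting lines sent over the PrivateLink endpoint. Both steps are omitted here because the source does not describe them in detail.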
The 5-Phase Implementation, Evolution and Adoption
Transforming Kotak811’s operational hurdles into a cohesive observability model demanded a clear strategy. To move from early exploration to full-scale integration and a modernized technical perspective, the Coralogix TAM and the Kotak811 DevOps team worked through five distinct phases of implementation:
Phase 1: Discovery & Architectural Mapping
The journey began with a comprehensive audit of the Kotak811 landscape. The Coralogix TAM collaborated with internal stakeholders to map complex service dependencies, identifying critical “blind spots” in the legacy stack and defining a roadmap for full-stack visibility across the production environment.
Phase 2: Assisted Onboarding & “Observability-as-Code”
Transitioning to the platform was treated as a co-engineering exercise rather than a simple migration. By integrating Coralogix into existing CI/CD pipelines via Terraform and Helm, the team automated agent deployment, ensuring that every new microservice was “born observable” without manual intervention.
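One way to enforce “born observable” in a pipeline is a small gate that rejects workloads lacking instrumentation. The Python sketch below is illustrative only. It assumes manifests have already been parsed into dictionaries and that instrumentation is signalled by a pod annotation with the OpenTelemetry Operator's `instrumentation.opentelemetry.io/inject-` prefix. The source does not say which mechanism Kotak811 actually uses, and the function names are hypothetical.

```python
# Assumed convention: the OpenTelemetry Operator's auto-instrumentation
# annotation prefix. Swap in whatever marker your own platform uses.
REQUIRED_PREFIX = "instrumentation.opentelemetry.io/inject-"


def is_born_observable(manifest: dict, prefix: str = REQUIRED_PREFIX) -> bool:
    """True if the pod template carries an enabled instrumentation annotation."""
    annotations = (
        manifest.get("spec", {})
        .get("template", {})
        .get("metadata", {})
        .get("annotations")
    ) or {}
    return any(
        key.startswith(prefix) and str(value).lower() != "false"
        for key, value in annotations.items()
    )


def uninstrumented(manifests: list[dict]) -> list[str]:
    """Names of Deployments that would ship without instrumentation."""
    return [
        m.get("metadata", {}).get("name", "<unnamed>")
        for m in manifests
        if m.get("kind") == "Deployment" and not is_born_observable(m)
    ]
```

A CI job could fail the build whenever `uninstrumented(...)` returns a non-empty list, which keeps the check out of human hands.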
Phase 3: Telemetry Unification & Contextual Correlation
With data flowing, the focus shifted to breaking down silos. Using OpenTelemetry (OTel) standards, the Kotak811 team linked distributed traces directly to high-cardinality logs and infrastructure metrics. This ensured that an anomaly in a backend pod could be traced back to the specific user transaction in seconds.
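The idea behind this correlation can be sketched in a few lines of Python: once spans and logs share a trace ID, finding the transactions that touched a misbehaving pod is a simple join. The field names (`trace_id`, `pod`, `msg`) and function names are assumptions made for illustration. They do not reflect the Coralogix data model or Kotak811's schema.

```python
from collections import defaultdict


def logs_by_trace(logs: list[dict]) -> dict[str, list[dict]]:
    """Index log records by trace ID, skipping logs without trace context."""
    index = defaultdict(list)
    for entry in logs:
        trace_id = entry.get("trace_id")
        if trace_id:
            index[trace_id].append(entry)
    return index


def correlate(spans: list[dict], logs: list[dict], pod: str) -> dict[str, list[dict]]:
    """Map each trace that has a span on `pod` to the logs emitted in that trace."""
    index = logs_by_trace(logs)
    trace_ids = sorted({s["trace_id"] for s in spans if s.get("pod") == pod})
    return {tid: index.get(tid, []) for tid in trace_ids}
```

Starting from an anomalous pod, `correlate` returns the affected traces and their logs in one lookup, which is the step that previously required switching between tools.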
Phase 4: Operational Hardening & TCO Optimization
To manage the scale of 23 million users, the team implemented Coralogix Streama© technology. By applying priority-based routing, Kotak811 optimized its Total Cost of Ownership (TCO), retaining 100% of telemetry for real-time alerting while offloading “noisy” debug data to low-cost but still searchable archives.
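Priority-based routing can be pictured as a rule that assigns every record to a tier instead of dropping anything. The Python sketch below shows one such rule keyed on log severity. The tier names and the severity mapping are hypothetical and are not Kotak811's actual policy or Streama©'s internal logic.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "high"      # critical errors that need the fastest access
    MEDIUM = "medium"  # routine operational data
    LOW = "low"        # noisy data sent to the low-cost archive


def route(record: dict) -> Tier:
    """Assign a priority tier from the record's severity (assumed field name)."""
    severity = str(record.get("severity", "")).lower()
    if severity in {"error", "critical"}:
        return Tier.HIGH
    if severity in {"debug", "trace"}:
        return Tier.LOW
    return Tier.MEDIUM


def tier_counts(records: list[dict]) -> dict[Tier, int]:
    """Count records per tier; every record lands in exactly one tier."""
    counts = {tier: 0 for tier in Tier}
    for record in records:
        counts[route(record)] += 1
    return counts
```

Because `route` always returns a tier, the total across tiers equals the number of records ingested, which mirrors the point that nothing is discarded, only stored at a different cost.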
Phase 5: Full-Stack Adoption & Cultural Shift
The final phase empowered application teams through deep-dive enablement sessions. By adopting features such as APM, Cases, Infrastructure Explorer, and Custom Dashboards, developers gained autonomous troubleshooting capabilities, shifting the culture toward distributed ownership and “zero-friction” debugging.
Challenges to Benefits
The transition to Coralogix was more than just a tool swap; it was a fundamental shift in how Kotak811 transformed operational complexity into actionable insights. To maintain a “digital-first” reputation, the engineering team needed to move away from reactive troubleshooting and toward a proactive, unified observability culture. By breaking down data silos and automating visibility, they converted their growing pains into a scalable framework for innovation.
The following points detail the specific technical challenges faced and the strategic benefits gained through this implementation:
1. From “Data Overload” to “Streama-Powered Real-Time Insights”
Challenge: Rapid growth in digital transactions led to a massive surge in telemetry volumes, making it nearly impossible for engineers to find “the needle in the haystack” during outages.
Benefit in Coralogix: By implementing Streama©, Kotak811 turned raw data into a strategic asset. They now monitor 100% of their telemetry while only paying for high-priority insights, ensuring that critical banking errors are surfaced instantly without the noise.
2. From “Tool Silos” to “Unified Correlation”
Challenge: Engineering teams were forced to context-switch between disparate tools for logs, metrics, and traces, leading to a fragmented view of the customer journey and high MTTR.
Benefit in Coralogix: Coralogix provided a Single Pane of Glass. By correlating APM traces directly with infrastructure metrics and logs, Kotak811 turned “blind spots” into a transparent delivery chain, allowing teams to resolve complex microservices issues in minutes rather than hours.
3. From “Reactive Firefighting” to “Predictive Guardrails”
Challenge: The DevOps team was often the last to know when a service was degrading, relying on customer complaints or manual threshold alerts that triggered too late.
Benefit in Coralogix: Utilizing Infrastructure Explorer and Stateful Alerting, the team moved to a proactive stance. They now identify “hot” nodes and anomalous patterns before they impact the end-user, effectively turning potential downtime into non-events.
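The source does not describe how Stateful Alerting works internally, so the following Python sketch shows only one generic interpretation: an alert that remembers its state, fires after several consecutive threshold breaches, and emits a resolution when the metric recovers. The class name, parameters, and return values are hypothetical.

```python
from typing import Optional


class StatefulAlert:
    """Fire after N consecutive breaches; resolve on the first healthy sample."""

    def __init__(self, threshold: float, breaches_to_fire: int = 3):
        self.threshold = threshold
        self.breaches_to_fire = breaches_to_fire
        self.consecutive = 0
        self.firing = False

    def observe(self, value: float) -> Optional[str]:
        """Return 'fired' or 'resolved' on a state change, otherwise None."""
        if value > self.threshold:
            self.consecutive += 1
            if not self.firing and self.consecutive >= self.breaches_to_fire:
                self.firing = True
                return "fired"
        else:
            self.consecutive = 0
            if self.firing:
                self.firing = False
                return "resolved"
        return None
```

Compared with a stateless threshold check, this avoids paging on a single spike and avoids repeated notifications while a node stays “hot”.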
4. From “Manual Onboarding” to “Observability-as-Code”
Challenge: The sheer speed of feature releases at Kotak811 made it difficult for the central DevOps team to ensure every new microservice was properly instrumented and monitored.
Benefit in Coralogix: The Coralogix TAM, acting as an Extended Engineering Pod, helped automate the entire onboarding journey. This turned a manual bottleneck into an automated CI/CD pipeline integration, where every new piece of code is “born observable” by default.
5. From “Black Box Backend” to “Developer Empowerment”
Challenge: Application developers lacked deep visibility into how their code performed in production, leading to a heavy reliance on the SRE team for basic troubleshooting.
Benefit in Coralogix: With Distributed Tracing, APM and Custom Dashboards, developers gained “Full-Stack Ownership”. Application teams now use APM features to independently optimize code performance and database queries, accelerating the overall release velocity of the Kotak811 app.
The Power of Unified Observability: Results & Impact
The strategic synergy between Kotak811 and Coralogix has fundamentally reshaped the bank’s operational DNA, moving it from a fragmented, reactive state to a unified, proactive powerhouse. By consolidating logs, metrics, and traces from over 200 microservices into a “Single Pane of Glass,” the engineering team achieved a 90% to 95% reduction in Mean Time to Detection (MTTD). This streamlined observability model also delivered a 30% reduction in total observability cost via intelligent data tiering and Streama© technology, which Kotak811 plans to optimize further with the help of the Coralogix TAM.
Beyond incident response for its ~23 million users, the partnership significantly accelerated the development lifecycle. What was once a manual onboarding bottleneck was transformed into an automated CI/CD pipeline integration, ensuring new applications are integrated seamlessly and are “born observable”. This provides real-time visibility into critical transaction flows—from account creation to UPI payments—without the overhead of managing complex infrastructure. Ultimately, this evolution liberates engineers from managing massive data volumes, allowing them to focus on perfecting the customer experience and driving the 70% of new account acquisitions that define Kotak811’s market leadership.
About the Authors
| Ravi P. Srivastav SVP, DevOps Engineering · Kotak 811 | Ravi P. Srivastav is a visionary leader of Platform Engineering with 20+ years of expertise in Internal Developer Platforms (IDPs), Microservices, DevSecOps and Full Stack Observability, ensuring 99.99% availability for high-scale, revenue-critical systems. Leading the frontier of GenAI and Agentic AI, Ravi translates platform innovation into strategic business growth, driving DORA excellence and securing high-performance microservices at scale. |
| Chandan Maheshwari VP, DevOps and SRE, Kotak 811 | Chandan Maheshwari is a DevOps and SRE leader at Kotak811, specializing in hyperscale containerized and on-premises infrastructure. He excels at driving reliability through automation, observability-centric operations, and incident mitigation. By leveraging robust monitoring, resilient architecture and SLO-based frameworks, Chandan ensures seamless infrastructure performance and high-availability systems. |
| Shubham Sharan Technical Account Manager, Coralogix | Shubham Sharan is a strategic advisor specializing in multi-cloud orchestration and hyperscale observability for BFSI. Leading Agentic AI for autonomous operational resilience, he designs and engineers mission-critical frameworks that guarantee 99.99% availability. Shubham excels at synthesizing high-level technical blueprints with strategic business objectives, driving TCO optimization and reliability for revenue-critical cloud/on-prem ecosystems. |