Yotpo is an e-commerce marketing platform that helps thousands of brands accelerate direct-to-consumer growth. The company’s single-platform approach integrates data-driven solutions for reviews, loyalty, SMS marketing, and more, empowering brands to create smarter, higher-converting experiences that spark and sustain customer relationships.
Using Coralogix, the R&D organization at Yotpo monitors more than 3TB of data produced daily by more than 100 services and maintains its 99.99% uptime SLA. The DevOps group used to manage a full in-house ELK stack to centralize their logs, but struggled to keep their monitoring infrastructure as reliable as it needed to be and found themselves making tradeoffs between cost and compliance.
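To put the uptime target in perspective, here is a minimal sketch of the downtime budget it implies, assuming the 99.99 figure is a percentage measured over calendar time. The function name and the 365-day and 30-day periods are illustrative choices, not part of Yotpo's SLA definition.

```python
def downtime_budget_minutes(sla_percent: float, period_days: float) -> float:
    """Minutes of downtime allowed over `period_days` at the given uptime SLA."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# Assumed measurement windows: one 365-day year and one 30-day month.
yearly = downtime_budget_minutes(99.99, 365)
monthly = downtime_budget_minutes(99.99, 30)

print(f"per year:    {yearly:.1f} min")
print(f"per 30 days: {monthly:.2f} min")
```

Under these assumptions the budget is under an hour per year, which is why an unreliable monitoring stack is hard to tolerate.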
They chose to adopt Coralogix to alleviate the operational overhead of managing ELK in-house, provide a stable monitoring infrastructure, and allow them to focus on optimizations within the systems that are core to their business.
The DevOps group at Yotpo consists of three teams responsible for the operations of their cloud production environments and monitoring infrastructure.
Before being introduced to Coralogix, the team was managing a full ELK stack in-house to centralize application and traffic logs from across the organization. The logs were then leveraged by multiple engineering teams to monitor application health and performance, release quality, potential security events, and more.
Due to the volume of data coming in, retention was limited to 14 days. Of course, there are compliance and legal requirements for data to be saved for much longer, and the team struggled to provide a comprehensive solution.
As the volume of ingested log data grew, there came a tipping point at which managing the solution in-house became overwhelming. Not only was the solution failing to meet their compliance requirements, but the team also suffered frequent incidents caused by log spikes and mapping issues.
They began to look for a new platform that would alleviate the operational overhead of managing ELK in-house, provide a stable monitoring infrastructure, and allow them to focus on optimizations within the systems that are core to their business.
The implementation process started with a single development team and then expanded across the R&D organization; more than 200 engineers across 10 teams now use the platform. Each team went through individual onboarding sessions with Coralogix’s customer success team, which showed the possible use cases in the platform and how to best leverage their data.
As part of the onboarding process, our team also worked with Yotpo to optimize their Logstash configuration and proactively block large volumes of low-value data before it is sent to Coralogix. The initial implementation, including data optimization, took about a month, with the majority of the work occurring in the first two weeks.
Unlike with ELK, Yotpo was able to split the logs across different Coralogix teams, so every group in the organization has its own dedicated account with its own settings, parsing rules, and quota. This allows each group to control its logs and visualizations according to its specific needs. In addition, groups can easily look at other teams’ logs with the cross-team query functionality.
Yotpo implemented the Coralogix Terraform Provider to manage their parsing rules and configuration at a high level and reduce the need to configure each team individually.
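The idea of blocking low-value data before shipping can be sketched as a simple pre-send filter. This is an illustration in Python, not Yotpo's actual Logstash configuration; the field names (`severity`, `message`) and the drop rules are hypothetical.

```python
from typing import Iterable, Iterator

# Hypothetical drop rules; real rules depend on what each team considers noise.
DROP_SEVERITIES = {"debug", "trace"}
DROP_MESSAGE_PREFIXES = ("health check", "GET /healthz")


def should_ship(event: dict) -> bool:
    """Return True if the event is worth sending to the logging platform."""
    severity = str(event.get("severity", "")).lower()
    message = str(event.get("message", ""))
    if severity in DROP_SEVERITIES:
        return False
    # str.startswith accepts a tuple of prefixes.
    return not message.startswith(DROP_MESSAGE_PREFIXES)


def filter_events(events: Iterable[dict]) -> Iterator[dict]:
    """Lazily yield only the events that pass the drop rules."""
    return (event for event in events if should_ship(event))


sample = [
    {"severity": "DEBUG", "message": "cache miss for key 17"},
    {"severity": "info", "message": "GET /healthz 200"},
    {"severity": "error", "message": "payment provider timeout"},
]
shipped = list(filter_events(sample))
print(len(shipped), "of", len(sample), "events shipped")
```

Filtering at the shipper keeps dropped events from counting against ingestion volume at all, which is the point of doing it before the data leaves the cluster.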
“We love how easy it is to optimize our data in the Coralogix platform to improve visibility and productivity, plus the support is amazing.”
Yotpo’s Head of DevOps, Nethanel Moshkovitz, estimates that Coralogix saves his own team about two hours each week, which he considers a significant amount of time. In addition to removing the maintenance overhead and incident remediation for their monitoring solution, adopting Coralogix has improved developer productivity across Yotpo’s R&D organization.
Yotpo’s platform is built on Kubernetes running on AWS, and they run high-frequency CI/CD pipelines with a few dozen deployments each day. With Jenkins integrated with Coralogix, version tags and benchmarks are automatically created for every code release. All of Yotpo’s log data is ingested into Coralogix and immediately available to the engineers. With dedicated parsing rules for each group, the data is highly optimized and queryable across the organization.
Developers use the Loggregation feature to look at a handful of templates rather than millions of log lines. They can understand much more quickly where to focus their investigation with ratios between the templates, plus occurrence graphs and variable distributions for each template.
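The general idea of collapsing log lines into templates can be sketched as follows. This is a deliberately naive stand-in, not Coralogix's Loggregation algorithm: it assumes that every numeric token (including dotted values such as IP addresses) is a variable, and the sample log lines are invented.

```python
import re
from collections import Counter

# Naive assumption: numeric tokens, optionally dotted, are the variable parts.
_VARIABLE = re.compile(r"\b\d+(?:\.\d+)*\b")


def to_template(line: str) -> str:
    """Replace variable-looking tokens with a placeholder."""
    return _VARIABLE.sub("<*>", line)


def template_counts(lines):
    """Count how many log lines collapse into each template."""
    return Counter(to_template(line) for line in lines)


logs = [
    "user 1042 logged in from 10.0.0.7",
    "user 77 logged in from 10.0.3.19",
    "payment 5531 failed after 3 retries",
    "user 9 logged in from 10.1.1.1",
]

counts = template_counts(logs)
total = sum(counts.values())
for template, n in counts.most_common():
    # The ratio shows which template dominates the stream.
    print(f"{n / total:.0%}  {template}")
```

Four lines collapse into two templates here; at production scale the same reduction is what lets a developer scan a handful of patterns instead of millions of lines.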
Upon ingestion, the TCO Optimizer feature helps to further improve the team’s ROI on their logging data and monitoring infrastructure. Using policies, they designate low-level logs and logs from specific applications as Compliance data.
This data is stored in their S3 bucket with the rest of their data but is never indexed. They can query it directly from Coralogix without needing to pay to keep it in hot storage or worry about retention.
Coralogix enables Yotpo to meet their log storage compliance requirements without needing to index the data, saving them more than $400K a year.
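The policy-based routing described above can be sketched as a small classifier. All names here (`Policy`, `route`, the severity values, and the `audit-trail` application) are hypothetical and do not reflect the Coralogix TCO Optimizer API; the sketch only shows the decision of whether an event is indexed or kept in the archive alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: events matching it are treated as Compliance data."""
    severities: frozenset = frozenset()
    applications: frozenset = frozenset()

    def matches(self, event: dict) -> bool:
        return (
            event.get("severity") in self.severities
            or event.get("application") in self.applications
        )


# Assumed example: low-level logs plus one specific application.
COMPLIANCE = Policy(
    severities=frozenset({"debug", "info"}),
    applications=frozenset({"audit-trail"}),
)


def route(event: dict, policy: Policy = COMPLIANCE) -> str:
    """Every event is archived; only non-compliance events are also indexed."""
    return "archive_only" if policy.matches(event) else "index_and_archive"


print(route({"severity": "debug", "application": "checkout"}))
print(route({"severity": "error", "application": "checkout"}))
print(route({"severity": "error", "application": "audit-trail"}))
```

Because the decision is made at ingestion, the cost of indexing is never incurred for data that only needs to be retained.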