Case Study

How Taylor Stitch reduced production errors by 100x

97%

Decrease in avg. daily errors

24X

Faster resolution times

10%

Dev time saved monthly

taylorstitch.com

About the company

Over the past 10 years, Taylor Stitch has grown from a small startup in San Francisco to a fashion brand adored by people looking for well-fitted, functional, high-quality men’s clothing that is produced with a deep commitment to environmental responsibility.

The no-nonsense style of the clothing line is produced with sustainable materials ranging from organic fibers like merino to recycled and regenerative fabrics made from plastic bottles and upcycled cast-offs.

In 2018 alone, the company saved over 35 million gallons of water through its choice of sustainable fabrics. As the company continues on its growth trajectory, it is exploring more ways to connect with, collaborate with, and delight customers, and it has chosen Coralogix to help its engineering team achieve that mission.

Overview

As a growing online retail presence, the Taylor Stitch website is constantly increasing in complexity. When it became clear that the customer experience was suffering due to increasing errors between various services, the Taylor Stitch engineering team picked Coralogix to enhance their CI/CD pipeline and immediately began improving results. After only 2 months, the team was able to shrink their average daily errors by 97%.

The Challenge

The founding Taylor Stitch vision had paid off with a growing base of dedicated customers who expect an ever-delightful online retail experience. Maintaining existing functionality while adding to the experience was becoming increasingly difficult, a problem made worse by the lack of a comprehensive view of production issues.

Like many other online retailers, Taylor Stitch runs on the Shopify e-commerce platform, which is supported by an active third-party marketplace of apps that extend Shopify’s native abilities.

Rick Davies, Taylor Stitch’s CTO, and his team have come to rely on a number of Shopify apps to manage the full experience, from inventory and fulfillment to customer experience, marketing, and business analytics.

As more apps were added to the mix and customized, the company faced increasing levels of errors with the dev team spending a lot of their time debugging and fixing issues rather than building new value. “We needed to constantly create custom apps to patch conflicts between existing third-party apps,” Rick says.

The team knew they were dealing with a list of known issues but were also concerned about unknown problems that crept up unexpectedly. The volume of mounting issues made it close to impossible to manually investigate one-by-one, or know how to prioritize the growing list of issues. Using older logging technology required the team to know exactly where the problems were rooted before extracting any insights and didn’t go far enough in improving the workflow.

Taylor Stitch concluded that they didn’t have the bandwidth to support the dynamic nature of the business while also maintaining a growing mix of apps and services that needed to work seamlessly together. Rick recalls, “We were in an almost constant fire-drill mode.”

Meanwhile, the issues kept mounting: the various services caused customer friction, and more support time was needed for remediation. One telling example Rick gives is that customers were inadvertently purchasing out-of-stock merchandise because of syncing issues among three of the core applications.

These and other issues were difficult to detect and troubleshoot with the basic log analysis capabilities they had access to at the time.

The Solution

The team’s initial stab at the problem started with improving visibility into the production environment. The team implemented a well-known SaaS log management solution to help manage their log data, but it fell short of the team’s expectations for a modern toolset.

“While it provided a good level of search and alerting capabilities, its analysis capabilities were lacking, like the data visualizations, anomaly detection, and log aggregation features available in Coralogix. We did look at other logging solutions but they were either lacking important features or were significantly more expensive. After narrowing down the options between Coralogix and a few others, Coralogix was really the obvious choice,” Rick says.

Coralogix immediately shifted the team into a more proactive approach. It helped the team better understand the causes behind production errors and provided the metrics needed to prioritize the growing list of errors.

Rick explains, “Only a few days into using Coralogix, when these features kicked into action, we quickly understood the list of issues we needed to address.”

Taylor Stitch uses Coralogix’s simple S3 integration to archive all of its logs indefinitely and reindex them when needed in a matter of minutes. One of Taylor Stitch’s pain points with its previous solutions was slow queries and response times. Coralogix serves its interface through Amazon CloudFront, which keeps the experience fast and supports Coralogix’s mission to provide faster queries for any time range.

The Impact

Even after having cleared out most of the team’s backlog of bugs, Rick explains that there were always new issues creeping up because of third-party app updates. One example was when the app responsible for customer rewards stopped updating balances correctly upon customer returns – leaving them with fewer points than deserved.

Had it not been for Coralogix, the team would almost certainly have learned of the problem only if a customer had noticed and taken the time to reach out, at which point the negative business effects would already have been widespread.

Instead, the team was able to nip the problem in the bud because they were notified by Coralogix’s anomaly detection within 24 hours of their last deployment. The Coralogix ML-assisted technology intelligently tracked when the system component misbehaved without having to set up alerts beforehand.

“Coralogix anomaly detection alerted me that our system deviated from normal behavior. If I look at logs coming through, it’s almost impossible to see an error unless I know what it is I’m looking for, but Coralogix is pulling out things without me necessarily searching for them, and emailing me within minutes of it happening,” says Rick.

He adds, “I now feel like I don’t need to worry as much because I know I’m going to be alerted to what needs fixing the moment it becomes an issue.”
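To make the idea of "deviating from normal behavior" concrete, here is a minimal sketch of baseline-based detection. It is not Coralogix's algorithm, which the case study does not describe; it assumes hourly error counts for one component and flags a count that sits far above the recent average. The function name `is_anomalous`, the `threshold` value, and the sample counts are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, current, threshold=3.0):
    """Return True if `current` is far above the baseline built from `history`.

    Illustrative only: a simple z-score check, assumed for this sketch.
    `history` is a list of past error counts (e.g. one per hour).
    """
    if len(history) < 2:
        # Not enough data to estimate a baseline.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat baseline: any increase counts as a deviation.
        return current > mu
    return (current - mu) / sigma > threshold


# Hypothetical hourly error counts for a rewards app before a deployment.
baseline = [4, 5, 6, 5, 4, 6, 5]
spike_detected = is_anomalous(baseline, 40)
normal_hour = is_anomalous(baseline, 6)
```

The point of the sketch is that no error-specific alert rule is written in advance: the baseline is learned from past counts, so a new kind of failure can still stand out.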

Rick Davies
CTO

I feel like I don’t need to worry as much now because we can see what needs fixing.

Summary

Having to deal with a large number of errors in their logs, Taylor Stitch’s engineers were having a hard time prioritizing issues. Coralogix solves the issue of having too many logs with Loggregation – a machine learning algorithm that instantly groups and aggregates logs based on their shared “templates”.

This new aggregated view gave the team a quick way to see which errors were significant relative to the others, allowing them to focus on solving the issues with the greatest business impact.

As Rick puts it, “Because we were able to see the aggregated occurrences of errors for each log template, we understood which errors were significant relative to the other errors.”
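The idea of grouping logs by shared templates and counting occurrences can be illustrated with a small sketch. This is not Loggregation itself, which is described here only as a machine learning algorithm; the sketch assumes a much cruder rule (replace every run of digits with a placeholder) and uses made-up log lines. The names `to_template` and `aggregate` are hypothetical.

```python
import re
from collections import Counter

# Assumed masking rule for this sketch: any run of digits is a variable part.
_NUMBER = re.compile(r"\d+")


def to_template(line):
    """Reduce a log line to a template by masking its variable parts."""
    return _NUMBER.sub("<NUM>", line)


def aggregate(lines):
    """Count log lines per template, most frequent template first."""
    return Counter(to_template(line) for line in lines).most_common()


# Made-up log lines, for illustration only.
sample_logs = [
    "Inventory sync failed for SKU 1042",
    "Inventory sync failed for SKU 2210",
    "Rewards balance update timeout after 30 ms",
    "Inventory sync failed for SKU 77",
]

for template, count in aggregate(sample_logs):
    print(count, template)
```

Ranking templates by count is what lets a team see at a glance which error accounts for most of the volume, rather than reading individual lines one by one.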

With Coralogix as a guide, the Taylor Stitch engineering team was able to reduce its average daily errors by 97%, leaving a lot more free time to invest in adding new value.

Rick was happy to report that “after 6 months we significantly improved the accuracy and efficiency of our networked applications. We’re fixing things we didn’t even know were wrong. We can now catch and fix issues much faster. It makes us more productive and frees up resources to work on new and improved business applications.”

Aside from building more cool stuff with the free time saved with Coralogix, Rick plans on taking a well-deserved vacation!

As the Taylor Stitch brand continues to grow, so will the demands placed on the engineering team. Coralogix will continue to be there to monitor and troubleshoot whatever challenge lies ahead.
