Whether it’s Apache, Nginx, IIS, or anything else, web servers are at the core of online services, and web log analysis can reveal a treasure trove of information. These logs may be spread across many files on disk, split by HTTP status code, timestamp, or agent, among other possibilities. Web access logs are typically analyzed to troubleshoot operational issues, but there is much more insight that you can draw from this data, from SEO to user experience. Let’s explore what you can do when you really dive into web log analysis.
Internet traffic now exceeds 333 exabytes per month, and it has grown year on year since the Internet’s early days. With this increase in traffic comes increased complexity in operational observability. Your web access logs are crucial in the fight to maintain operational excellence. While the details vary, some fields you can expect in most web logs include the client IP address, the timestamp of the request, the requested URI, and the HTTP status code.
These fields are fundamental to building a clear picture of how your site is operating. You can use them to spot abnormal traffic arriving at your site, which may indicate malicious activity like “bad bot” web scraping. You could also detect an outage on your site by looking for a sudden increase in error responses among your HTTP status codes.
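As a rough illustration of the outage check described above, the sketch below parses lines in the Combined Log Format (which Nginx writes by default and Apache commonly uses) and computes the share of 5xx responses per minute. The regular expression, the function names, and the sample lines are assumptions made for illustration; adjust the pattern to whatever format your servers actually write.

```python
import re
from collections import Counter

# Combined Log Format: ip, identity, user, [time], "request", status, size,
# "referrer", "user agent". Custom formats will need a different pattern.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def error_rate_by_minute(lines):
    """Map each minute (e.g. '10/Oct/2023:13:55') to its share of 5xx responses."""
    totals, errors = Counter(), Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines in an unexpected format
        minute = entry["time"][:17]  # keep 'dd/Mon/yyyy:HH:MM'
        totals[minute] += 1
        if entry["status"].startswith("5"):
            errors[minute] += 1
    return {minute: errors[minute] / totals[minute] for minute in totals}


# Hypothetical sample lines using documentation-only IP addresses.
SAMPLE = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2023:13:55:40 +0000] "GET /api/cart HTTP/1.1" 502 0 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2023:13:56:02 +0000] "GET /about HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

print(error_rate_by_minute(SAMPLE))
# {'10/Oct/2023:13:55': 0.5, '10/Oct/2023:13:56': 0.0}
```

A sudden jump in this ratio from one minute to the next is the kind of signal worth alerting on; the right threshold depends on your own traffic.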
68% of online activity begins with a user typing something into a search engine. This means that if you’re not harnessing the power of SEO, you’re potentially missing out on a massive volume of traffic and customers. Despite this, almost 90% of content online receives no organic traffic from Google. A search-optimized site represents a serious edge in the competitive online market. Web access logs can give you insight into several key SEO dimensions, providing clear measurements of how well your SEO campaigns are working.
42.7% of online users are currently using an ad-blocker, which means that you may see serious disparities between the traffic to your site and the impressions you’re seeing on the PPC ads that you host. Because web access logs are recorded on the server side and don’t depend on the client’s machine to track usage, analyzing them gives you a clear overall view of the traffic you’re receiving and makes this disparity easy to spot.
You can also check the IP addresses connecting to your site to determine whether Googlebot is crawling it. This is crucial because it tells you not just whether Googlebot is present but also which pages it has accessed, using a combination of the URI field and IP address in your web access logs.
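One way to approach this, sketched below, is to collect the URIs requested by clients whose User-Agent claims to be Googlebot, then confirm each IP address with a reverse DNS lookup followed by a forward lookup. The function names and the log pattern (Combined Log Format) are assumptions for illustration. The DNS check needs network access, and because a User-Agent string is easy to spoof, it should not be trusted without it.

```python
import re
import socket
from collections import Counter

# Assumes Combined Log Format; only the ip, uri and agent fields are captured.
REQUEST_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?P<uri>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def claimed_googlebot_hits(lines):
    """Return {ip: Counter of URIs} for requests whose User-Agent mentions Googlebot."""
    hits = {}
    for line in lines:
        match = REQUEST_PATTERN.match(line)
        if match and "Googlebot" in match["agent"]:
            hits.setdefault(match["ip"], Counter())[match["uri"]] += 1
    return hits


def is_verified_googlebot(ip):
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm.

    Requires network access; returns False when any lookup fails.
    """
    try:
        host = socket.gethostbyaddr(ip)[0]
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # The forward lookup must map the hostname back to the same address.
        addresses = {info[4][0] for info in socket.getaddrinfo(host, None)}
    except OSError:
        return False
    return ip in addresses
```

Calling `claimed_googlebot_hits(lines)` on your log lines and keeping only the IPs for which `is_verified_googlebot(ip)` returns `True` leaves a per-IP count of the pages Googlebot actually fetched.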
Your web access logs can also give you insight into how your site performs for your users. This is less the operational challenge of keeping the site functional and more the marketing challenge of keeping it snappy, which is vital. Users form an opinion of your site in the first 50ms, and if all they see is a white loading page, they’re not going to draw favorable conclusions.
The bounce rate increases sharply with increased latency. If your page takes 4 seconds to load, you’ve lost 20% of your potential visitors. Worse, those users will view an average of 3.4 fewer pages than they would if the site took 2 seconds to load. Every second makes a difference.
Your web access logs are the authority on your latency because they can record the duration of each HTTP request from the server’s point of view. You can then combine these values with more specific measurements, like the latency of a database query or disk I/O. By optimizing for these values, you can ensure that you’re not losing customers to slow responses.
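To make this concrete, here is a minimal sketch that summarizes request durations. It assumes your log format has been configured so that each line ends with the request duration in seconds (for example, by appending Nginx’s `$request_time` variable to the `log_format`); the default formats do not include it. The function names are hypothetical.

```python
import statistics


def request_durations(lines):
    """Read the trailing field of each line as a duration in seconds."""
    durations = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # blank line
        try:
            durations.append(float(fields[-1]))
        except ValueError:
            continue  # line without a numeric trailing field
    return durations


def latency_summary(durations):
    """Return the median, 95th percentile and maximum of the durations."""
    if len(durations) < 2:
        raise ValueError("need at least two requests to compute percentiles")
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    cuts = statistics.quantiles(durations, n=100, method="inclusive")
    return {
        "p50": statistics.median(durations),
        "p95": cuts[94],
        "max": max(durations),
    }
```

Percentiles are used here rather than the mean because a handful of very slow requests can hide behind a healthy-looking average.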
Your web access logs may also give you access to the User-Agent header. This header can tell you the browser and operating system that initiated the request. This is essential because it will give you an idea of your customers’ devices and browsers. 52.5% of all online traffic comes from smartphones, so you’re likely missing out if you’re not optimizing for mobile usage.
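As a rough sketch of how you might estimate your own mobile share, the code below buckets User-Agent strings with a few substring checks. These checks are a deliberately simple heuristic chosen for illustration, not a complete parser, and the function names are hypothetical; a maintained User-Agent parsing library will be more accurate in practice.

```python
from collections import Counter


def device_class(user_agent):
    """Classify a User-Agent string as bot, mobile, tablet or desktop (heuristic)."""
    ua = user_agent.lower()
    if "bot" in ua or "spider" in ua or "crawl" in ua:
        return "bot"
    if "mobi" in ua:  # mobile browsers generally include "Mobile" or "Mobi"
        return "mobile"
    if "ipad" in ua or "android" in ua or "tablet" in ua:
        return "tablet"
    return "desktop"


def device_share(user_agents):
    """Return each device class's share of the given requests."""
    counts = Counter(device_class(ua) for ua in user_agents)
    total = sum(counts.values())
    return {device: count / total for device, count in counts.items()}
```

Feeding `device_share` the User-Agent field from every request in a day’s logs gives a quick view of how much of your traffic would benefit from mobile optimization.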
Web access log analysis is one of the fundamental pillars of observability. However, the true challenge isn’t simply viewing and analyzing the logs but getting all of your observability data into one place so that you can correlate it. Your Nginx logs are powerful, but if you combine them with your other logs, metrics, and traces from CDNs, applications, and more, they form part of an observability tapestry that can yield actionable insights across your entire system.