The Internet of Things (IoT) is an umbrella term for networks of connected devices that share real-time data, and logging is an important part of any IoT system. Troubleshooting bugs, connection problems, and general malfunctions relies heavily on logs, making them an invaluable asset not only in system design but also in system maintenance.
To maximize system potential, this plethora of generated data needs to be managed efficiently. In this post, we’ll look at the different types of logs involved in IoT logging, the different storage options, and some common issues you may face.
Types of Logs
IoT logging has many different flavors. Some logs are asynchronous and only need to be sent when an event occurs, whereas others need to be sent at regular intervals to confirm device uptime. Below are some of the many types of logs involved in IoT logging.
Status logs show the state of the device: whether it is online, offline, transmitting, or in an error state. They give the user a holistic picture of the general state of the device(s). They’re usually stored and sent at frequent, regular intervals.
Error logs are more specific than status logs and should generally trigger an alert for monitoring purposes. Errors mean downtime, and downtime should be avoided. A good error log should provide contextual information such as what caused the error and where it occurred (a particular line of code, for instance). Error logs are usually asynchronous and sent whenever there is an error (provided internet connectivity has not been hindered).
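As a minimal sketch of what such contextual information might look like, the Python snippet below builds an error record from a caught exception, using the standard traceback module to recover the file, line, and function where the error was raised. The field names, the device ID "thermostat-01", and the read_sensor function are all hypothetical and chosen only for illustration.

```python
import json
import traceback
from datetime import datetime, timezone


def build_error_log(device_id, exc):
    """Build a structured error record that says what failed and where."""
    frames = traceback.extract_tb(exc.__traceback__)
    origin = frames[-1] if frames else None  # innermost frame: where the error was raised
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "device_id": device_id,            # hypothetical identifier
        "level": "ERROR",
        "error_type": type(exc).__name__,  # what caused the error
        "message": str(exc),
        "file": origin.filename if origin else None,
        "line": origin.lineno if origin else None,
        "function": origin.name if origin else None,
    }


def read_sensor(raw):
    # Hypothetical sensor read that fails on malformed input.
    return float(raw)


try:
    read_sensor("n/a")
except ValueError as exc:
    record = build_error_log("thermostat-01", exc)
    print(json.dumps(record))
```

Emitting the record as a single JSON object is one possible design choice; it tends to make the log easier to index and alert on than free-form text.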
Authentication logs enable you to see whether registered users are logged in. It may be impractical to store every login attempt (as end-users might log in multiple times a day), but unsuccessful login attempts can be monitored to determine who is trying to gain access to the system or device.
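One way to monitor unsuccessful attempts without storing every login is sketched below in Python. The FailedLoginMonitor class, its threshold of three failures, and its five-minute window are assumptions made for this example, not part of any particular product.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Keep only failed attempts and flag sources that fail repeatedly."""

    def __init__(self, max_failures=3, window_seconds=300):
        self.max_failures = max_failures      # assumed alert threshold
        self.window_seconds = window_seconds  # assumed look-back window
        self._failures = defaultdict(deque)   # source -> timestamps of recent failures

    def record_attempt(self, source, success, timestamp):
        """Return True if `source` has reached the failure threshold."""
        if success:
            return False  # successful logins are not stored
        attempts = self._failures[source]
        attempts.append(timestamp)
        # Discard failures that fall outside the look-back window.
        while attempts and timestamp - attempts[0] > self.window_seconds:
            attempts.popleft()
        return len(attempts) >= self.max_failures


monitor = FailedLoginMonitor()
for t in (0, 10, 20):
    alert = monitor.record_attempt("10.0.0.5", success=False, timestamp=t)
print(alert)  # True: three failures within five minutes
```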
Device attributes are worth tracking in case of future updates and bug fixes. A configuration log helps track the different attributes of various IoT devices. This may not be useful for the end-user, but it could be of vital importance for developers. If the configuration only changes with a software update, then it is worth storing and retrieving configuration logs asynchronously (i.e., with each update or downgrade).
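A simple way to log configuration only when it actually changes is to compare a fingerprint of the current configuration with the last one that was logged. The Python sketch below assumes the configuration is a JSON-serializable dictionary; the function names and the example keys ("firmware", "interval_s") are hypothetical.

```python
import hashlib
import json


def config_fingerprint(config):
    """Hash a canonical JSON form so key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def config_log_if_changed(config, last_fingerprint):
    """Return (log_entry, fingerprint); log_entry is None when nothing changed."""
    fingerprint = config_fingerprint(config)
    if fingerprint == last_fingerprint:
        return None, last_fingerprint
    entry = {"event": "config_changed", "fingerprint": fingerprint, "config": config}
    return entry, fingerprint


# First call logs the configuration; an identical second call logs nothing.
entry, seen = config_log_if_changed({"firmware": "1.2.0", "interval_s": 60}, None)
repeat, seen = config_log_if_changed({"interval_s": 60, "firmware": "1.2.0"}, seen)
print(entry is not None, repeat is None)  # True True
```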
If you have a software crash, a memory dump or crash dump is particularly useful to determine what went wrong and where. In Microsoft Windows terminology, a small memory dump file contains a limited set of information such as the stop message and its data and parameters, a list of loaded drivers, the processor context for the processor that stopped, and so on.
IoT Logging Storage
Given that many of these IoT logging types are needed retroactively, the next question is about where the logs will be stored. You have two options here, local (on-device) storage or cloud storage. Both have their own merits and may be more or less suitable depending on the situation.
On-device storage of logs scales well as far as the number of devices is concerned: each device saves its own logs on local storage, so adding devices puts no extra load on a central system. This also means that each device will need manual intervention if there is downtime or if it runs out of storage space for logs.
Furthermore, retrieving locally stored logs requires a physical connection to a computer or bridge to download or upload the data. This may hurt user perception of the device, and may not be possible if devices cannot be accessed easily or if there are many devices.
Cloud storage is the preferred option if you want immediate feedback and timely information about device status and performance. This approach is more scalable but relies on the existence of a fully functional log management system.
The log management system should be able to aggregate data from many heterogeneous devices transmitting in real time, and then process, index, and store that data in a database that facilitates visualization through charts, dashboards, or other means.
Common Problems with IoT Logging
With many devices transmitting data over potentially unstable connections, guaranteeing a certain level of Quality of Service (QoS) becomes a real challenge. If you cannot get vital information about device downtime promptly, then the QoS rapidly declines. Below are some commonly encountered logging issues that arise with IoT devices.
Lack of internet connectivity is among the most commonly encountered IoT logging issues. There could be many reasons for this, including network congestion, lack of bandwidth, poor wireless signal, and firewall issues. Moving the device to an area with better Wi-Fi strength, upgrading the antenna, and limiting the number of simultaneous connections (for example, through MAC address filtering) can help solve some of these issues.
Log buffering for IoT devices is important, especially in instances when the network drops. Determining the right size for your log buffer is just as important, as it can have serious implications when issues arise. A smaller log buffer saves storage, but will contain fewer log messages which can impact your ability to troubleshoot network issues.
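The buffer-size trade-off can be illustrated with a small Python sketch. The LogBuffer class below is a hypothetical fixed-capacity buffer built on collections.deque: when it is full, the oldest message is discarded to make room, and a counter records how many messages were lost. The send callback stands in for whatever transport the device uses and is assumed to return True on success.

```python
from collections import deque


class LogBuffer:
    """Fixed-size buffer that keeps the most recent messages while offline."""

    def __init__(self, capacity):
        self._entries = deque(maxlen=capacity)  # oldest entry is evicted when full
        self.dropped = 0                        # messages lost to eviction

    def append(self, message):
        if len(self._entries) == self._entries.maxlen:
            self.dropped += 1
        self._entries.append(message)

    def flush(self, send):
        """Send buffered messages oldest-first; stop at the first failure."""
        sent = 0
        while self._entries:
            if not send(self._entries[0]):
                break  # network still down: keep the rest for later
            self._entries.popleft()
            sent += 1
        return sent


buffer = LogBuffer(capacity=3)
for i in range(5):  # five messages arrive while the network is down
    buffer.append(f"msg {i}")
print(buffer.dropped)  # 2: "msg 0" and "msg 1" were evicted
```

With a capacity of three, the two oldest messages are gone by the time connectivity returns, which is exactly the kind of gap that makes troubleshooting harder.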
Latency can have far-reaching consequences, especially when it comes to system maintenance. If a cyclic status message is received a few hours late, it can impact your ability to correctly troubleshoot an issue. To narrow down the cause, the device latency can be calculated by subtracting the server latency from the end-to-end latency. This helps show whether the problem lies with the device or with the server.
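The subtraction described above is sketched below in Python. The function names and the millisecond figures are illustrative assumptions; in practice the end-to-end and server latencies would come from your own timestamps or monitoring tools.

```python
def device_latency_ms(end_to_end_ms, server_ms):
    """Device latency = end-to-end latency - server latency (milliseconds)."""
    if end_to_end_ms < 0 or server_ms < 0:
        raise ValueError("latencies must be non-negative")
    if server_ms > end_to_end_ms:
        raise ValueError("server latency cannot exceed end-to-end latency")
    return end_to_end_ms - server_ms


def slower_side(end_to_end_ms, server_ms):
    """Report which side contributes more of the total delay (ties go to the server)."""
    device_ms = device_latency_ms(end_to_end_ms, server_ms)
    return "device" if device_ms > server_ms else "server"


# Example: 950 ms end to end, of which the server accounts for 120 ms.
print(device_latency_ms(950, 120))  # 830
print(slower_side(950, 120))        # device
```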
IoT logging is a vital part of any system. Its function in system development and debugging cannot be overstated. Using a centrally managed logging system for IoT devices has many advantages and can go a long way towards ensuring device downtime is kept to a minimum.
Coralogix provides a fully managed log analytics solution for all of your IoT logging requirements. Tools like Loggregation for log clustering, benchmark reporting for build quality, and advanced anomaly detection alerts are all features to help you run an efficient and stable IoT system.
Minimal downtime is one of the hallmarks of a great product or service, and Coralogix can help achieve it.