Imagine this: You have your services up and running in production. Then you get a call. Something’s broken in production.
What’s the first thing you do? Where do you start so you can figure out if your services are in good working order?
Logs. The logs are your connection to your services. They should tell you what’s going right and what’s going wrong.
In this post, we’ll go over what logs are, why they matter, and most importantly, why you probably need a service to offload the burden of managing those logs.
What Is Logging?
Before going further, let’s be clear on what logging is, why it’s necessary, and how to make it as useful as possible.
Each component in your environment has, or should have, some form of communication. This is (in general) some sort of log. These logs should help communicate what the service is doing and provide a heartbeat that lets you know whether the service or component is doing its job.
Also, the logs should communicate whether the problem is the service itself. For example, is the service out of memory? Is it out of disk space?
Another problem that logs should communicate is whether the service is failing on an outside dependency. For example, maybe a service it’s trying to communicate with is either returning errors or returning corrupt data.
Whether the service is part of your cloud environment or not, logs are an important part of your system. It’s vital to make sure that logs in your services communicate useful information and help you diagnose problems quickly. With your services, make sure your logs are formatted in a way that allows someone or something to extract data quickly.
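To make "formatted in a way that allows someone or something to extract data quickly" a bit more concrete, here is a minimal sketch using Python's standard `logging` module to emit one JSON object per log line. The field names (`service`, `disk_free_mb`, `mount`), the `fields` convention for extra data, and the `checkout` logger are assumptions made for illustration, not a format any particular logging service requires.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            # "service" and "fields" are hypothetical conventions passed via `extra=`.
            "service": getattr(record, "service", "unknown"),
            "message": record.getMessage(),
        }
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


# Write to an in-memory buffer here; a real service would log to stdout or a file.
buffer = io.StringIO()
log = build_logger("checkout", buffer)
log.warning(
    "disk space low",
    extra={"service": "checkout", "fields": {"disk_free_mb": 412, "mount": "/var/data"}},
)
print(buffer.getvalue().strip())
```

Because every line is a self-describing JSON object, a person or a tool can filter on `level` or `disk_free_mb` directly instead of parsing free-form text.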
Why Would I Want a Logging Service?
Now that you have a source of logs, you or someone else has to aggregate them into a central place so an operator can extract information from them. Homegrown tools and processes for this tend to grow over time, and they often end up undermaintained and flaky.
This is where logging as a service comes in. Logging as a service helps you get your logs into a central location. Also, it lets you quickly sift through those logs and find the signals among all the noise. This service makes finding the information you need quick and easy.
Other Benefits From a Centralized Logging Service
Another benefit of putting your logs into a central logging service is that you can control who and what can read them. When logs are scattered across different locations, each with its own method of retrieval, it’s hard to figure out who and what can access them because there are so many ways to reach them. However, with a central logging service, you can take advantage of the single location and lock down those logs accordingly. Also, if your industry has compliance rules, you can ensure that you are following those rules in this service. For instance, Scalyr is ready to help you with GDPR.
Another benefit of a centralized logging service is consistency across your services. If you take advantage of a microservice architecture, you will want to ensure that your logging is consistent and uniform, so that when you have to review logs, they’re easy to decipher across services. You will need the right tools and libraries in place to achieve that.
One last benefit that you get from a centralized logging service is reliability. Because this service is dedicated to taking in logs and archiving them, it can focus on doing that one job well. This allows it to be scaled and made redundant as appropriate for your logging needs. Instead of maintaining another service yourself and spending precious time keeping it healthy, you can offload that burden to a centralized service and focus on the other jobs at hand.
What Should I Look for in a Service?
What makes a good logging service? Well, you want to know that if you put your logs into this service, you can get the data you need out of them as soon as possible and without much difficulty. You have three pillars of observability to cover: logs, metrics, and tracing.
- Why do logs matter? They’re your source of information. You want to be able to view the logs and inspect them for information to help you solve whatever problem you’re facing. Scalyr has an excellent search mechanism that lets you slice and dice your logs. This way, you can find the information you need. Plus, since logs don’t have indexes, you don’t have to worry about the issue of adding fields to an index. Everything is searchable when you ingest them.
- What’s so important about metrics? With metrics, you want to be able to analyze how your system is performing. You can usually use this performance data as a “canary in a coal mine,” in other words, an early warning sign that a system is in trouble. If your metrics cross certain thresholds, then logging as a service can alert you before your customers even notice. Scalyr can help you set up alerts on those thresholds and notify you through a number of different methods.
- Why should I be concerned about tracing? It helps you track down errors when they happen. As you build up services, and especially if you use a microservices architecture, you want to be able to see the flow of information or requests through your system. You need to be able to identify when a request went bad in your system so you can react and fix the problem.
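As a rough illustration of the tracing idea, the sketch below stamps every log line written while handling a request with the same request ID, so you can follow one request through the logs. It uses only Python's standard library. The names `request_id_var`, `RequestIdFilter`, `handle_request`, and the `orders` logger are hypothetical, and a real system would typically pass the ID between services in a request header rather than as a function argument.

```python
import contextvars
import io
import logging
import uuid
from typing import Optional

# Holds the ID of the request currently being handled (hypothetical name).
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every record so the formatter can print it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def handle_request(logger: logging.Logger, incoming_id: Optional[str] = None) -> str:
    """Reuse the caller's ID if one was passed along, otherwise start a new one."""
    request_id = incoming_id or uuid.uuid4().hex
    token = request_id_var.set(request_id)
    try:
        logger.info("request received")
        logger.info("calling inventory service")
    finally:
        # Restore the previous value so IDs never leak between requests.
        request_id_var.reset(token)
    return request_id


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s request_id=%(request_id)s %(message)s"))
handler.addFilter(RequestIdFilter())

trace_logger = logging.getLogger("orders")
trace_logger.setLevel(logging.INFO)
trace_logger.handlers.clear()
trace_logger.addHandler(handler)
trace_logger.propagate = False

handle_request(trace_logger, incoming_id="abc123")
print(stream.getvalue())
```

With the ID on every line, searching your central log store for `request_id=abc123` returns the whole path of that request, including the point where it went bad.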
Do I Log Everything, or Do I Try to Be Specific?
You want to log as much as you need without being verbose. If you take this approach, when there’s a problem, you can quickly identify it and solve it. This usually requires you to at least make sure your services have the correct logging information.
Also, you’ll want information on your platform, which means you’ll have to tap into its information streams to find out whether the platform is healthy and performing correctly. Most platforms out there have some kind of logging mechanism that you can monitor for this.
For instance, if you’re running on Kubernetes, you would want to capture the logs from the Kubernetes system itself. This way, those logs can help you diagnose whether the problem exists at your service layer or at the platform level. For example, you could capture the logs when pods are created and destroyed to verify that resources are being allocated correctly.
Another log source from Kubernetes relates to whether persistent disk mounts are being created and attached correctly to pods. If disk problems start to arise, you receive an alert, and you can take action and resolve them before a bigger problem arises.
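Here is one way the disk-mount alert described above might look, as a small sketch rather than a real integration. It assumes your Kubernetes events have already been collected and parsed into dictionaries with `pod` and `reason` keys, which is a simplified, hypothetical shape. `FailedMount` and `FailedAttachVolume` are event reasons Kubernetes reports for volume problems. The threshold of 3 and the pod names are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Event reasons that Kubernetes reports when a volume cannot be mounted or attached.
VOLUME_FAILURE_REASONS = {"FailedMount", "FailedAttachVolume"}


def pods_over_threshold(events: Iterable[Dict[str, str]], threshold: int = 3) -> List[str]:
    """Return pods whose count of volume-failure events meets or exceeds the threshold."""
    failures = Counter(
        event["pod"] for event in events if event.get("reason") in VOLUME_FAILURE_REASONS
    )
    return sorted(pod for pod, count in failures.items() if count >= threshold)


# Hypothetical, already-parsed events; a real feed would come from your log pipeline.
sample_events = [
    {"pod": "db-0", "reason": "FailedMount"},
    {"pod": "db-0", "reason": "FailedMount"},
    {"pod": "web-2", "reason": "Scheduled"},
    {"pod": "cache-1", "reason": "FailedAttachVolume"},
    {"pod": "db-0", "reason": "FailedMount"},
]

for pod in pods_over_threshold(sample_events):
    print(f"ALERT: repeated volume failures on pod {pod}")
```

A logging service would normally evaluate a rule like this for you over a time window. The point of the sketch is only that platform events become actionable once they are captured and counted in one place.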
But What About the Costs?
One of the biggest sources of resistance to a centralized logging service is cost. Just like moving to the cloud, the monthly bill can be shocking at times. But when you move to a logging service, you can offload some of those costs and benefit from the scale that the service provides.
For example, instead of you having to figure out how to scale your infrastructure to meet your logging needs, the logging service already has that for you. You can focus on providing value to your customer instead of having to maintain other systems that aren’t directly impacting your customers.
Conclusion
Logs are your window into your system. You need logs to get a sense of what’s going on and what problems may need to be solved. If there’s a problem, you need to be able to recover from it as quickly as possible.
Now, you could monitor the logs yourself, but your time is far too important to reinvent the wheel. There are services out there that can make logging a breeze. This blog post has hopefully outlined what makes logging as a service a worthwhile investment. I recommend taking Scalyr for a free spin.
Logging as a service is a cost-effective way to let you keep an eye on your platform. It can also help you recover faster. It can even help you react faster to events that need manual intervention. This kind of service is usually the backbone of your monitoring solutions, and it’s the first place to go when you need to reach for data when things go wrong. Make sure to invest correctly in a well-rounded system, and you’re likely to see dividends.