AWS CloudWatch Logs — 101
I’m going to give you a quick summary of CloudWatch Logs that will let you start playing around with it. This is also the first of a new group of posts covering different questions people have about CloudWatch.
AWS CloudWatch is the service in charge of monitoring the status of our applications, databases, Lambda functions, queues, and almost everything else in AWS, and of reacting to those statuses.
Internally, CloudWatch has many sub-services, each dedicated to a specific monitoring task. We can store logs, create dashboards, set alarms, and more. In this post we’ll go through part of those capabilities.
Quickly, before starting… Why should we keep our services monitored?
Building software requires more than functional code. Tools and packages are added to applications every day to support business logic or fix bugs, but beyond the application layers, we should keep in mind that our application needs to communicate what’s happening inside.
At some point, we’ll need to understand, evaluate, and improve the flows in our systems. Monitoring may not be the first thing we think about when we’re building a system, but it’s one of our best options for getting to know our application.
Log Groups and Log Streams
To start diving into AWS CloudWatch, we need to know two terms, Log Groups and Log Streams, which are the basis of every feature related to logs.
AWS groups the logs generated by an application into batches called LOG STREAMS. Each Log Stream holds the log events that come from the same source, and for services such as Lambda the stream name begins with the date on which the stream was opened. Every Log Stream belongs to a Log Group, which is the sum of all the logs in the whole history of an application.
Each Log Group belongs to one application that has logging enabled, so every Lambda function or ECS cluster should have its own unique Log Group name.
For example, if a Lambda function has 10 aliases, you still see a single Log Group, because it is only one resource. Inside it you can see 10 or more Log Streams, since Lambda opens a new stream for each execution environment and labels it with the date and the function version.
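To see the relation between a Log Group and its Log Streams from code, here is a minimal sketch using boto3. It assumes boto3 is installed and AWS credentials are configured; the function name `my-function` in the usage note is a made-up example, and the helper names are mine, not part of any AWS API. Lambda writes to a log group named `/aws/lambda/<function-name>` by default.

```python
from typing import Any, Dict, List


def lambda_log_group_name(function_name: str) -> str:
    """Return the default log group that Lambda writes to for a function."""
    return f"/aws/lambda/{function_name}"


def describe_streams_request(log_group: str, limit: int = 10) -> Dict[str, Any]:
    """Build the arguments for DescribeLogStreams, newest streams first."""
    return {
        "logGroupName": log_group,
        "orderBy": "LastEventTime",
        "descending": True,
        "limit": limit,
    }


def latest_stream_names(function_name: str, limit: int = 10) -> List[str]:
    """List the most recent log streams of a Lambda function.

    Needs boto3 and AWS credentials, so it is imported lazily and the
    helpers above stay usable without it.
    """
    import boto3

    client = boto3.client("logs")
    request = describe_streams_request(lambda_log_group_name(function_name), limit)
    response = client.describe_log_streams(**request)
    return [stream["logStreamName"] for stream in response["logStreams"]]
```

Calling `latest_stream_names("my-function")` should return the names of the newest streams inside that single log group, which is a quick way to confirm that one resource maps to one group with many streams.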
Smart Searching with CloudWatch Logs
Once you know how the logs are grouped and distributed, you can start searching and debugging through them. The search bar in CloudWatch supports several operations:
- Exact Match: Wrapping the text in double quotes ("") searches for that exact phrase within the date range of our choice.
- Optional Match: Putting the ? character before a word makes it optional. Example: ERROR ?404 ?409 retrieves all the logs with the word ERROR along with either 404 or 409.
- Exclude (All/Except): To skip a word in the match, put the - character before it. Example: ERROR -404 -409 looks for all ERROR strings but skips the ones containing 404 or 409.
See the AWS documentation for more examples and filters, such as searching inside a JSON log or creating metrics based on logs.
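The same search patterns can be sent from code. This is a rough sketch, assuming boto3 and configured credentials; `build_filter_pattern` and the other helper names are hypothetical conveniences of mine, and the only AWS call used is `filter_log_events`, which takes the pattern as `filterPattern` and the time range in milliseconds since the epoch.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List


def build_filter_pattern(
    required: Iterable[str] = (),
    optional: Iterable[str] = (),
    excluded: Iterable[str] = (),
) -> str:
    """Join terms into a pattern such as 'ERROR -404 -409'."""
    parts = list(required)
    parts += [f"?{term}" for term in optional]   # ? marks an optional term
    parts += [f"-{term}" for term in excluded]   # - marks an excluded term
    return " ".join(parts)


def to_epoch_millis(moment: datetime) -> int:
    """CloudWatch Logs expects timestamps in milliseconds since the epoch."""
    return int(moment.timestamp() * 1000)


def filter_events_request(
    log_group: str, pattern: str, start: datetime, end: datetime
) -> Dict[str, Any]:
    """Build the arguments for FilterLogEvents."""
    return {
        "logGroupName": log_group,
        "filterPattern": pattern,
        "startTime": to_epoch_millis(start),
        "endTime": to_epoch_millis(end),
    }


def search_logs(log_group: str, pattern: str, start: datetime, end: datetime) -> List[str]:
    """Return matching messages; needs boto3 and AWS credentials."""
    import boto3

    client = boto3.client("logs")
    paginator = client.get_paginator("filter_log_events")
    messages: List[str] = []
    for page in paginator.paginate(**filter_events_request(log_group, pattern, start, end)):
        messages.extend(event["message"] for event in page["events"])
    return messages
```

For example, `build_filter_pattern(required=["ERROR"], excluded=["404", "409"])` produces the exclude pattern from the list above, ready to pass to `search_logs`.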
Improve your Search capabilities with Logs Insights
Logs Insights is an advanced way to search through the logs of different services or applications, all in a single place.
If your services run on EC2, Lambda, ECS, and others, you can select their log groups in Logs Insights and run one general search across everything you chose. Searches here are written in a dedicated query language that looks a bit like SQL.
To check logs in Logs Insights, click on “Logs Insights” in the left sidebar of the AWS CloudWatch console. I’ll give you an example of how to group logs and create a query.
Running Queries:
1. Select the functions to search: Click on “Select log group(s)” and search for all the Log Groups that you want to use. Keep in mind that each log group belongs to a single application/function/instance.
2. Create the query: The Logs Insights query syntax supports different operations such as filters, counts, grouping, and more.
This example counts how many log events contain a given message, grouped per day. You can choose the time range above the query box.
filter @message like /Ignoring webhook/
| stats count() by bin(1d)
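3. More operations: We can apply functions such as substr to the fields returned by the query. In this example, we look for either of two numbers using a regex, and the result includes the timestamp, the log message from character 66 onward with substr(@message,66), and the tail of the @log field (which identifies the log group) with substr(@log,44). Note that substr with a single number returns everything from that position to the end of the string, not the first N characters.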
fields @timestamp, substr(@message,66), substr(@log,44)
| filter @message like /143883|4932956029033/
| sort @timestamp desc
| limit 1000
Check more query examples in the AWS Documentation.
4. Select the time range for your query at the top of the query box. You can go back as far as the data available in the log groups. And RUN YOUR QUERY 🔥
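The four steps above can also be run from code. The sketch below assumes boto3 and configured credentials, and reuses the counting query from step 2; the helper names are mine. Logs Insights queries are asynchronous, so the code starts the query with `start_query` and polls `get_query_results` until it finishes. Unlike the search bar, the time range here is given in seconds since the epoch.

```python
import time
from typing import Any, Dict, Iterable, List

COUNT_QUERY = """
filter @message like /Ignoring webhook/
| stats count() by bin(1d)
"""


def start_query_request(
    log_groups: Iterable[str], query: str, start_epoch_s: int, end_epoch_s: int, limit: int = 1000
) -> Dict[str, Any]:
    """Build the arguments for StartQuery (times are epoch seconds)."""
    return {
        "logGroupNames": list(log_groups),
        "queryString": query.strip(),
        "startTime": start_epoch_s,
        "endTime": end_epoch_s,
        "limit": limit,
    }


def rows_to_dicts(results: List[List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Turn each result row (a list of field/value cells) into one dict."""
    return [{cell["field"]: cell["value"] for cell in row} for row in results]


def run_query(
    log_groups: Iterable[str], query: str, start_epoch_s: int, end_epoch_s: int,
    poll_seconds: float = 1.0,
) -> List[Dict[str, str]]:
    """Run a Logs Insights query and wait for it; needs boto3 and credentials."""
    import boto3

    client = boto3.client("logs")
    request = start_query_request(log_groups, query, start_epoch_s, end_epoch_s)
    query_id = client.start_query(**request)["queryId"]
    while True:
        response = client.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)
    if response["status"] != "Complete":
        raise RuntimeError(f"Query ended with status {response['status']}")
    return rows_to_dicts(response["results"])
```

Passing several names in `log_groups` is the code equivalent of selecting multiple log groups in the console.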
CloudWatch Alarms
Alarms are the way to get notified about what’s happening in our system. Alarms can be connected to SNS topics or even start auto scaling processes, which opens up the whole world of options that SNS offers. But in this case, let’s focus on CloudWatch Alarms.
An alarm is based on a metric. It can be a metric that AWS generates for its services, such as ECS CPU or memory utilization, RDS freeable memory, or RDS connections. We can also create metrics from our own logs by searching our log groups, and then use those metrics in dashboards or alarms.
I strongly suggest learning the basics of SNS to discover all the possibilities of combining CloudWatch and SNS topics.
To create an alarm, we just need to go to the CloudWatch console, click “All alarms” in the left sidebar, and click “Create alarm”. But if we want to create alarms from the logs coming from our apps, we need to create filters: Metric Filters!
Logs + Alarms: Metric Filters
Metric filters allow you to convert a search pattern in CloudWatch Logs into a numeric statistic that you can easily connect to other features such as CloudWatch Alarms or Dashboards. This is how AWS allows us to set metrics and alarms based on “text”.
Metric filters always search a whole Log Group. You cannot set one up to search only a single Log Stream, since a log stream holds only part of the data, while the log group contains all the logs available in AWS for an application.
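To make the idea of turning text into a number concrete, here is a small local model in plain Python. It is only an illustration of the concept, not how AWS implements metric filters: it matches a single plain term, works on made-up sample events, and uses names I invented. Each matching event adds the metric value to its time bucket, and buckets that received logs but no match get the default value.

```python
from typing import Dict, Iterable, Tuple


def metric_from_logs(
    events: Iterable[Tuple[int, str]],
    term: str,
    period_seconds: int = 60,
    metric_value: int = 1,
    default_value: int = 0,
) -> Dict[int, int]:
    """Aggregate (epoch_seconds, message) events into one number per period."""
    matched: Dict[int, int] = {}
    seen = set()
    for timestamp, message in events:
        bucket = timestamp - (timestamp % period_seconds)  # start of the period
        seen.add(bucket)
        if term in message:
            matched[bucket] = matched.get(bucket, 0) + metric_value
    # Periods with logs but no match report the default value.
    return {bucket: matched.get(bucket, default_value) for bucket in sorted(seen)}


sample_events = [
    (0, "INFO start"),
    (10, "ERROR 500 timeout"),
    (30, "ERROR 404 not found"),
    (65, "INFO ok"),
    (130, "ERROR 500 timeout"),
]
print(metric_from_logs(sample_events, "ERROR"))  # {0: 2, 60: 0, 120: 1}
```

The resulting series of numbers is the kind of data an alarm threshold or a dashboard widget can work with, which plain log text cannot offer directly.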
We can also get notified based on a log. This is the way to do it:
1. Go to the Log Group you want to create an alert for and click on “Search All Logs”.
2. Type the text you’re looking for in the search bar, using any of the filters mentioned before.
3. Once you get the results, click on Create Metric Filter.
4. To create the metric, AWS must know how to handle the filter. We can set a unit label and the value to record for every message found. Important fields:
   - Metric value: The number to record each time there is a match.
   - Default value: The value to record when nothing is found; 0 is recommended.
   - Unit: In this case, we can set Megabytes, Kilobytes, Gigabytes, Kilobytes/Second, and more.
5. Click on Create. You can get a list of your metric filters in the Log Groups dashboard.
6. Click on the metric filters link of the log group, then choose the metric filter you just created.
7. Select the metric filter and then click on Create Alarm.
8. Define the threshold of the alarm.
9. Click on Next and choose the SNS topic you’ll use to get notified. If you don’t have an SNS topic, click Create SNS Topic instead of choosing an existing one.
10. Click on Next and set the alarm name and description.
11. Click on “Create Alarm”.
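The console steps above can be approximated with two boto3 calls, `put_metric_filter` and `put_metric_alarm`. This is a sketch under assumptions: boto3 and credentials are available, the SNS topic already exists, and every concrete name (the namespace `MyApp/Logs`, the filter, metric, and alarm names, the topic ARN in the usage note) is a placeholder I made up. I picked `Count` as the unit because the metric counts matching messages.

```python
from typing import Any, Dict

NAMESPACE = "MyApp/Logs"  # hypothetical namespace for the custom metric


def metric_filter_request(log_group: str, metric_name: str, pattern: str) -> Dict[str, Any]:
    """Arguments for PutMetricFilter: record 1 per match, 0 otherwise."""
    return {
        "logGroupName": log_group,
        "filterName": f"{metric_name}-filter",
        "filterPattern": pattern,
        "metricTransformations": [{
            "metricName": metric_name,
            "metricNamespace": NAMESPACE,
            "metricValue": "1",     # the API expects the value as a string
            "defaultValue": 0.0,    # reported when nothing matches
            "unit": "Count",
        }],
    }


def alarm_request(alarm_name: str, metric_name: str, sns_topic_arn: str,
                  threshold: float = 1.0) -> Dict[str, Any]:
    """Arguments for PutMetricAlarm: notify SNS when the per-minute sum hits the threshold."""
    return {
        "AlarmName": alarm_name,
        "AlarmDescription": f"{metric_name} reached {threshold} in one minute",
        "Namespace": NAMESPACE,
        "MetricName": metric_name,
        "Statistic": "Sum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


def create_log_alarm(log_group: str, metric_name: str, pattern: str,
                     alarm_name: str, sns_topic_arn: str) -> None:
    """Create the metric filter and the alarm; needs boto3 and credentials."""
    import boto3

    boto3.client("logs").put_metric_filter(
        **metric_filter_request(log_group, metric_name, pattern))
    boto3.client("cloudwatch").put_metric_alarm(
        **alarm_request(alarm_name, metric_name, sns_topic_arn))
```

A call such as `create_log_alarm("/aws/lambda/my-function", "error-count", "ERROR -404 -409", "too-many-errors", "arn:aws:sns:us-east-1:123456789012:alerts")` would cover steps 1 to 11 in one go, with the threshold left at its default of 1.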
As you can see, there are many ways to get notified about changes in your system and to trigger whatever you want in your AWS infrastructure. Once CloudWatch sends the alarm to SNS, we can start Lambda functions, worker processes, etc.
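As one example of reacting to an alarm, here is a minimal sketch of a Lambda function subscribed to the SNS topic. It assumes the standard SNS event shape, where each record carries the CloudWatch alarm notification as a JSON string in `Sns.Message`; the handler name and what it does with a firing alarm (just printing) are placeholders for your own logic.

```python
import json
from typing import Any, Dict, List


def parse_alarm_records(event: Dict[str, Any]) -> List[Dict[str, str]]:
    """Extract name, state, and reason from each alarm notification in the event."""
    alarms = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])  # the alarm payload is JSON text
        alarms.append({
            "name": message["AlarmName"],
            "state": message["NewStateValue"],
            "reason": message["NewStateReason"],
        })
    return alarms


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Lambda entry point: act only on alarms that moved to the ALARM state."""
    firing = [alarm for alarm in parse_alarm_records(event) if alarm["state"] == "ALARM"]
    for alarm in firing:
        # Replace this print with the real reaction (restart a worker, open a ticket...).
        print(f"Alarm {alarm['name']} fired: {alarm['reason']}")
    return {"handled": len(firing)}
```

Filtering on the state matters because SNS can also deliver the notification sent when the alarm goes back to OK, if the alarm is configured with OK actions.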
Keys:
- AWS CloudWatch allows monitoring by providing log storage, alarms, dashboards, and scheduled actions (we’ll cover this soon).
- Log Streams are batches of logs. Log Groups keep the logs of one application together.
- We can filter our logs based on different rules over a Log Group.
- Logs Insights allows you to search the logs of different services and applications using a query language.
- Alarms can be set up based on logs by using Metric Filters.
- Alarms can trigger system actions like scaling processes.
Thank you for reading this post on AWS CloudWatch Logs! We hope you found it informative and helpful. Stay tuned for more posts on monitoring and other AWS topics in the future. Goodbye for now!