Day 75 — Sending Docker Logs to Grafana
Day 75 of our 90-day journey marks a pivotal moment in our quest to optimize Docker logging by integrating Grafana, a powerful open-source analytics and monitoring solution. This article walks through the process of sending Docker logs to Grafana, with strategies to strengthen your logging infrastructure and improve your system’s observability.
In this blog post, we show how to use Loki to monitor the logs of an application deployed with docker-compose.
A typical Loki stack consists of:
- Loki itself, the log database (this would be the equivalent of Elasticsearch);
- Grafana, the visualization web interface (equivalent of Kibana);
- Promtail, which scrapes log files and ships the logs to Loki (the equivalent of Logstash).
As we will see below, Promtail can also be used to directly scrape logs from Docker containers.
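For reference, a minimal Promtail scrape configuration for discovering containers through the Docker socket could look like the sketch below; this assumes Promtail v2.4 or later and the default Docker socket path, so adjust it to your own setup:
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      # discover running containers via the local Docker daemon
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      # expose the container name (without the leading slash) as a "container" label
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'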
Pre-requisites
- Install Docker and create Docker images for the applications.
In the above screenshot, we have created the notes-app & django-cicd images.
Create two containers from these images:
docker run -d -p 8000:8000 radheyzunjur/devops-project-node-todo-cicd:latest
docker run -d -p 8001:8001 trainwithshubham/my-note-app:latest
Let’s check the total number of containers on the server.
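For example, using the Docker CLI:
docker ps -a
The -a flag lists all containers, including stopped ones.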
1. Configure the data source in Grafana
Add Loki as a data source, as we did in the previous article.
Click on “Save and Test”. You will get a prompt saying “Data source successfully connected”, which confirms that your Loki and Promtail setup is reachable from Grafana.
Now click on Explore to build a query. We will create a query that shows the log lines containing the word docker.
Name -> System Generated Logs
Label filters -> job = varlogs
Line contains -> docker
Click on run query
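With the settings above, the query builder produces a LogQL query roughly equivalent to the one below (this assumes the varlogs label comes from Promtail’s default configuration, which scrapes /var/log/*log):
{job="varlogs"} |= "docker"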
Once you get the output, we can add the result to a new dashboard; give the panel a name before doing so.
Configuring Telegraf
Telegraf is a powerful data collection tool that plays a crucial role in collecting and aggregating metrics and logs for monitoring, observability, and performance analysis. It provides the foundation for building scalable and efficient monitoring solutions in various environments.
Telegraf will be paired with a database: Telegraf collects the Docker metrics (container counts, memory usage, uptime, and so on) and writes them into that database. Here we are using InfluxDB.
- Install Telegraf on the EC2 instance using the apt package manager.
sudo apt install telegraf
2. Check if the telegraf service is running.
sudo systemctl status telegraf
3. Configure Telegraf to collect the Docker metrics by editing the Telegraf configuration file.
sudo vi /etc/telegraf/telegraf.conf
Copy the configuration below:
[[inputs.docker]]
endpoint = "unix:///var/run/docker.sock"
gather_services = false
container_names = []
source_tag = false
container_name_include = []
container_name_exclude = []
perdevice = true
perdevice_include = ["cpu"]
total = false
timeout = "5s"
total_include = ["cpu","blkio","network"]
tag_env = ["JAVA_HOME","HEAP_SIZE"]
4. Restart the telegraf service and check the status of the service.
sudo service telegraf restart
sudo service telegraf status
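If the Docker metrics do not show up later, you can run Telegraf once in test mode to verify that the Docker input can reach the Docker socket (the flags below apply to Telegraf 1.x):
sudo telegraf --config /etc/telegraf/telegraf.conf --input-filter docker --test
If this fails with a permission error on /var/run/docker.sock, add the telegraf user to the docker group (sudo usermod -aG docker telegraf) and restart the service.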
Configuring InfluxDB
By utilizing InfluxDB as the backend storage for Telegraf, we can effectively store and analyze time series data, enabling monitoring, performance analysis, and observability of our systems and applications.
- Install InfluxDB on the EC2 instance using the Ubuntu apt package.
sudo apt install influxdb
2. Open the InfluxDB shell by running the following command in your terminal:
influx
3. Once you are in the InfluxDB shell, execute the following command to create the “telegraf” database:
CREATE DATABASE telegraf
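You can verify that the database was created before leaving the shell:
SHOW DATABASES
exit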
4. Open the Telegraf config file and enable the InfluxDB output so that Telegraf can write to InfluxDB.
sudo vim /etc/telegraf/telegraf.conf
[[outputs.influxdb]]
urls = ["http://54.173.43.117:8086"]
database = "telegraf"
5. Restart the telegraf service to reflect the changes.
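Once Telegraf has restarted, you can confirm that data is flowing into InfluxDB with a one-off query (the InfluxDB 1.x CLI supports the -database and -execute flags):
influx -database telegraf -execute 'SHOW MEASUREMENTS'
The output should include a docker measurement along with several per-container measurements.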
Creating Dashboard
We will create a dashboard with the following panels:
- Total Containers.
- Running Containers.
- Stopped Containers.
- Images.
- Containers memory.
- Containers uptime.
We will build these one by one.
Total Containers
- As the below screenshot shows, make the appropriate settings to reflect the total containers in a stat form.
- Choose the data source as influxdb. Grafana will collect the data from influxdb.
- In the FROM section select docker. This will configure docker to the dashboard.
- In the SELECT section choose n_containers. This will show the total number of containers on the server.
We can choose the colour of the graph.
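For reference, the raw InfluxQL generated by these builder settings should look roughly like the query below; the other stat panels that follow differ only in the selected field (n_containers_running, n_containers_stopped, n_images):
SELECT last("n_containers") FROM "docker" WHERE $timeFilter GROUP BY time($__interval) fill(null)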
Running Containers
As the below screenshot shows, make the appropriate settings to reflect the total containers running in a stat form.
- Choose the data source as influxdb. Grafana will collect the data from influxdb.
- In the FROM section select docker. This will configure docker to the dashboard.
- In the SELECT section choose n_containers_running. This will show the total number of containers running on the server.
A name can be given for this graph in the Panel options.
We can choose the colour of the graph.
Stopped Containers
As the below screenshot shows, make the appropriate settings to reflect the stopped containers in a stat form.
- Choose the data source as influxdb. Grafana will collect the data from influxdb.
- In the FROM section select docker. This will configure docker to the dashboard.
- In the SELECT section choose n_containers_stopped. This will show the total number of stopped containers on the server.
Make the graph red and change the name of the graph.
Images
As the below screenshot shows, make the appropriate settings to reflect the total images on the server in a stat form.
- Choose the data source as influxdb. Grafana will collect the data from influxdb.
- In the FROM section select docker. This will configure docker to the dashboard.
- In the SELECT section choose n_images. This will show the total number of images on the server.
Make the graph blue and change the name of the graph.
Containers memory
As the below screenshot shows, make the appropriate settings to reflect all container’s memory on the server in a stat form.
- Choose the data source as influxdb. Grafana will collect the data from influxdb.
- In the FROM section select docker. This will configure docker to the dashboard.
- In the SELECT section choose field(usage_percent) and last(). This will show each container's most recent memory usage percentage within the selected time range.
- In GROUP BY choose time($__interval)
- tag(container_name::tag)
- fill(null)
- This will group the data by container and show the corresponding values for each one.
- In FORMAT AS choose
- Time series
- ALIAS $tag_container_name
- This will show the tags for the graph. The tags are simply the container names displayed below the graph, indicating which series belongs to which container.
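For reference, the raw InfluxQL behind this panel should look roughly like the query below (depending on your Telegraf version, the memory fields may be reported under the docker_container_mem measurement rather than docker):
SELECT last("usage_percent") FROM "docker" WHERE $timeFilter GROUP BY time($__interval), "container_name" fill(null)
with ALIAS set to $tag_container_name so that each series is labelled with its container name.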
We can beautify the graph using the settings below.
Containers uptime
As the below screenshot shows, make the appropriate settings to reflect all container’s uptime on the server in a stat form.
- Choose the data source as influxdb. Grafana will collect the data from influxdb.
- In the FROM section select docker. This will configure docker to the dashboard.
- In SELECT choose
- field(uptime_ns)
- last()
- alias(uptime_ns)
- This will display the uptime of each container currently running on the server.
- In GROUP BY choose
- tag(container_name::tag)
- This will group the results by container to show each container's uptime.
- In FORMAT AS choose Table to view the details in a tabular form.
Choose the Overrides setting in the panel options, create an override, and choose Fields with name followed by uptime_ns. Then choose Standard options > Unit and select nanoseconds (ns). This will show the uptime in nanoseconds.
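The equivalent raw query for this table should look roughly as follows (on some Telegraf versions the uptime_ns field is reported under the docker_container_status measurement):
SELECT last("uptime_ns") AS "uptime_ns" FROM "docker" WHERE $timeFilter GROUP BY "container_name"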
Final Dashboard
Finally, add all the individual unit graphs and tables to the dashboard.