Effective monitoring shouldn’t require complex infrastructure. In this guide, Noveo Senior Developer Andrey walks through setting up Grafana’s monitoring stack with Docker—starting with Loki for centralized logs. Whether you’re debugging an application or looking for better system visibility, this practical approach balances simplicity and power, even for smaller deployments. Let’s break it down step by step.
Intro
The Grafana monitoring stack, consisting of Alloy, Loki, Prometheus, and Tempo, is a modern distributed monitoring system written in Go, intended for collecting monitoring data from backend applications running on one or more servers.
It is also usable for monitoring mobile and desktop applications (this depends on the level of OpenTelemetry support for your chosen language, at least if we go with OTLP; for example, OpenTelemetry for Java reports a Stable level for Traces, Metrics, and Logs, so full support is available for Android Java apps), and to some degree for client-side monitoring of web frontends too.
__________
Note
This series of articles is written with the assumption that we will use the OpenTelemetry protocol as the main one, at least for tracing-related activities (and press it into service for metrics and logs too where necessary). Do know, however, that Grafana Alloy, the agent that collects traces, logs, and metrics, supports plenty of alternative protocols, and your language's support for one of them may well be better than its OpenTelemetry support.
__________
We will walk through configuring this monitoring stack with a Docker-based approach, aimed at homelabs and at companies that keep their infrastructure simple.
The article aims to make this monitoring system accessible to a large number of people (for their homelabs and basic production setups), and for this reason we go with the Docker approach instead of the Kubernetes one.
If you run a serious high-load production system, it is better to run Grafana/Loki/Mimir (instead of Prometheus)/Tempo in Kubernetes, since its ecosystem of Helm charts already makes it easy to run them in a horizontally scalable way that can take on much larger workloads.
The article will dive into configuring the monitoring with Docker Compose and OpenTofu (Terraform).
When in doubt, check the Terraform-related code in the infra repo as the source of truth, since it is the version I run for my homelab.
It is worth configuring this distributed monitoring stack even if you have only one backend application running on your servers (or even only a mobile app). Well-configured monitoring makes debugging your application significantly easier, and a well-configured logging backend lets you filter data by any key/value in the log records. It is even possible to build graphical dashboards based on logging information alone to get an overview of what matters!
__________
Note
Grafana Loki became significantly more pleasant to use with the introduction in 2024 of the new Drilldown interface, which simplifies navigation considerably.
The old "Explore" interface still has some use cases that the new Drilldown interfaces do not cover yet, but the gap is closing quickly, and for the logging part I believe there is no longer much justification for opening the old "Explore" interface.
__________
Tip
I recommend investing properly in other forms of monitoring, such as metrics: they let you keep an eye on the healthy functioning of your application in a highly performant way and simplify investigating problems introduced by your next deployments.
Plenty of open-source solutions provide metrics out of the box for any type of infrastructure object.
Depending on your application's needs, it is also a good idea to invest in tracing for deeper transparency into its performance problems.
Configuration beyond Loki will be covered in separate follow-up articles to keep the current article at a reasonable length.
__________
Tip
We can build graphical dashboards based on logs alone!
It is not as efficient as using metrics, but it is possible, works as a last resort, and is good enough for low-load systems.
__________
Configuration
Getting a server
You need to get a Linux server for deployment somewhere (it can be your own bare-metal server, or a VPS rented from a cloud provider).
I can recommend a Hetzner server, since the provider is very minimalistic and high quality, with quite low prices.
Its Arm64 server prices look like a killer feature to me.
A CAX21 server (arm64, 4 vCPU, 8 GB RAM) should be more than enough, overkill even, for our homelab example purposes. You can squeeze things into a CAX11 (arm64, 2 vCPU, 4 GB RAM) if desired, but be mindful: preferably turn on swap just in case, as a fallback to handle the workload of everything we put in at the start. (Using swap is not recommended for production at all, but for a low-load homelab it should be fine.)
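If you do go with the smaller CAX11 and want swap as that safety net, a common way to add a 2 GB swap file on Ubuntu looks roughly like this (size and path are just an illustration):

# Run on the server as root: create, protect, format, and enable a 2 GB swap file.
fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# Persist it across reboots.
echo '/swapfile none swap sw 0 0' >> /etc/fstab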
OpenTofu (Terraform) code is provided to configure things as infrastructure as code.
See this link for up-to-date code in case the article becomes outdated.
module "node_darklab_cax21" {
source = "../modules/hetzner_server"
name = "darklab"
hardware = "cax21"
backups = true
ssh_key_id = module.ssh_key.id
datacenter = "hel1-dc2"
}
It utilizes code from this folder:
https://github.com/darklab8/infra/tree/master/tf/modules/hetzner_server
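Assuming you have OpenTofu installed and a Hetzner API token at hand (the hcloud provider can pick it up from the HCLOUD_TOKEN environment variable), applying such a module looks roughly like this; the directory name is only an assumption about the repo layout:

export HCLOUD_TOKEN="your-hetzner-api-token"   # placeholder
cd tf/production                               # assumed directory containing the module call above
tofu init                                      # download providers and modules
tofu plan                                      # review what will be created
tofu apply                                     # create the CAX21 server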
__________
Caution
I highly encourage you to attach Hetzner's firewall to the server, configured along the lines of this code:
https://github.com/darklab8/infra/blob/master/tf/modules/hetzner_server/firewall.tf
Allow only traffic on ports 80 and 443 over TCP and UDP (for our Caddy reverse proxy), port 22 over TCP (for SSH), and ICMP for ping.
The configured cloud-level firewall ensures that if you forget something about Docker security, you still have a nice fallback protecting your containers.
That is important with Docker, which by default binds applications to 0.0.0.0 when you expose them with -p 8000:8000, bypassing host-level firewalls like ufw (see the example after this caution).
A cloud-level firewall is your last safety net here in case of human error and misconfiguration.
__________
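To illustrate the 0.0.0.0 binding issue from the caution above, here is a quick sketch (the image name and port are arbitrary examples):

# Publishes port 8000 on ALL interfaces: reachable from the internet even if ufw
# "blocks" it, because Docker manages its own iptables rules that bypass ufw.
docker run -d -p 8000:8000 some-internal-service

# Publishes only on the loopback interface: reachable from the server itself
# (or through an SSH tunnel), but not from outside.
docker run -d -p 127.0.0.1:8000:8000 some-internal-service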
If you configure the server manually, please create an SSH key with the ssh-keygen command (usually available out of the box on Linux, at least as long as git is installed). You can make it available on Windows too by opening the Git Bash console that comes with the git installation.
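A minimal sketch, assuming you name the key the same way as in the config record below:

# Generate an ed25519 key pair; the file name just matches the example below.
ssh-keygen -t ed25519 -f ~/.ssh/id_rsa.darklab -C "homelab access key"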
Assuming you created everything correctly, you can make a record in ~/.ssh/config
Host homelab
    # replace with the IP address shown in the Hetzner interface
    HostName 65.109.15.108
    User root
    # replace with the name of your SSH key
    IdentityFile ~/.ssh/id_rsa.darklab
    IdentitiesOnly yes
And connect to it using the ssh homelab command. Once you connect and verify the host key, you will see the inside of the server and be ready for the next steps:
$ ssh homelab
The authenticity of host '65.109.15.108 (65.109.15.108)' can't be established.
ED25519 key fingerprint is SHA256:mQ5+B+9e/1xn3GmRvd0pBnINxtjiLazwT8CMNvI7YcU.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '65.109.15.108' (ED25519) to the list of known hosts.
Welcome to Ubuntu 24.04.1 LTS (GNU/Linux 6.8.0-52-generic aarch64)
# bla bla bla, other long text
root@homelab-example:~#
Configuring DNS
Buy a domain for your server, so that we can later open the website at a nice named address like https://homelab.dd84ai.com with TLS encryption. We can optionally use free DNS hosting from deSEC.
Create an A record pointing to the public IP of the server.
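To check that the record has propagated before moving on, you can query it directly (the domain here is just this article's example):

# Should print the server's public IPv4 address once the A record is live.
dig +short homelab.dd84ai.com A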
Raising Docker containers
Once we get the server, we can proceed to the next step of configuring our monitoring stack.
We assume it will be served behind Caddy, which handles Let’s Encrypt certificates and reverse proxying.
__________
Note
We assume you have installed Docker Engine and work from Linux.
The instructions may also work on WSL2 with Docker Engine or with Docker Desktop, but this is not guaranteed.
With Docker available locally, you will be able to apply the instructions from this tutorial without working on the server directly.
Instructions for Docker Engine installations can be found here:
https://docs.docker.com/engine/install/ubuntu
If you used a Docker app image from Hetzner, then Docker is already installed on the server.
As a last resort, you can just execute the tutorial instructions directly on the server; in that case, skip the DOCKER_HOST instruction mentioned later.
__________
We configure the stack with Docker Compose.
__________
Note
For the convenience of rotating service images from CI, some of the services run as Docker Swarm services; for that we use a Swarm (overlay) Docker network, which requires running docker swarm init on your server, as sketched below.
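A minimal sketch of that one-time step, run from your workstation over SSH (homelab is the SSH alias configured earlier):

export DOCKER_HOST=ssh://root@homelab
# If the server has several IPs, add --advertise-addr <public-ip>.
docker swarm init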
__________
Tip
You can additionally check the OpenTofu (Terraform) configuration at:
https://github.com/darklab8/infra/blob/master/tf/modules/docker_stack/monitoring.tf
__________
Important
We provide the docker-compose way of configuration as a demo example, because more devs are likely to be familiar and comfortable with docker-compose than with Terraform.
We ourselves use Terraform for this configuration and recommend using it instead of docker-compose if you can.
The book "Terraform: Up & Running" is an excellent start.
__________
docker-compose.yaml
version: "3.8"
services:
caddy:
image: lucaslorentz/caddy-docker-proxy:2.9.1
container_name: caddy
restart: always
networks:
- caddy
ports:
- "80:80"
- "443:443"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- caddy_data:/data
logging:
driver: json-file # ensures logs from containers will not overfill server
options:
mode: non-blocking
max-buffer-size: 500m
grafana:
build:
dockerfile: ./Dockerfile.grafana
context: .
container_name: grafana
restart: always
environment:
- GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
- GF_SECURITY_ADMIN_USER=admin
- GF_FEATURE_TOGGLES_ENABLE=alertingSimplifiedRouting,alertingQueryAndExpressionsStepMode
- GF_INSTALL_PLUGINS=https://storage.googleapis.com/integration-artifacts/grafana-exploretraces-app/grafana-exploretraces-app-latest.zip;grafana-traces-app
networks:
- grafana
- caddy
volumes:
- grafana_data:/var/lib/grafana
logging:
driver: json-file
options:
mode: non-blocking
max-buffer-size: 500m
labels:
caddy_0: ${GRAFANA_DOMAIN}
caddy_0.reverse_proxy: "{{upstreams 3000}}"
loki:
build:
dockerfile: ./Dockerfile.loki
context: .
container_name: loki
restart: always
entrypoint: ["/usr/bin/loki"]
command: ["-config.file=/etc/loki/local-config.yaml"]
networks:
grafana:
aliases:
- loki
volumes:
- loki_data:/data
logging:
driver: json-file
options:
mode: non-blocking
max-buffer-size: 500m
mem_limit: 1000m
alloy-logs:
build:
dockerfile: ./Dockerfile.alloy.logs
context: .
container_name: alloy-logs
restart: always
networks:
grafana:
aliases:
- alloy-logs
entrypoint: ["/bin/alloy"]
command: ["run","/etc/alloy/config.alloy","--storage.path=/var/lib/alloy/data"]
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
logging:
driver: json-file
options:
mode: non-blocking
max-buffer-size: 500m
deploy:
resources:
limits:
memory: 1000M
networks:
grafana:
name: grafana
driver: overlay
attachable: true
caddy:
name: caddy
driver: overlay
attachable: true
volumes:
caddy_data:
name: "caddy_data"
grafana_data:
name: "grafana_data"
loki_data:
name: "loki_data"
Starting to use Grafana
If everything is working as expected, you can log in to Grafana using the username admin and the password you set in the GRAFANA_PASSWORD environment variable.
Now you can observe logs from all your running Docker containers.
Select the desired application and browse its logs easily by filtering specific log levels.
You can also use quick filtering options at the top of the panel—under the Labels and Log levels bars.
To filter by any text, use the “Search in log lines” menu and press Include to specify what you're looking for.
Important
Make sure your logs are emitted in JSON format!
Grafana’s logging interface will automatically recognize all JSON key-values as valid labels, making filtering much simpler.
However, for the Explore view and LogQL queries, you still need to apply the json parser (| json) explicitly for it to work correctly.
A bit further down, we deploy simple application examples that we’ll use in more advanced scenarios.
Once deployed, try experimenting with:
- Filtering logs by minimum duration
- Switching between different applications
- Filtering by specific URL patterns
In our case, we encountered a few errors in Caddy, and we filtered them using the error log level.
Dashboards with Loki
Dashboards that use Loki are not known for high performance. They can struggle in horizontally scaled environments with high log volume. So, Loki dashboards are more of a last-resort tool when you need insights that metrics alone can't provide—such as detailed values and their precision.
For high-load applications, it's better to configure Mimir/Prometheus with metrics and use Recording Rules to optimize performance.
For low-workload applications (e.g., single instances), Loki is typically sufficient performance-wise.
Sample Logging App
To demonstrate a web-like app emitting logs, we created a dummy app example.
export DOCKER_HOST=ssh://root@homelab
docker compose -f docker-compose.app-logs.yaml build
docker compose -f docker-compose.app-logs.yaml up -d
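The dashboard queries below rely on the sample app emitting structured JSON logs with duration, url_path, and url_pattern fields; the exact format is defined by the app in the infra repo. As an illustration only, here is a log line of roughly that shape, plus an ad-hoc way to run a LogQL filter against Loki's HTTP API (a throwaway curl container joins the attachable grafana overlay network, where Loki is reachable by its alias):

# Roughly the shape of a single JSON log line the sample app emits (illustrative values).
echo '{"level":"info","msg":"request served","url_path":"/items/42","url_pattern":"/items/:id","duration":0.042}'

# Ad-hoc LogQL over Loki's HTTP API: up to 10 recent app-logs entries slower than 0.5 s.
docker run --rm --network grafana curlimages/curl -G -s \
  "http://loki:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={service_name="app-logs"} | json | duration > 0.5' \
  --data-urlencode 'limit=10'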
Creating a Dashboard with Loki
Now let’s create a dashboard using Loki as a data source in flexible “code” mode.
We’ll start with a LogQL query from the Metric Queries page.
Notice how we used the unwrap function to select specific numeric values to be used in formulas.
Example: Max Duration by URL Pattern (over 2m)
max_over_time({service_name="app-logs"} | json | duration > 0 | url_path!="" | unwrap duration [2m]) by (url_pattern)
Note: The unwrap function is essential here—it extracts the numeric value we need for calculations.
Example: Count of Requests by URL Pattern (over 2m)
sum(count_over_time({service_name="app-logs"} | json | duration > 0 | url_path!="" [2m])) by (url_pattern)
If you’re logging other fields—like user IPs, user agents, request/response body size—you can create charts grouped by these parameters.
This lets you see which endpoints use the most network traffic.
Example: 90th Percentile Duration (over 10m)
quantile_over_time(0.90,{service_name="app-logs"} | json | duration > 0 | unwrap duration [10m]) by (url_pattern)
Average Duration
If you want to show average values, just use avg_over_time instead:
avg_over_time({service_name="app-logs"} | json | duration > 0 | unwrap duration [10m]) by (url_pattern)
Finishing the Dashboard
After assembling the graphs:
- Set proper titles
- Change units to seconds for duration-based charts
- Optionally use bar charts instead of line charts
- Enable a legend in table mode showing Last/Mean values
This results in a much easier-to-navigate dashboard compared to raw logs.
A final version of the dashboard is provided for optional import: dashboard_app_logs.json
What’s Next?
That’s it for the first part of setting up Grafana + Loki + Alloy.
In the next articles, we’ll cover:
- Metrics
- Traces
- Alerts
In the meantime, try playing around with the logging interface—filter logs in different ways and switch between services to get comfortable with it.
You’ll find updated versions of these articles and the next parts here.