Introduction
A monitoring stack is only useful when it is deployed consistently and remains easy to maintain over time. Manual installations and one-off configurations tend to drift quickly, which makes upgrades and troubleshooting harder than they need to be.
In this series, we build the stack component by component using Ansible roles. This first post focuses only on Prometheus Node Exporter, a lightweight collector that exposes a wide variety of hardware- and kernel-related metrics. We will go over installing it, running it as a hardened systemd service, and exposing its metrics on port 9100 for later scraping by Prometheus.
Node Exporter
It is recommended to run Node Exporter directly on the host rather than in a container, since the exporter needs access to kernel-related metrics. Running it in a container is possible, but /proc, /sys and other host directories then have to be mounted into the container for it to function properly. In this post, we focus on the host-based setup for the exporter.
The role is broken down into four phases:
- Preparation: Create user, group and working directory
- Installation: Download and install the Node Exporter binary
- Configure: Create and manage the systemd service for the exporter
- Cleanup: Remove temporary artifacts
Core Variables
Rather than hard-coding installation details into the role, we keep the behavior driven by defaults/main.yml. That way, we get a clean base role that can be reused across hosts and environments with only variable overrides.
The core variables for this role are defined as:
| |
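As a rough sketch of what such a defaults file might contain: only node_exporter_version and node_exporter_checksum are named in this post, so every other variable name and value below is an assumption made for illustration, and the checksum is left as a placeholder rather than a real digest.

```yaml
# defaults/main.yml (illustrative sketch, not the role's actual file)

# Named in the post: release version and archive checksum.
node_exporter_version: "1.10.2"
# Format is "<algorithm>:<digest>"; copy the digest from the
# sha256sums.txt file published alongside the release.
node_exporter_checksum: "sha256:REPLACE_WITH_RELEASE_DIGEST"

# Assumed names: target platform, service account, and paths.
node_exporter_arch: "amd64"
node_exporter_user: "node_exporter"
node_exporter_group: "node_exporter"
node_exporter_work_dir: "/tmp/node_exporter"
node_exporter_install_dir: "/usr/local/bin"
```

Keeping these values in defaults gives them the lowest variable precedence in Ansible, so inventory, group, or play variables can override them without touching the role.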
Preparation
It is worth setting up the environment before directly jumping into the installation. That way, we avoid cluttering our host and maintain a clean state. The environment will also help us in achieving:
- Security: The Node Exporter should not run as root or with escalated privileges. Instead, we will create a system user and group for the exporter.
- Isolation: All operations such as download, extractions, etc. will reside in a temporary work directory, keeping the host system clean and unaffected.
| |
These tasks ensure Node Exporter runs as a non-privileged system user and the work directory provides the isolation layer by keeping all temporary files contained.
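The preparation tasks could look roughly like the following. This is a minimal sketch that assumes hypothetical variables node_exporter_user, node_exporter_group, and node_exporter_work_dir; the nologin shell and the directory mode are also assumptions.

```yaml
# Preparation phase (sketch): service account and scratch directory.
- name: Create node_exporter group
  ansible.builtin.group:
    name: "{{ node_exporter_group }}"
    system: true
    state: present

- name: Create node_exporter user
  ansible.builtin.user:
    name: "{{ node_exporter_user }}"
    group: "{{ node_exporter_group }}"
    system: true
    shell: /usr/sbin/nologin   # no interactive login for the service account
    create_home: false
    state: present

- name: Create temporary work directory
  ansible.builtin.file:
    path: "{{ node_exporter_work_dir }}"
    state: directory
    mode: "0755"
```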
Installation
Download
We can use Ansible’s built-in get_url module for downloading the release archive:
| |
As an added layer of safety, especially in production, add a checksum value to verify the downloaded archive.
The values for url and dest are derived variables, defined in the defaults/main.yml as:
| |
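A sketch of how the derived variables and the download task might fit together. The variable names other than node_exporter_version and node_exporter_checksum are hypothetical, and the URL assumes the naming pattern the Node Exporter project uses for its GitHub release archives.

```yaml
# defaults/main.yml (sketch): values derived from the core variables.
node_exporter_archive_name: "node_exporter-{{ node_exporter_version }}.linux-{{ node_exporter_arch }}"
node_exporter_download_url: "https://github.com/prometheus/node_exporter/releases/download/v{{ node_exporter_version }}/{{ node_exporter_archive_name }}.tar.gz"
node_exporter_archive_dest: "{{ node_exporter_work_dir }}/{{ node_exporter_archive_name }}.tar.gz"
```

```yaml
# Download task (sketch): get_url fails the task if the checksum
# of the downloaded file does not match.
- name: Download Node Exporter release archive
  ansible.builtin.get_url:
    url: "{{ node_exporter_download_url }}"
    dest: "{{ node_exporter_archive_dest }}"
    checksum: "{{ node_exporter_checksum }}"   # "<algorithm>:<digest>"
    mode: "0644"
```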
Unarchiving
The downloaded release is a tar archive, so the binary has to be extracted before we can use it. For that, we use Ansible’s built-in unarchive module:
| |
Setting remote_src: true tells Ansible that the archive already exists on the remote host, so the unarchiving process happens on the remote host itself.
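A minimal sketch of such an extraction task, assuming hypothetical variables node_exporter_archive_dest (the downloaded archive) and node_exporter_work_dir (the temporary directory):

```yaml
# Extraction (sketch): unpack the archive in place on the target host.
- name: Extract Node Exporter archive
  ansible.builtin.unarchive:
    src: "{{ node_exporter_archive_dest }}"
    dest: "{{ node_exporter_work_dir }}"
    remote_src: true   # the archive is already on the remote host
```

Without remote_src, unarchive would look for the archive on the control node and copy it to the target before extracting.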
Installation
The final step in this phase is installing the exporter binary. We defined our helper derived variables in defaults/main.yml:
| |
/usr/local/bin/ makes Node Exporter available system-wide and is the recommended installation path.
We copy the exporter from our temp work directory to the /usr/local/bin directory:
| |
Setting mode: "0755" ensures the binary is executable by all users while remaining writable only by the owner.
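The install step could be sketched as below. The variable names node_exporter_binary_src and node_exporter_binary_dest are assumptions, as is root ownership of the installed binary.

```yaml
# Install (sketch): copy the extracted binary into place.
# Assumed helper variables, e.g. in defaults/main.yml:
#   node_exporter_binary_src:  "{{ node_exporter_work_dir }}/{{ node_exporter_archive_name }}/node_exporter"
#   node_exporter_binary_dest: "{{ node_exporter_install_dir }}/node_exporter"
- name: Install Node Exporter binary
  ansible.builtin.copy:
    src: "{{ node_exporter_binary_src }}"
    dest: "{{ node_exporter_binary_dest }}"
    remote_src: true   # source file lives on the target host
    owner: root
    group: root
    mode: "0755"
  notify: Restart node_exporter
```

Because copy reports "changed" only when the destination content or attributes differ, re-running the role with the same version does not trigger the handler.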
Notice the notify part at the end of the task. It triggers a handler, so the service is restarted only when the binary actually changes. The handler Restart node_exporter is implemented in handlers/main.yml:
| |
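A handler along these lines would do the job. This is a sketch that assumes the unit is named node_exporter; the daemon_reload setting is an added assumption so that unit file changes are picked up before the restart.

```yaml
# handlers/main.yml (sketch)
- name: Restart node_exporter
  ansible.builtin.systemd:
    name: node_exporter
    state: restarted
    daemon_reload: true   # re-read unit files before restarting
```

Handlers run once at the end of the play, even if several tasks notify them.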
Configure
At this point, we have downloaded, extracted, and installed the exporter’s binary. The next step is to create a systemd unit for the exporter. Using systemd ensures the exporter is managed consistently, starts on boot, and can be controlled via standard system tooling.
This can be done through a two-step process:
- Create the systemd unit file
- Enable and start the service
Systemd Unit File
The systemd unit file is implemented as a Jinja2 template in templates/node_exporter.j2.service:
| |
Additional systemd options such as NoNewPrivileges and ProtectSystem are used to provide basic service hardening and reduce the potential attack surface.
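To make the hardening options concrete, here is one possible shape for the unit. Purely to keep the sketch in YAML, the unit body is inlined in a copy task, whereas the role itself renders it from the template file. The user, paths, listen address, and the specific ProtectSystem value are assumptions.

```yaml
# Illustration only: unit body inlined via copy instead of a template.
- name: Install systemd unit for Node Exporter (inline sketch)
  ansible.builtin.copy:
    dest: /etc/systemd/system/node_exporter.service
    owner: root
    group: root
    mode: "0644"
    content: |
      [Unit]
      Description=Prometheus Node Exporter
      Wants=network-online.target
      After=network-online.target

      [Service]
      Type=simple
      User=node_exporter
      Group=node_exporter
      ExecStart=/usr/local/bin/node_exporter --web.listen-address=:9100
      Restart=on-failure
      # Hardening: forbid privilege escalation and mount the
      # file system hierarchy read-only for this service.
      NoNewPrivileges=true
      ProtectSystem=strict

      [Install]
      WantedBy=multi-user.target
  notify: Restart node_exporter
```

NoNewPrivileges prevents the process and its children from gaining privileges through setuid binaries, and ProtectSystem=strict is the most restrictive setting of that option; a role may reasonably choose a looser value such as full.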
| |
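The two steps might be expressed as the following tasks. This is a sketch; the destination path and service name are assumed, while the template file name is the one used in this post.

```yaml
# Configure phase (sketch): render the unit, then enable and start it.
- name: Render systemd unit from template
  ansible.builtin.template:
    src: node_exporter.j2.service
    dest: /etc/systemd/system/node_exporter.service
    owner: root
    group: root
    mode: "0644"
  notify: Restart node_exporter

- name: Enable and start Node Exporter
  ansible.builtin.systemd:
    name: node_exporter
    enabled: true         # start on boot
    state: started
    daemon_reload: true   # pick up the new or changed unit file
```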
The variables needed here can be defined as:
| |
Cleanup
The final stage for this role is to perform cleanup. Since all intermediate artifacts are stored in a single working directory, it can be removed entirely through a single task:
| |
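Such a cleanup task can be as small as the sketch below, assuming the work directory is held in a hypothetical node_exporter_work_dir variable:

```yaml
# Cleanup phase (sketch): remove the directory and everything in it.
- name: Remove temporary work directory
  ansible.builtin.file:
    path: "{{ node_exporter_work_dir }}"
    state: absent
```

With state: absent, the file module removes directories recursively, so the archive and the extracted files go away together.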
Main.yml
The role ties the phases together in tasks/main.yml:
| |
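One way to wire the phases together, assuming each phase lives in its own task file; the file names below are hypothetical.

```yaml
# tasks/main.yml (sketch): one task file per phase, run in order.
- name: Preparation
  ansible.builtin.import_tasks: prepare.yml

- name: Installation
  ansible.builtin.import_tasks: install.yml

- name: Configure
  ansible.builtin.import_tasks: configure.yml

- name: Cleanup
  ansible.builtin.import_tasks: cleanup.yml
```

import_tasks is resolved statically when the playbook is parsed; include_tasks would be the dynamic alternative if a phase needed to be chosen at runtime.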
This completes our Node Exporter role.
Example Playbook
The final step is to actually use our role and run it on our target hosts. To do that, we will create playbooks/node_exporter.yml:
| |
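A playbook of this kind might look as follows. The host group monitored and the role name node_exporter are assumptions; adjust them to match the inventory and the role directory.

```yaml
# playbooks/node_exporter.yml (sketch)
- name: Deploy Node Exporter
  hosts: monitored   # hypothetical inventory group
  become: true       # creating users and writing system paths needs root
  roles:
    - role: node_exporter
```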
Then we run our playbook using the command:
| |
Installing a Different Version
You can override the default version, 1.10.2, without modifying the role. For example, to use 1.10.0 instead, override the node_exporter_version and node_exporter_checksum values:
| |
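As a sketch, the override can be passed as role variables in the playbook. The host group and role name are assumptions, and the checksum is a placeholder that has to be replaced with the digest published for the 1.10.0 archive.

```yaml
# Override the defaults for this play only (sketch).
- name: Deploy Node Exporter 1.10.0
  hosts: monitored   # hypothetical inventory group
  become: true
  roles:
    - role: node_exporter
      vars:
        node_exporter_version: "1.10.0"
        # Placeholder: use the digest from the 1.10.0 sha256sums.txt.
        node_exporter_checksum: "sha256:REPLACE_WITH_1.10.0_DIGEST"
```

The same two variables could equally be set in group_vars or host_vars to pin different versions per environment.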
Verification
Once the playbook finishes, we can verify that the exporter is configured and set up correctly by:
- Checking the service is running
| |
- Confirming the exporter is listening on port 9100
| |
- Querying the endpoint
| |
We can also create a dedicated Ansible task for performing the verification for us, making it more suitable for a production environment.
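A minimal sketch of such verification tasks, assuming the exporter listens on the loopback address at port 9100 and that checking for the node_exporter_build_info metric is a sufficient health signal:

```yaml
# Verification (sketch): port check, HTTP check, content check.
- name: Wait for Node Exporter to listen on port 9100
  ansible.builtin.wait_for:
    host: 127.0.0.1
    port: 9100
    timeout: 30

- name: Query the metrics endpoint
  ansible.builtin.uri:
    url: http://127.0.0.1:9100/metrics
    return_content: true
    status_code: 200
  register: node_exporter_metrics

- name: Assert that exporter metrics are present
  ansible.builtin.assert:
    that:
      - "'node_exporter_build_info' in node_exporter_metrics.content"
    fail_msg: "Node Exporter responded, but the expected metric was not found"
```

These tasks can run at the end of the role or in a separate play, so a failed deployment stops the run instead of going unnoticed.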
Conclusion
In this post, we built a reusable Ansible role for deploying Node Exporter. By structuring the role into clear phases and driving it through variables, we ensure maintainability and consistency across environments. The role is also written to be easily adjustable by modifying variables without changing the internal implementation.
The complete source code for this post can be found on GitHub.
In the next post, we will build on this foundation by deploying Prometheus and configuring it to scrape metrics from our exporters.