Installing & Configuring Node Exporter-Prometheus-Grafana-AlertManager-Blackbox
Prometheus is a monitoring solution for storing time series data like metrics.
Grafana allows visualizing the data stored in Prometheus (and other sources).
The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.
Node Exporter
Go to the home folder
cd
Download Node Exporter from the link below
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
extract the downloaded file
tar -xvf node_exporter-1.3.1.linux-amd64.tar.gz
rename the folder
mv node_exporter-1.3.1.linux-amd64 node_exporter
add a system user with no login shell; the services in this guide run under this user
sudo useradd --system --no-create-home --shell /usr/sbin/nologin prometheus
go to folder
cd node_exporter
move the node_exporter binary to /usr/local/bin/
sudo mv node_exporter /usr/local/bin/
change ownership of the moved file
sudo chown prometheus:prometheus /usr/local/bin/node_exporter
check the version if you need
node_exporter --version
create service file
sudo vim /etc/systemd/system/node_exporter.service
enter the below text to a file and save it
[Unit]
Description=Node exporter to collect machine metrics
[Service]
Restart=always
User=prometheus
ExecStart=/usr/local/bin/node_exporter
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
[Install]
WantedBy=multi-user.target
If you wish to run Node Exporter on a different port (9500 in this example), use the service file below instead
[Unit]
Description=Node exporter to collect machine metrics
[Service]
Restart=always
User=prometheus
ExecStart=/usr/local/bin/node_exporter --web.listen-address=:9500
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
[Install]
WantedBy=multi-user.target
Reload service daemon
sudo systemctl daemon-reload
Start the node_exporter and enable it to run at bootup and check the status
sudo systemctl restart node_exporter.service
sudo systemctl enable node_exporter.service
sudo systemctl status node_exporter.service
Open port 9100 (or the port you set with --web.listen-address)
Ubuntu
sudo iptables -A INPUT -p tcp --dport 9100 -j ACCEPT
CentOS or RHEL
sudo iptables -I INPUT -p tcp --dport 9100 -j ACCEPT
Node Exporter will be running on port 9100
http://localhost:9100/ or http://<IP ADDRESS>:9100/
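A quick way to confirm the exporter is really serving data is to fetch /metrics and count the node_* samples. The sketch below assumes curl is installed and that the default port 9100 is in use; the function names are made up for this example.

```shell
#!/bin/sh
# count_node_metrics reads Prometheus text-format metrics on stdin and
# prints how many sample lines start with "node_" (HELP/TYPE lines start with '#').
count_node_metrics() {
    grep -c '^node_' || true
}

# check_node_exporter HOST:PORT
# Fetches /metrics and succeeds only if at least one node_* sample came back.
check_node_exporter() {
    count=$(curl -fsS "http://$1/metrics" | count_node_metrics)
    [ "${count:-0}" -gt 0 ]
}
```

For example: check_node_exporter localhost:9100 && echo "node_exporter OK"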
You can find the simple installation script on my GitHub page
Prometheus Installation
Download the tar file
wget https://github.com/prometheus/prometheus/releases/download/v2.36.1/prometheus-2.36.1.linux-amd64.tar.gz
extract the file
tar -xvf prometheus-2.36.1.linux-amd64.tar.gz
rename the folder
mv prometheus-2.36.1.linux-amd64 prometheus
move it to the /etc/ folder
sudo mv prometheus /etc/prometheus
add a directory for Prometheus to store data
sudo mkdir /etc/prometheus/data
change ownership and permissions
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chmod -R 755 /etc/prometheus
edit the prometheus.yml file
sudo vim /etc/prometheus/prometheus.yml
replace its content with the config below, which extends the default file
global:
  scrape_interval: 15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'localhost:9093'
rule_files:
  - alert.rules.yml
scrape_configs:
  - job_name: "prometheus_master"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: 'Nodes'
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: 'Site Monitoring'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://www.google.com
          - https://www.google.co.in ## add your websites here
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
  - job_name: 'Internal Service Monitoring via Tcp'
    scrape_timeout: 15s
    scrape_interval: 15s
    metrics_path: /probe
    params:
      module: [tcp_connect]
    static_configs:
      - targets:
          - localhost:9090 ## add your tcp connections here
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115
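The relabel_configs in the two probe jobs can look cryptic. They copy each listed target into a target query parameter, keep it as the instance label, and then point the scrape at the Blackbox exporter on port 9115 (set up later in this guide). As a rough sketch, the request Prometheus ends up making looks like the URL built below; probe_url is a hypothetical helper, and Prometheus itself URL-encodes the parameter, which this sketch skips.

```shell
#!/bin/sh
# probe_url EXPORTER MODULE TARGET
# Prints the URL the relabeled scrape effectively requests:
#   __address__     -> EXPORTER (the "replacement" value)
#   module param    -> MODULE   (from "params")
#   __param_target  -> TARGET   (the original entry under "targets")
probe_url() {
    printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

probe_url localhost:9115 http_2xx https://www.google.com
# prints http://localhost:9115/probe?module=http_2xx&target=https://www.google.com
```

Once the Blackbox exporter is running, the same URL can be passed to curl to see what a single probe returns.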
create /etc/prometheus/alert.rules.yml (the file referenced under rule_files) with the rules below
groups:
  - name: Alert
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
      - alert: LoadAverage5m High
        expr: node_load5 >= 5.0
        for: 3m
        annotations:
          summary: "Instance {{ $labels.instance }} - high load average"
          description: "{{ $labels.instance }} (measured by {{ $labels.job }}) has high load average ({{ $value }}) over 5 minutes."
      - alert: MemoryUsage
        expr: 100 - ((node_memory_MemAvailable_bytes * 100) / node_memory_MemTotal_bytes) >= 95
        for: 5m
        annotations:
          summary: "Instance {{ $labels.instance }} - high memory usage"
          description: "{{ $labels.instance }} (measured by {{ $labels.job }}) has high memory usage ({{ $value }}) over 5 minutes."
      - alert: DiskUsage
        expr: 100 - ((node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"} * 100) / node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"}) >= 90
        for: 5m
        annotations:
          summary: "Instance {{ $labels.instance }} - high disk usage"
          description: "{{ $labels.instance }} (measured by {{ $labels.job }}) has high disk usage ({{ $value }}) over 5 minutes."
      - alert: Host-Unusual-Network-Throughput-In
        expr: sum by (instance) (rate(node_network_receive_bytes_total[2m])) / 1024 / 1024 > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Host unusual network throughput in (instance {{ $labels.instance }})
          description: "Host network interfaces are probably receiving too much data (> 100 MB/s)\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
      - alert: Host-Unusual-Network-ThroughputOut
        expr: sum by (instance) (rate(node_network_transmit_bytes_total[2m])) / 1024 / 1024 > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Host unusual network throughput out (instance {{ $labels.instance }})
          description: "Host network interfaces are probably sending too much data (> 100 MB/s)\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
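The MemoryUsage and DiskUsage expressions both follow the pattern 100 - (available * 100 / total). To get a feel for the thresholds, the arithmetic can be reproduced by hand. This is only an illustration with made-up numbers; used_pct is a hypothetical helper, not part of Prometheus.

```shell
#!/bin/sh
# used_pct AVAILABLE TOTAL
# Mirrors the alert expression 100 - (available * 100 / total)
# and prints the result with two decimal places.
used_pct() {
    awk -v avail="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 - (avail * 100 / total) }'
}

# Example: 1 GiB available out of 16 GiB total.
used_pct 1073741824 17179869184   # prints 93.75
```

At 93.75 the MemoryUsage alert (threshold 95) would not fire yet, while the same ratio on the root filesystem would trip DiskUsage (threshold 90) after 5 minutes. The rule file itself can be checked with the promtool binary that ships in the Prometheus tarball, for example: /etc/prometheus/promtool check rules /etc/prometheus/alert.rules.yml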
move the Prometheus binary to /usr/local/bin/
sudo mv /etc/prometheus/prometheus /usr/local/bin/
create a systemd service file
sudo vim /etc/systemd/system/prometheus.service
add the following
[Unit]
Description=Monitoring system and time series database
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus --config.file /etc/prometheus/prometheus.yml --storage.tsdb.path /etc/prometheus/data \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
LimitNOFILE=8192
[Install]
WantedBy=multi-user.target
reload the daemon
sudo systemctl daemon-reload
start, enable and check the status of the service
sudo systemctl restart prometheus.service
sudo systemctl enable prometheus.service
sudo systemctl status prometheus.service
Open port 9090
Ubuntu
sudo iptables -A INPUT -p tcp --dport 9090 -j ACCEPT
CentOS or RHEL
sudo iptables -I INPUT -p tcp --dport 9090 -j ACCEPT
Prometheus will be running on port 9090
http://localhost:9090/ or http://<IP ADDRESS>:9090/
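Besides opening the UI in a browser, Prometheus exposes /-/healthy and /-/ready management endpoints that are handy in scripts. A minimal sketch, assuming curl is installed; the helper names are invented for this example.

```shell
#!/bin/sh
# http_status URL
# Prints the HTTP status code for URL ("000" if the server is unreachable).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# prometheus_ready BASE_URL
# Succeeds when the /-/ready endpoint answers with 200.
prometheus_ready() {
    [ "$(http_status "$1/-/ready")" = "200" ]
}
```

For example: prometheus_ready http://localhost:9090 && echo "Prometheus is ready"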
Grafana Installation
Run the following command for ubuntu
sudo apt-get install -y adduser libfontconfig1
wget https://dl.grafana.com/oss/release/grafana_8.5.5_amd64.deb
sudo dpkg -i grafana_8.5.5_amd64.deb
Run the following for the RPM-based servers
wget https://dl.grafana.com/oss/release/grafana-9.3.1-1.x86_64.rpm
sudo yum install grafana-9.3.1-1.x86_64.rpm
start and enable it
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
sudo systemctl status grafana-server
Open port 3000
Ubuntu
sudo iptables -A INPUT -p tcp --dport 3000 -j ACCEPT
CentOS or RHEL
sudo iptables -I INPUT -p tcp --dport 3000 -j ACCEPT
open in browser
http://localhost:3000/ or http://<IP ADDRESS>:3000/
Now go to the login page and enter the credentials
default user : admin
default pass : admin
Change the password at first login.
In Grafana, add a data source
Settings > Data sources > Add data source, then select the Prometheus data source
Assuming Prometheus is on the same server, set the URL to http://localhost:9090/
Click Save & test
Go to Dashboards > Browse > Import, enter 1860 (the Node Exporter Full dashboard), click Import, select the data source, and save
Enjoy visualizing metrics ….!!!
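If you would rather not click through the UI, Grafana can also pick up data sources from provisioning files at startup. The sketch below writes such a file; it assumes the package's default provisioning directory /etc/grafana/provisioning/datasources, and both the function name and the file name prometheus.yaml are arbitrary choices for this example.

```shell
#!/bin/sh
# write_prometheus_datasource DIR URL
# Writes a Grafana data source provisioning file into DIR that
# registers a Prometheus data source pointing at URL.
write_prometheus_datasource() {
    cat > "$1/prometheus.yaml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $2
    isDefault: true
EOF
}
```

From a root shell, call write_prometheus_datasource /etc/grafana/provisioning/datasources http://localhost:9090 and then restart grafana-server so the file is read.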
Setting Up Alertmanager
Download Alertmanager
wget https://github.com/prometheus/alertmanager/releases/download/v0.24.0/alertmanager-0.24.0.linux-amd64.tar.gz
Extract it
tar -xvf alertmanager-0.24.0.linux-amd64.tar.gz
move the alertmanager folder to /opt
sudo mv alertmanager-0.24.0.linux-amd64 /opt/alertmanager
change ownership
cd /opt/alertmanager/
sudo chown prometheus:prometheus -R /opt/alertmanager
copy the alertmanager and amtool binaries
sudo cp /opt/alertmanager/alertmanager /usr/local/bin/
sudo cp /opt/alertmanager/amtool /usr/local/bin/
Add systemd service
sudo vim /etc/systemd/system/alertmanager.service
Add below content
[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
WorkingDirectory=/opt/alertmanager/
ExecStart=/usr/local/bin/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --web.external-url http://0.0.0.0:9093
[Install]
WantedBy=multi-user.target
save and exit
edit alertmanager configs
vim /opt/alertmanager/alertmanager.yml
Add below config
route:
  group_by: [alertname]
  # If an alert isn't caught by a route, send it to Slack.
  repeat_interval: 1h
  receiver: slack_general
  routes:
    # Send severity=slack alerts to Slack.
    - match:
        #severity: slack
      receiver: slack_general
receivers:
  - name: slack_general
    slack_configs:
      - api_url: https://chat.alerting.com/hooks/ksgdfbshdfsjhfw
        channel: '#alerts'
    email_configs:
      - to: 'alert@alerting.com,alerts@alertng.com'
        from: 'alert@alert.com'
        smarthost: smtp.gmail.com:587
        auth_username: "alert@alert.com"
        auth_identity: "alert@alert.com"
        auth_password: "password"
        send_resolved: true
run the commands below
sudo systemctl daemon-reload
sudo systemctl status prometheus
sudo systemctl restart alertmanager
sudo systemctl enable alertmanager
sudo systemctl status alertmanager
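To see whether notifications actually reach Slack or email, you can push a hand-made alert straight into Alertmanager through its v2 HTTP API instead of waiting for a real rule to fire. A sketch, assuming curl is installed; the alert name ManualTest and both function names are invented for this example.

```shell
#!/bin/sh
# alert_payload NAME SEVERITY
# Prints a one-element JSON array in the shape accepted by
# POST /api/v2/alerts. NAME and SEVERITY must not contain quotes.
alert_payload() {
    printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"manual test alert"}}]\n' "$1" "$2"
}

# send_test_alert BASE_URL
# Posts a hypothetical "ManualTest" warning alert to Alertmanager.
send_test_alert() {
    alert_payload ManualTest warning |
        curl -fsS -X POST -H 'Content-Type: application/json' --data @- "$1/api/v2/alerts"
}
```

For example: send_test_alert http://localhost:9093. Before restarting the service, the config file can also be validated with amtool check-config /opt/alertmanager/alertmanager.yml.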
Setting Up BlackBox
Download Blackbox Exporter
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.22.0/blackbox_exporter-0.22.0.linux-amd64.tar.gz
Extract it
tar -xvf blackbox_exporter-0.22.0.linux-amd64.tar.gz
Copy the Blackbox binary to /usr/local/bin
sudo cp blackbox_exporter-0.22.0.linux-amd64/blackbox_exporter /usr/local/bin/blackbox_exporter
change ownership
sudo chown prometheus:prometheus /usr/local/bin/blackbox_exporter
Remove the downloaded files
rm -rf blackbox_exporter-0.22.0.linux-amd64*
make a new directory
sudo mkdir /etc/blackbox_exporter
Add the config file
sudo vim /etc/blackbox_exporter/blackbox.yml
Add the following
modules:
  http_2xx:
    prober: http
    timeout: 15s
    http:
      fail_if_not_ssl: true
      fail_if_ssl: false
      method: GET
      no_follow_redirects: false
      preferred_ip_protocol: ip4
      valid_http_versions:
        - HTTP/1.1
        - HTTP/2.0
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
        - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  grpc:
    prober: grpc
    grpc:
      tls: true
      preferred_ip_protocol: "ip4"
  grpc_plain:
    prober: grpc
    grpc:
      tls: false
      service: "service1"
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
        - expect: "^SSH-2.0-"
        - send: "SSH-2.0-blackbox-ssh-check"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
        - send: "NICK prober"
        - send: "USER prober prober prober :prober"
        - expect: "PING :([^ ]+)"
          send: "PONG ${1}"
        - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp
Add the service file
sudo vim /etc/systemd/system/blackbox_exporter.service
Copy & paste the following
[Unit]
Description=Blackbox Exporter Service
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/bin/blackbox_exporter \
--config.file=/etc/blackbox_exporter/blackbox.yml \
--web.listen-address=0.0.0.0:9115
Restart=always
[Install]
WantedBy=multi-user.target
Change the ownership of the config file
sudo chown prometheus:prometheus /etc/blackbox_exporter/blackbox.yml
Start and enable the service
sudo systemctl daemon-reload
sudo systemctl start blackbox_exporter
sudo systemctl status blackbox_exporter
sudo systemctl enable blackbox_exporter
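A probe can be run by hand to confirm the exporter and the http_2xx module behave as expected. The exporter answers /probe with Prometheus text-format metrics, and the probe_success sample is 1 when the check passed. A small sketch; probe_succeeded is a hypothetical helper.

```shell
#!/bin/sh
# probe_succeeded
# Reads Blackbox /probe output on stdin and succeeds only if it
# contains the sample line "probe_success 1".
probe_succeeded() {
    grep -q '^probe_success 1$'
}
```

For example: curl -s 'http://localhost:9115/probe?module=http_2xx&target=https://www.google.com' | probe_succeeded && echo "probe OK"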
You're good to GO........!
Note: You can also import the Blackbox Exporter dashboard into Grafana using dashboard ID 7587 with the Prometheus data source