FilesystemMonitoring

From Run Your Own
Revision as of 17:25, 22 February 2025 by Brendan (talk | contribs) (→‎Filesystem Monitoring)

Filesystem Monitoring

If you run applications whose data tends to grow without limit, it helps to get an alert before you run out of disk space.

Things like Mastodon and certain file-sharing systems can steadily eat up lots of space caching things that you might not need any more. There is also the ever-present danger that logs or backup files inflate to huge sizes because of unobserved errors or missing clean-ups. So let's make a little script that will send us an alarm email if it looks dicey:

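  • save this script somewhere useful, e.g. as disk-monitor.sh in your private bin folder, and make it executable (chmod +x disk-monitor.sh); it relies on a working mail command on the server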
#!/bin/bash
# set -x
# Shell script to monitor or watch the disk space
# It will send an email to $ADMIN if the used percentage of a filesystem is >= $ALERT (90% by default).
# --------------------------------------------------------------------------------------------------------
# Address that receives the alert emails.
ADMIN="admins@example.com"
# Alert threshold in percent used; 90 is the default.
ALERT=90
# Exclude list of filesystems not to monitor; if there are several partitions, use "|" to separate them.
# An example: EXCLUDE_LIST="/dev/hdd1|/dev/hdc5"
EXCLUDE_LIST="/auto/ripper|loop|udev"
#
#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
#
main_prog() {
while read -r output;
do
  servername=$(hostname)
  usep=$(echo "$output" | awk '{ print $1}' | cut -d'%' -f1)
  partition=$(echo "$output" | awk '{print $2}')
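  # skip entries where df reports no numeric percentage (e.g. "-")
  [[ "$usep" =~ ^[0-9]+$ ]] || continue
  echo "partition $partition is at $usep% usage"
  if [ "$usep" -ge "$ALERT" ] ; then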
     echo "Running out of space \"$partition ($usep%)\" on server $(hostname), $(date)" | \
     mail -s "***ALERT*** $servername is almost out of disk space: $usep% on $partition" "$ADMIN"
  fi
done
}

# -P keeps each filesystem on a single line, even when the device name is long
if [ -n "$EXCLUDE_LIST" ] ; then
  df -hP | grep -vE "^Filesystem|tmpfs|cdrom|${EXCLUDE_LIST}" | awk '{print $5 " " $6}' | main_prog
else
  df -hP | grep -vE "^Filesystem|tmpfs|cdrom" | awk '{print $5 " " $6}' | main_prog
fi
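The threshold check can be tried out without waiting for a disk to fill up. The following is only a minimal sketch, not part of the script above: over_threshold is a hypothetical helper name, and the input format ("percentage mountpoint" per line) is an assumption chosen to match what the script feeds into main_prog.

```shell
#!/bin/bash
# Hypothetical helper: reads "PCENT MOUNTPOINT" lines on stdin and prints
# the ones at or above the threshold given as $1 (default 90).
over_threshold() {
  local limit="${1:-90}" pcent mount
  while read -r pcent mount; do
    pcent="${pcent%\%}"                      # strip the trailing % sign
    [[ "$pcent" =~ ^[0-9]+$ ]] || continue   # skip headers and "-" entries
    if [ "$pcent" -ge "$limit" ]; then
      echo "$mount $pcent%"
    fi
  done
}

# Feed it canned data instead of real df output:
printf '%s\n' 'Use% Mounted' '42% /' '95% /var' '- /proc' | over_threshold 90
```

On a system with GNU df, the same helper should also accept live data, for example df --output=pcent,target | over_threshold 90.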
  • now create a systemd service in /etc/systemd/system/diskmonitor.service, with ExecStart set to the absolute path of wherever you saved the script
[Unit]
Description=Check filesystems and alert when approaching full

[Service]
Type=oneshot
ExecStart=/home/borf/disk-monitor.sh
  • then create a systemd timer in /etc/systemd/system/diskmonitor.timer - this will run the service 15 minutes after boot and then once per hour
[Unit]
Description=Check disks are not about to be full

[Timer]
OnBootSec=15m
OnUnitActiveSec=1h

[Install]
WantedBy=timers.target
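  • reload systemd so it picks up the new unit files, then enable and start the timer
sudo systemctl daemon-reload
sudo systemctl enable --now diskmonitor.timer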
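Once the timer is enabled it is worth checking that systemd has actually scheduled it. The following is a rough sketch only: diskmonitor_status is a hypothetical helper name, and it assumes the unit files are called diskmonitor.service and diskmonitor.timer as above.

```shell
#!/bin/bash
# Hypothetical helper: summarise the state of the timer.
# The unit name defaults to "diskmonitor", matching the files created above.
diskmonitor_status() {
  local unit="${1:-diskmonitor}"
  # is-enabled prints "enabled" when the timer is hooked in via [Install]
  echo "timer enabled: $(systemctl is-enabled "$unit.timer" 2>/dev/null || true)"
  # is-active prints "active" while the timer is loaded and waiting
  echo "timer active:  $(systemctl is-active "$unit.timer" 2>/dev/null || true)"
  # list-timers shows the next and the previous trigger time
  systemctl list-timers "$unit.timer" --no-pager
}
```

Call diskmonitor_status after enabling the timer. To trigger one check immediately rather than waiting for the timer, run sudo systemctl start diskmonitor.service; the script's output then shows up in journalctl -u diskmonitor.service.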