FilesystemMonitoring: Difference between revisions

From Run Your Own

Revision as of 15:06, 19 February 2025

Filesystem Monitoring

If you have applications with a propensity to blow up it can be helpful to have an alert before you run out of disk space.

Things like Mastodon and certain file-sharing systems can steadily eat up lots of space caching things that you might not need any more. There is also the ever-present danger that logs or backup files inflate to huge sizes due to unnoticed errors or missing clean-ups. So let's make a little script that will send us an alarm email if it looks dicey:

  • save this script somewhere useful, e.g. as disk-monitor.sh in your private bin folder, and make it executable (chmod +x disk-monitor.sh)
#!/bin/bash
# set -x
# Shell script to monitor disk space usage
# It will send an email to $ADMIN if the used percentage of a filesystem is >= $ALERT (default 90%).
# --------------------------------------------------------------------------------------------------------
# Set admin email so that you can get email.
ADMIN="admins@example.com"
# set alert level; 90% is the default
ALERT=90
# Exclude list of unwanted monitoring; if there are several partitions, use "|" to separate them.
# An example: EXCLUDE_LIST="/dev/hdd1|/dev/hdc5"
EXCLUDE_LIST="/auto/ripper|loop|udev"
#
#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
#
main_prog() {
while read -r output;
do
  #echo "Working on $output ..."
  servername=$(hostname)
  usep=$(echo "$output" | awk '{ print $1}' | cut -d'%' -f1)
  partition=$(echo "$output" | awk '{print $2}')
  #echo "alert level is $ALERT"
  echo "partition $partition is at $usep% usage"
  if [ "$usep" -ge "$ALERT" ] ; then
     #echo "ALERT $partition is now up to $usep percent utilization!"
     echo "Running out of space \"$partition ($usep%)\" on server $(hostname), $(date)" | \
     mail -s "***ALERT*** $servername is almost out of disk space: $usep% on $partition" "$ADMIN"
  fi
done
}

if [ "$EXCLUDE_LIST" != "" ] ; then
  df -h | grep -vE "^Filesystem|tmpfs|cdrom|${EXCLUDE_LIST}" | awk '{print $5 " " $6}' | main_prog
else
  df -h | grep -vE "^Filesystem|tmpfs|cdrom" | awk '{print $5 " " $6}' | main_prog
fi
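Before wiring the script into systemd it can help to exercise the threshold logic without sending any mail. The following is a minimal sketch, not part of the original script: over_threshold is a hypothetical helper name, and it assumes input lines of the form "USE% MOUNTPOINT", which is the shape the awk stage above produces. It also skips non-numeric values such as "-", which the original comparison would trip over.

```shell
#!/bin/bash
# Dry-run helper (hypothetical name, for illustration only).
# Reads "USE% MOUNTPOINT" lines on stdin and prints every entry whose
# usage is at or above the limit given as the first argument.
over_threshold() {
  local limit="$1" pct mount
  while read -r pct mount; do
    pct="${pct%\%}"                 # strip the trailing % sign
    case "$pct" in
      ''|*[!0-9]*) continue ;;      # skip headers and non-numeric values like "-"
    esac
    if [ "$pct" -ge "$limit" ]; then
      echo "ALERT $mount $pct%"
    fi
  done
}

# Canned sample lines instead of real df output, so the result is predictable.
printf '%s\n' '42% /' '91% /var' '- /proc' '100% /srv/cache' | over_threshold 90
```

With the sample input this prints an ALERT line for /var and for /srv/cache only. To try it against the live system, feed it the same pipeline the script uses, for example df -h | awk 'NR>1 {print $5 " " $6}' | over_threshold 90.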
  • now create a systemd service in /etc/systemd/system/diskmonitor.service (adjust the ExecStart path to wherever you saved the script)
[Unit]
Description=Check filesystems and alert when approaching full

[Service]
Type=oneshot
ExecStart=/home/borf/disk-monitor.sh
  • then create a systemd timer /etc/systemd/system/diskmonitor.timer - this will first run 15 minutes after boot and then once per hour
[Unit]
Description=Check disks are not about to be full

[Timer]
OnBootSec=15m
OnUnitActiveSec=1h

[Install]
WantedBy=timers.target
  • enable and start the timer
sudo systemctl enable --now diskmonitor.timer
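Once the units are in place it is worth confirming that the timer is really scheduled, since an enabled timer that never started fails silently. The following is a rough sketch under the assumption that the unit names above are used unchanged; check_timer is a hypothetical helper name, not something systemd provides.

```shell
#!/bin/bash
# check_timer (hypothetical helper): succeeds only if the given timer unit
# is both enabled and currently active, then shows its next scheduled run.
check_timer() {
  local unit="$1"
  if ! command -v systemctl >/dev/null 2>&1; then
    echo "systemctl not found; is this a systemd host?" >&2
    return 1
  fi
  systemctl is-enabled --quiet "$unit" || { echo "$unit is not enabled" >&2; return 1; }
  systemctl is-active --quiet "$unit"  || { echo "$unit is not active" >&2; return 1; }
  # List the timer with its last and next trigger times.
  systemctl list-timers --no-pager "$unit"
}

check_timer diskmonitor.timer || echo "diskmonitor.timer needs attention"
```

To trigger a single check immediately instead of waiting for the timer, start the service by hand with sudo systemctl start diskmonitor.service and read its output with journalctl -u diskmonitor.service.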