Backup Monitoring with Home Assistant
In a previous post, I described how I built a backup device for offsite backups. That device has now been running for 3 years. I’d have expected the old ODROID board to cause some trouble, since it is now almost 9 years old, but it is running without any issues. What is causing issues is the HDD: I had to replace it twice already. Replacing the HDD is not much of an issue; in fact, I got a warranty replacement for one of the disks. However, you have to actually notice the failure, which is not that obvious for a scheduled restic run on a remote probe.
I could set up a Grafana/Telegraf stack for this and configure some alerts, but this is a lot of work and I already have a monitoring system running that can display nice time-series data and notify me conditionally: Home Assistant! So let’s integrate the probe there.
Adding a new MQTT Sensor in Home Assistant
The most straightforward communication interface to Home Assistant (HA) is probably MQTT, and I’m already using it for some ESPHome devices. In addition, it is pretty easy to send MQTT messages via the command line, using mosquitto_pub.
Ok, so what exactly do we need to send?
Home Assistant has a device auto-discovery mechanism for MQTT, meaning that whenever a certain message is posted to a certain topic, it will automatically create a device from that message’s body.
This topic follows the schema <discovery_prefix>/<component>/[<node_id>/]<object_id>/config.
discovery_prefix defaults to homeassistant, and a list of possible component values hides in the device class documentation. node_id can be ignored, and as we use restic_backup as object_id, we get homeassistant/sensor/restic_backup/config as the discovery topic.
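To make the schema concrete, here is a minimal sketch that assembles the discovery topic from its parts. The function name discovery_topic is made up for this illustration; it is not part of Home Assistant or mosquitto.

```shell
#!/bin/sh
# Build a discovery topic following
# <discovery_prefix>/<component>/[<node_id>/]<object_id>/config
# discovery_topic is a hypothetical helper for illustration only.
discovery_topic() {
    prefix=$1
    component=$2
    object_id=$3
    node_id=${4:-}   # optional, may be omitted
    if [ -n "$node_id" ]; then
        printf '%s/%s/%s/%s/config\n' "$prefix" "$component" "$node_id" "$object_id"
    else
        printf '%s/%s/%s/config\n' "$prefix" "$component" "$object_id"
    fi
}

discovery_topic homeassistant sensor restic_backup
# prints: homeassistant/sensor/restic_backup/config
```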
The payload of this message defines the device, and the device we want should simply report whether a backup is currently running or not. Any additional logic, such as the time of the last run, can then be implemented in Home Assistant. Honestly, I found it pretty hard to understand what this payload has to look like, and it took me quite some trial and error, but this is what a working auto-discovery payload for this scenario looks like:
{
"name": "Restic Backup",
"device_class": "enum",
"state_topic": "homeassistant/sensor/restic_backup/state",
"unique_id": "restic_backup_mqtt1",
"device": {
"name": "Restic Backup",
"identifiers": [ "restic_backup" ]
}
}
- name is obvious, and as we just report a status, device_class should be enum.
- The state_topic tells Home Assistant where the device will report the status, and it can be rather arbitrary.
- If you omit the unique_id, you can find the sensor in the Entities section, but the readings will not be listed in the corresponding device.
- The device part was quite confusing to me, as this is kind of inverted logic: You don’t specify this very sensor here, but the device that this sensor gets assigned to. If you have multiple sensors with the same device section, they are “grouped” together.
According to the documentation, there are other ways to write the payload, but, honestly, I had quite some trouble understanding the details. Anyway, the above configuration is good enough here.
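Since the payload is easy to get wrong when typed as a one-line string, a small helper that prints it can be handy. This is only a sketch: discovery_payload is a made-up name, and it assumes that neither argument contains double quotes or backslashes, because no JSON escaping is done.

```shell
#!/bin/sh
# discovery_payload is a hypothetical helper: it prints the payload shown
# above for an object id ($1) and a display name ($2).
# Assumption: the values need no JSON escaping.
discovery_payload() {
    cat <<EOF
{
  "name": "$2",
  "device_class": "enum",
  "state_topic": "homeassistant/sensor/$1/state",
  "unique_id": "$1_mqtt1",
  "device": {
    "name": "$2",
    "identifiers": [ "$1" ]
  }
}
EOF
}

# Print the payload for the sensor used in this post
discovery_payload restic_backup "Restic Backup"
```

The output can be passed to mosquitto_pub via -m "$(discovery_payload restic_backup "Restic Backup")".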
Let’s send this to the broker and check if we have a new device now:
> mosquitto_pub -h 192.168.0.25 -p 1883 -r \
-t "homeassistant/sensor/restic_backup/config" \
-m '{ "name": "Restic Backup", "device_class": "enum",
"state_topic": "homeassistant/sensor/restic_backup/state",
"unique_id": "restic_backup_mqtt1",
"device": { "name": "Restic Backup", "identifiers": [ "restic_backup" ] } }'
Note that we use -r (retain) here so that the device is persistent in Home Assistant (at least until the broker is restarted).
Anyway, we can now also send a message to state_topic and observe the status changing in the web UI.
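Sending such a state message could look like the following sketch. The wrapper publish_state and the DRY_RUN switch are made up for this illustration; the broker address and port are the ones used in this post and serve as defaults when MQTT_SERVER and MQTT_PORT are not set.

```shell
#!/bin/sh
# publish_state is a hypothetical wrapper around mosquitto_pub.
# $1 = object id, $2 = state (e.g. Running, Idle, Error).
# With DRY_RUN set, the command is only printed, so no broker is needed.
publish_state() {
    set -- -h "${MQTT_SERVER:-192.168.0.25}" -p "${MQTT_PORT:-1883}" \
        -t "homeassistant/sensor/$1/state" -m "$2"
    if [ -n "${DRY_RUN:-}" ]; then
        printf 'mosquitto_pub %s\n' "$*"
    else
        mosquitto_pub "$@"
    fi
}

# Dry run: show the command that would be executed
DRY_RUN=1 publish_state restic_backup Running
# with the defaults, prints:
# mosquitto_pub -h 192.168.0.25 -p 1883 -t homeassistant/sensor/restic_backup/state -m Running
```

Without DRY_RUN, the same call publishes the state to the broker.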
Note: To remove the device from Home Assistant, simply send an empty message to the discovery topic:
> mosquitto_pub -h 192.168.0.25 -p 1883 -r \
-t "homeassistant/sensor/restic_backup/config" -m ''
Integration with restic
Ok, now we have a proof-of-concept. The next step is to combine this with the backups.
Fortunately, the restic container I’m using supports hooks for executing a script before and after the backup.
But as the container doesn’t have mosquitto installed by default, we need to add this to the container first.
As described in the previous post, we are using a custom Dockerfile anyway, so we just have to add one line to the file:
FROM lobaro/restic-backup-docker:latest
RUN mkdir -p /root/.ssh && ln -s /run/secrets/user_ssh_key /root/.ssh/id_rsa
RUN chown -R root:root /root/.ssh
RUN printf "Host 10.13.13.5\n\tStrictHostKeyChecking no\n" >> /root/.ssh/config
# Install mosquitto
RUN apk add --no-cache mosquitto-clients
Ok, now we can finally create our hooks. We create a folder hooks and in it the two scripts pre-backup.sh and post-backup.sh.
They are called before/after the backup and the logic we want is quite simple:
- Publish the auto-discovery message. (It doesn’t matter if it was already published, as we don’t alter it.)
- The pre-backup.sh script publishes Running as status.
- The post-backup.sh script checks the exit code of restic to determine whether the backup was successful, and publishes Idle or Error as status.
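The decision in the last step can be sketched as a tiny function. The name backup_state is made up for this illustration, and the sketch assumes that the hook receives restic's exit code as its first argument and that a missing argument counts as success.

```shell
#!/bin/sh
# backup_state is a hypothetical helper: it maps restic's exit code ($1)
# to the status string that is published to Home Assistant.
# Assumption: a missing argument is treated like exit code 0.
backup_state() {
    if [ "${1:-0}" = "0" ]; then
        echo "Idle"
    else
        echo "Error"
    fi
}

backup_state 0   # prints: Idle
backup_state 3   # prints: Error
```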
So this is the resulting pre-backup.sh:
#!/bin/sh
# Publish the (retained) auto-discovery message
mosquitto_pub -h "$MQTT_SERVER" -p "$MQTT_PORT" -r \
    -t "homeassistant/sensor/$NAME/config" \
    -m "{ \
    \"name\": \"$NICE_NAME\", \
    \"device_class\": \"enum\", \
    \"unique_id\": \"$NAME-mqtt1\", \
    \"state_topic\": \"homeassistant/sensor/$NAME/state\", \
    \"device\": { \"name\": \"$NICE_NAME\", \"identifiers\": [ \"$NAME\" ] } \
    }"
# Report that the backup has started
mosquitto_pub -h "$MQTT_SERVER" -p "$MQTT_PORT" \
    -t "homeassistant/sensor/$NAME/state" \
    -m 'Running'
And post-backup.sh looks as follows:
#!/bin/sh
# Publish the (retained) auto-discovery message
mosquitto_pub -h "$MQTT_SERVER" -p "$MQTT_PORT" -r \
    -t "homeassistant/sensor/$NAME/config" \
    -m "{ \
    \"name\": \"$NICE_NAME\", \
    \"device_class\": \"enum\", \
    \"unique_id\": \"$NAME-mqtt1\", \
    \"state_topic\": \"homeassistant/sensor/$NAME/state\", \
    \"device\": { \"name\": \"$NICE_NAME\", \"identifiers\": [ \"$NAME\" ] } \
    }"
# Check restic's exit code (first argument, treated as 0 if missing)
if [ "${1:-0}" = "0" ]; then
    mosquitto_pub -h "$MQTT_SERVER" -p "$MQTT_PORT" \
        -t "homeassistant/sensor/$NAME/state" \
        -m 'Idle'
else
    mosquitto_pub -h "$MQTT_SERVER" -p "$MQTT_PORT" \
        -t "homeassistant/sensor/$NAME/state" \
        -m 'Error'
fi
And we shouldn’t forget to make the scripts executable:
chmod +x hooks/pre-backup.sh hooks/post-backup.sh
If you wonder where we have defined NAME, MQTT_SERVER, etc.: Congratulations, you’ve been paying attention.
We set them in our docker-compose.yml!
The full compose file can be found in the first post on this setup, so here are just the new bits:
---
services:
  wireguard:
    # ...
  restic-backup:
    container_name: restic_nas
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      - RESTIC_REPOSITORY=sftp:jonathan@10.13.13.2:/data/backup/server
      # ...
      - RESTIC_FORGET_ARGS=--keep-last 2 --keep-monthly 3
      # The environment variables for the MQTT hook
      - MQTT_SERVER=192.168.0.25
      - MQTT_PORT=1883
      - NAME=restic_nas
      - NICE_NAME=Restic NAS
    volumes:
      # Mount the hook-scripts
      - ./hooks:/hooks:ro
      - /zstorage/git:/data/git:ro
      # ...
    # ...
We rebuild and restart the containers and manually trigger a backup to check that it works:
> docker-compose up --force-recreate --remove-orphans -d --build
> docker exec -ti restic_nas /bin/sh
/ > /bin/backup
Starting pre-backup
#...
Automations in Home Assistant
Ok, almost done. All that’s left is creating an automation in HA. I’m just running a simple one, and there is room for improvement, but at least I get a warning if there was no backup or the backup threw an error:
alias: Restic Backup Warning
description: ""
mode: single
triggers:
  # Backup sensor is in "Error" state
  - entity_id:
      - sensor.restic_backup
    to: Error
    id: Backup Error
    trigger: state
  # No backup has started in more than a day
  # (e.g., probe is offline, MQTT shenanigans)
  - entity_id:
      - sensor.restic_backup
    to: Idle
    for:
      hours: 25
      minutes: 0
      seconds: 0
    id: No backup
    trigger: state
conditions: []
actions:
  # Send a Signal message.
  - data:
      message: There is an issue with the backups! ()
    action: notify.signal
Additional sensors
But wait - there’s more!
As we can input basically any measurements in Home Assistant, what about the status of the probe itself?
The disk usage in particular would be very interesting for the backup probe.
So I was about to cobble together some shell scripts that read the CPU temperature, but then I found the system_sensors project, which had already done it, and better than what I would have cobbled together.
Ok, then let’s set it up:
> git clone https://github.com/Sennevds/system_sensors.git
> sudo apt-get install python3-dev python3-apt
> cd system_sensors && pip3 install -r requirements.txt
> cp src/settings_example.yaml src/settings.yaml
Now we adapt the MQTT settings, deviceName and other settings to our liking, and try it out with:
python3 src/system_sensors.py src/settings.yaml
We should now see a new device with plenty of sensors in Home Assistant:
I leave it as an exercise to the reader to build some fancy automations with this data and end this post with a screenshot of my dashboard (yes, my NAS is also sending the status via MQTT).