the f*ck rants about stuff

Backup fixes!

A year ago I built an automated backup solution. It was a very basic approach, but it got the job done.

It started to fail randomly, so I had to take a look. I fixed it and took the opportunity to add a few features while debugging it.

Overall, resilience is improved: it can now recover from most errors and reports properly when it cannot.

Changelog:

  • FIX: Backup file getting corrupted in email transit. It seems Google was mangling .gpg files.
  • FIX: Add a clean-up section to make sure the resources are consumed, since systemd.path works like a spool. It also needs to sync at the end, because systemd relaunches the unit as soon as the script is done, before the OS even had time to write to disk.
  • FIX: Clean-up service on restart that automatically removes the mail lock, which was created and never removed if the computer lost power in the middle of sending.
  • FIX: Systemd.path starts processing as soon as the path is found. I had to make sure the file had finished being written before processing it.
  • FIX: Systemd Type=forking instead of oneshot. I was leaving the process lingering until the pop-up windows finished, which is what Type=forking is for.
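
The "wait until the file is done being written" fix can be approached in more than one way. The scripts further down poll `fuser` until nothing has the file open. As a rough alternative sketch (not what backup.py does), one can also poll the file's size and modification time until they stop changing. The function name `wait_until_stable` and its parameters are made up for illustration.

```python
import os
import time


def wait_until_stable(file_path, interval=1.0, checks=2, timeout=60.0):
    """Return True once size and mtime stay unchanged for `checks` polls in a row.

    Returns False if `timeout` seconds pass first. Hypothetical helper,
    not part of the original scripts.
    """
    deadline = time.monotonic() + timeout
    last_seen = None
    stable_polls = 0
    while time.monotonic() < deadline:
        st = os.stat(file_path)
        current = (st.st_size, st.st_mtime_ns)
        if current == last_seen:
            stable_polls += 1
            if stable_polls >= checks:
                return True
        else:
            # The file changed (or this is the first poll): start counting again
            stable_polls = 0
            last_seen = current
        time.sleep(interval)
    return False
```

Calling `wait_until_stable("/home/company/shared/database.mdb")` before compressing would block until consecutive polls see the same size and mtime. A writer that pauses for longer than the polling window can still fool it, which is one reason checking for open handles with `fuser` is the safer choice here.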

  • FEAT: Checksums included in the backup, to automatically verify integrity when recovering and to fail properly when the IN and OUT files differ.
  • FEAT: Add proper systemd logging, including checksums.
  • FEAT: Show pop-ups to the end users announcing the start/stop of the service and notifying them of errors.
  • FEAT: Add arguments to ease local debugging, including a --quiet option for debugging remotely without showing pop-ups.
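
The checksum feature boils down to writing `<digest> <filename>` into a side file before packing, then comparing against it after unpacking. Here is a minimal sketch of that round trip, assuming MD5 as in the scripts further down. The helper names `md5_of`, `write_checksum` and `verify_checksum` are hypothetical and not part of the original code.

```python
import hashlib
from pathlib import Path


def md5_of(file_path, chunk_size=4096):
    """Hash the file in chunks so large databases do not need to fit in memory."""
    digest = hashlib.md5()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(file_path):
    """Write '<digest> <filename>' next to the file and return the side file path."""
    file_path = Path(file_path)
    checksum_path = file_path.with_name(file_path.name + ".checksum")
    checksum_path.write_text("{} {}".format(md5_of(file_path), file_path.name))
    return checksum_path


def verify_checksum(checksum_path):
    """Return True if the file named in the side file still matches its digest."""
    checksum_path = Path(checksum_path)
    expected, name = checksum_path.read_text().strip().split(maxsplit=1)
    return md5_of(checksum_path.with_name(name)) == expected
```

On the backup side you would call `write_checksum(...)` before running tar, and on the recovery side `verify_checksum(...)` after extracting. MD5 is fine for catching accidental corruption but not tampering; swapping in `hashlib.sha256` is a one-line change.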

No repo, but here's the code so you can take a peek or reuse it. The pop-ups are in Spanish.

code
backup.py

#!/usr/bin/env python3

from datetime import datetime, timedelta
from os import path, remove, fork, _exit, environ
from subprocess import run, CalledProcessError
from sys import exit, version_info
from systemd import journal
from hashlib import md5
import argparse


def display_alert(text, wtype="info"):
    journal.send("display: {}".format(text.replace("\n", " - ")))
    if(not args.quiet):
        if(not fork()):
            env = environ.copy()
            env.update({'DISPLAY': ':0.0', 'XAUTHORITY':
                        '/home/{}/.Xauthority'.format(USER)})
            zenity_cmd = [
                'zenity', '--text={}'.format(text), '--no-markup', '--{}'.format(wtype), '--no-wrap']
            run(zenity_cmd, env=env)
            # the forked child only shows the dialog; the parent process does the clean up
            _exit(0)


def md5sum(fname):
    cs = md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            cs.update(chunk)
    return cs.hexdigest()


# Args Parser init
parser = argparse.ArgumentParser()
parser.add_argument(
    "-q", "--quiet", help="dont show pop ups", action="store_true")
parser.add_argument("-u", "--user", help="user to display the dialogs as")
parser.add_argument("-p", "--path", help="path of the file to backup")
parser.add_argument("-t", "--to", help="who to send the email")
parser.add_argument(
    "-k", "--keep", help="keep output file", action="store_true")
parser.add_argument(
    "-n", "--no-mail", help="dont try to send the mail", action="store_true")
args = parser.parse_args()

# Globals
USER = 'company'
if(args.user):
    USER = args.user
    journal.send("USER OVERWRITE: {}".format(USER))

TO = "info@company.com"
if(args.to):
    TO = args.to
    journal.send("EMAIL TO OVERWRITE: {}".format(TO))
BODY = "mail.body"
FILENAME = 'database.mdb'
PATH = '/home/company/shared'
if(args.path):
    PATH = args.path
    journal.send("PATH OVERWRITE: {}".format(PATH))

if(args.quiet):
    journal.send("QUIET NO-POPUPS mode")

FILE = path.join(PATH, FILENAME)
FILEXZ = FILE + ".tar.xz"
now = datetime.now()
OUTPUT = path.join(PATH, 'backup_{:%Y%m%d_%H%M%S}.backup'.format(now))
CHECKSUM_FILE = FILENAME + ".checksum"

error_msg_tail = "Ejecuta $ journalctl -u backup.service para saber más"

LSOF_CMD = ["fuser", FILE]
XZ_CMD = ["tar", "-cJC", PATH, "-f", FILEXZ, FILENAME, CHECKSUM_FILE]
GPG_CMD = ["gpg", "-q", "--batch", "--yes", "-e", "-r", "backup", "-o", OUTPUT, FILEXZ]

error = ""


# Main
display_alert('Empezando la copia de seguridad: {:%Y-%m-%d %H:%M:%S}\n\n'
              'NO apagues el ordenador todavia por favor'.format(now))

# sanity file exists
if(path.exists(FILE)):
    journal.send(
        "New file {} detected. Trying to generate {}".format(FILE, OUTPUT))
else:
    exit("{} not found. Aborting".format(FILE))

# make sure file finished being copied
finished_copy = False
while(not finished_copy):
    try:
        run(LSOF_CMD, check=True)
        journal.send(
            "File is still open somewhere. Waiting 1 extra second before processing")
        run("sleep 1".split())
    except CalledProcessError:
        finished_copy = True
    except Exception as e:
        display_alert(
            "ERROR\n{}\n\n{}".format(e, error_msg_tail), "error")
        exit(0)

filedate = datetime.fromtimestamp(path.getmtime(FILE))

# sanity date
if(now - timedelta(hours=1) > filedate):
    error = """El fichero que estas mandando se creó hace más de una hora.
fecha del fichero: {:%Y-%m-%d %H:%M:%S}
fecha actual     : {:%Y-%m-%d %H:%M:%S}

Comprueba que es el correcto
""".format(filedate, now)

# Generate checksum file
csum = md5sum(FILE)
journal.send(".mdb md5: {} {}".format(csum, FILENAME))

# Write it next to the database: tar runs with -C PATH and looks for it there
with open(path.join(PATH, CHECKSUM_FILE), "w") as f:
    f.write(csum)
    f.write(" ")
    f.write(FILENAME)

# Compress
if(path.isfile(FILEXZ)):
    remove(FILEXZ)

journal.send("running XZ_CMD: {}".format(" ".join(XZ_CMD)))
# check=True: stop here if tar fails instead of hashing a broken archive
run(XZ_CMD, check=True)
csum = md5sum(FILEXZ)
journal.send(".tar.xz md5: {} {}".format(csum, FILEXZ))

# encrypt
journal.send("running GPG_CMD: {}".format(" ".join(GPG_CMD)))
# check=True: stop here if gpg fails instead of hashing a missing file
run(GPG_CMD, check=True)
csum = md5sum(OUTPUT)
journal.send(".gpg md5: {} {}".format(csum, OUTPUT))

remove(FILEXZ)

# sanity size
filesize = path.getsize(OUTPUT)
if(filesize < 5000000):
    error += """El fichero que estas mandando es menor de 5 MB
tamaño del fichero en bytes: ({})

Comprueba que es el correcto
""".format(filesize)

subjectstr = "Backup {}ok con fecha {:%Y-%m-%d %H:%M:%S}"
subject = subjectstr.format("NO " if error else "", now)
body = """Todo parece okay, pero no olvides comprobar que
el fichero salvado funciona bien por tu cuenta!
"""
if(error):
    body = error

with open(BODY, "w") as f:
    f.write(body)

journal.send("{} generated correctly".format(OUTPUT))
try:
    if(not args.no_mail):
        journal.send("Trying to send it to {}".format(TO))
        MAIL_CMD = ["mutt", "-a", OUTPUT, "-s", subject, "--", TO]

        if(version_info.minor < 6):
            run(MAIL_CMD, input=body, universal_newlines=True, check=True)
        else:
            run(MAIL_CMD, input=body, encoding="utf-8", check=True)
except Exception as e:
    display_alert(
        "ERROR al enviar el backup por correo:\n{}".format(e), "error")
else:
    later = datetime.now()
    took = later.replace(microsecond=0) - now.replace(microsecond=0)
    display_alert('Copia finalizada: {:%Y-%m-%d %H:%M:%S}\n'
                  'Ha tardado: {}\n\n'
                  'Ya puedes apagar el ordenador'.format(later, took))

finally:
    if(not args.keep and path.exists(OUTPUT)):
        journal.send("removing gpg:{}".format(OUTPUT))
        remove(OUTPUT)

unbackup.py

#!/usr/bin/env python3

from os import path, remove, sync, fork, _exit, environ
from subprocess import run, CalledProcessError
from glob import glob
from sys import exit
from systemd import journal
from hashlib import md5
import argparse


def display_alert(text, wtype="info"):
    if(not args.quiet):
        if(not fork()):
            env = environ.copy()
            env.update({'DISPLAY': ':0.0', 'XAUTHORITY':
                        '/home/{}/.Xauthority'.format(USER)})
            zenity_cmd = [
                'zenity', '--text={}'.format(text), '--no-markup', '--{}'.format(wtype), '--no-wrap']
            run(zenity_cmd, env=env)
            # the forked child only shows the dialog; the parent process does the clean up
            _exit(0)


def md5sum(fname):
    cs = md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            cs.update(chunk)
    return cs.hexdigest()


# Args Parser init
parser = argparse.ArgumentParser()
parser.add_argument(
    "-q", "--quiet", help="dont show pop ups", action="store_true")
parser.add_argument("-u", "--user", help="user to display the dialogs as")
parser.add_argument("-p", "--path", help="path of the file to unbackup")
parser.add_argument(
    "-k", "--keep", help="keep original file", action="store_true")
args = parser.parse_args()

# Globals
USER = 'company'
if(args.user):
    USER = args.user
    journal.send("USER OVERWRITE: {}".format(USER))

# same folder that unbackup.path watches
PATH = '/home/company/shared'
if(args.path):
    PATH = args.path
    journal.send("PATH OVERWRITE: {}".format(PATH))

if(args.quiet):
    journal.send("QUIET NO-POPUPS mode")

OUTPUT_FILE = 'database.mdb'
error_msg_tail = "Ejecuta $ journalctl -u unbackup.service para saber más"
CHECKSUM_FILE = OUTPUT_FILE + ".checksum"


# Main
try:
    input_file = glob(path.join(PATH, 'backup*.backup'))[0]
except IndexError as e:
    display_alert("ERROR\nEl fichero de backup no existe:\n{}\n\n{}".format(
        e, error_msg_tail), "error")
    exit(0)
except Exception as e:
    display_alert(
        "ERROR\n{}\n{}".format(e, error_msg_tail), "error")
    exit(0)
else:
    display_alert(
        "Se ha detectado {}. Empiezo a procesarlo".format(input_file))

    output_path = path.join(PATH, OUTPUT_FILE)
    output_pathxz = output_path + ".tar.xz"

    LSOF_CMD = ["fuser", input_file]
    GPG_CMD = ["gpg", "--batch", "-qdo", output_pathxz, input_file]
    XZ_CMD = ["tar", "-xf", output_pathxz]

# make sure file finished being copied. Systemd triggers this script as soon as the file name shows
try:
    finished_copy = False
    while(not finished_copy):
        try:
            run(LSOF_CMD, check=True)
            journal.send(
                "File is still open somewhere. Waiting 1 extra second before processing")
            run("sleep 1".split())
        except CalledProcessError:
            finished_copy = True
        except Exception as e:
            display_alert(
                "ERROR\n{}\n\n{}".format(e, error_msg_tail), "error")
            exit(0)

    csum = md5sum(input_file)
    journal.send(".gpg md5: {} {}".format(csum, input_file))

    if(path.exists(output_pathxz)):
        journal.send("{} detected. Removing".format(output_pathxz))
        remove(output_pathxz)

    journal.send("running GPG_CMD: {}".format(" ".join(GPG_CMD)))
    run(GPG_CMD, check=True)

    csum = md5sum(output_pathxz)
    journal.send(".tar.xz md5: {} {}".format(csum, output_pathxz))

    journal.send("running XZ_CMD: {}".format(" ".join(XZ_CMD)))
    run(XZ_CMD, check=True)

    # Check checksum
    with open(CHECKSUM_FILE) as f:
        target_cs, filename = f.read().strip().split()
    actual_cs = md5sum(filename)
    journal.send(".mdb md5: {} {}".format(actual_cs, filename))
    if(target_cs == actual_cs):
        journal.send("El checksum interno final es correcto!")
    else:
        display_alert("ERROR\n"
                      "Los checksums de {} no coinciden.\n"
                      "Esto significa que el fichero esta dañado"
                      .format(filename), "error")
        # stop here, like the other error paths, so the success pop-up is not shown
        exit(0)

except Exception as e:
    display_alert("ERROR\n{}\n\n{}"
                  .format(e, error_msg_tail), "error")
    exit(0)
else:
    display_alert("{} generado con exito".format(output_path))
finally:
    if(not args.keep and path.exists(input_file)):
        journal.send("CLEAN UP: removing gpg {}".format(input_file))
        # make sure the file is not open before trying to remove it
        sync()
        remove(input_file)
        # sync so systemd dont detect the file again after finishing the script
        sync()

backup.path

[Unit]
Description=Carpeta Compartida backup

[Path]
PathChanged=/home/company/shared/database.mdb
Unit=backup.service

[Install]
WantedBy=multi-user.target

backup.service

[Unit]
Description=backup service

[Service]
Type=forking
ExecStart=/root/backup/backup.py
TimeoutSec=600

unbackup.path

[Unit]
Description=Unbackup shared folder

[Path]
PathExistsGlob=/home/company/shared/backup*.backup
Unit=unbackup.service

[Install]
WantedBy=multi-user.target

unbackup.service

[Unit]
Description=Unbackup service

[Service]
Type=forking
Environment=DISPLAY=:0.0
ExecStart=/root/company/unbackup.py