diy

Latest posts related to :

How to clone a server using just rsync

Mon 02 December 2019 | comments?
In the past I needed more space on the server, so I had to upgrade it to a more expensive plan, with no option of going back

Now the basic server option is cheaper and is enough for me. Plus there were some Black Friday discounts :)

So I decided to move the server with all my services to a cheaper option and save 75% of what I was spending, with more or less the same features

Unfortunately, this is not supported by default and there's no one-button way to do it. Fortunately, this is very easy to do using Linux!

This is how I did it in 6 easy steps:

Step 1
- Reboot both machines using a live image and have a working SSH server on the target server
- Mount the server disk on /mnt on both servers
Step 2
- rsync -AHXavP --numeric-ids --exclude='/dev/*' --exclude='/proc/*' --exclude='/sys/*' /mnt/ root@ip.dest.server:/mnt/
Step 3
- ssh into the target server. Bind-mount /proc, /dev and /sys under /mnt/ and chroot into it
- grub-install /dev/sdb && update-grub
- ack ip.orig.server /etc/ and change it where appropriate
- reboot
Step 4
- Change DNS
Step 5
- ????
Step 6
- Profit!
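
Steps 2 and 3 can be sketched as a small Python helper that only builds the command lines, so they can be reviewed before running anything as root. This is a sketch, not what was run for the migration: the function names and the `192.0.2.10` address are made up for illustration, and the excludes are written relative to the transfer root (`/dev/*` rather than `/mnt/dev`), because that is how rsync anchors patterns starting with `/`.

```python
import shlex


def rsync_command(dest_host, mount_point="/mnt"):
    """Build the step 2 rsync call as an argument list (hypothetical helper)."""
    src = mount_point.rstrip("/") + "/"
    return [
        "rsync", "-AHXavP", "--numeric-ids",
        # patterns starting with / are anchored at the transfer root (src)
        "--exclude=/dev/*", "--exclude=/proc/*", "--exclude=/sys/*",
        src, f"root@{dest_host}:{src}",
    ]


def chroot_commands(boot_disk, mount_point="/mnt"):
    """Build the step 3 bind mounts and grub calls (hypothetical helper)."""
    cmds = [["mount", "--bind", f"/{d}", f"{mount_point}/{d}"]
            for d in ("proc", "dev", "sys")]
    cmds.append(["chroot", mount_point, "grub-install", boot_disk])
    cmds.append(["chroot", mount_point, "update-grub"])
    return cmds


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; use the real target IP
    print(shlex.join(rsync_command("192.0.2.10")))
    for cmd in chroot_commands("/dev/sdb"):
        print(shlex.join(cmd))
```

Nothing is executed here; the printed lines can be pasted into a shell once they look right.
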
Conclusion
It took a couple of hours to do the whole thing, including buying the new server, and everything seems to be working as if nothing happened. Copying directly from server to server helped with the downtime too. Ain't Linux wonderful?

Get a nearly fresh debian install without reinstalling

Thu 26 September 2019 | comments?
I was recently asked how to get rid of old and unused packages without having to reinstall

Debian has the mechanisms to deal with this and more. Unfortunately for newcomers, it's not as automated as I would like, and a little more obscure

Anyway, here's what I would do:
```
# apt-mark showmanual
# apt-mark auto <packages you dont recognize>
# apt purge <packages you recognize but dont want anymore>
# apt autoremove --purge
```
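
To decide which packages to hand to `apt-mark auto`, it can help to compare the `apt-mark showmanual` output against a list of packages you know you want. A minimal sketch, assuming a hand-written keep list (the helper names and the package names are only examples):

```python
import subprocess


def manual_packages():
    """Return the raw output of apt-mark showmanual (one package per line)."""
    result = subprocess.run(
        ["apt-mark", "showmanual"], capture_output=True, text=True, check=True
    )
    return result.stdout


def unrecognized(showmanual_output, keep):
    """Return manually installed packages that are not in the keep list."""
    manual = {line.strip() for line in showmanual_output.splitlines() if line.strip()}
    return sorted(manual - set(keep))
```

On a Debian machine, `unrecognized(manual_packages(), keep=["openssh-server", "vim"])` would list the candidates to review by hand before marking them as auto.
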

Automating the creation of a static website

tl;dr:

python can turn tedious work into free time!

New website about hiking in Extremadura, Spain: extremaruta.es

PHP websites were on the rise a few years ago, mainly due to the rise of easy CMSs like Drupal and Joomla. Their main problem is that they carry a high maintenance cost compared with a static website. You have to keep them up to date and there are new exploits every other week

I was presented with a PHP website that had been hacked a long time ago and had to be taken down, because there was no way to clean it up and there was no clean copy anywhere. The only reason they were using a PHP website was that it was “easy” upfront, but they never really thought it through and they didn't really need anything dynamic, like users

One of the perks of static websites is that they are virtually impossible to hack, and in case they are (probably because something else has been hacked and it gets affected), you can have them up again somewhere else in a matter of minutes

So off we go to turn the original data into a website. I chose my preferred static website generator, Pelican, and then wrote a few Python scripts that mostly spew markdown (so not even a Pelican-specific generator!)

They scan a directory with photos, .gpx and .pdf files, generate the markdown, and figure out where everything belongs and what is part of the website from the names of the files

The major challenge was to reduce run times, because there are almost 10GB of data that have to be processed and it would have been very tedious to debug otherwise. Thumbnails have to be generated, watermarks added, the scripts have to decide whether something new has been added to the original data, etc… Anything that gets done has to go through 10GB of data
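
One common way to avoid redoing work on every run is to compare modification times and only regenerate a derived file (a thumbnail, for instance) when it is missing or older than its source. This is a minimal sketch of that idea, not the approach the scripts below take (they keep a JSON cache of the scanned files); `needs_update` is a made-up name:

```python
import os


def needs_update(src, dst):
    """True when dst is missing or older than src (hypothetical helper)."""
    if not os.path.exists(dst):
        return True
    return os.path.getmtime(src) > os.path.getmtime(dst)
```

A thumbnail step would then wrap its expensive call in `if needs_update(photo, thumb): ...`.
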

#!/usr/bin/python3
"""
process.py

    Moves files around and triggers the different processes
    in the order they need to run, both for testing and for production
"""

import routes
from shutil import move
from subprocess import run
from os.path import join, exists


def sync_files():
    orig = join(routes.OUTPUT, "")
    dest = join(routes.FINAL_OUTPUT, "")
    linkdest = join("..", "..", orig)
    command = ("rsync", "-ah", "--delete",
               "--link-dest={}".format(linkdest), orig, dest)
    reallinkdest = join(dest, linkdest)
    if(exists(reallinkdest)):
        #print("{} exists".format(reallinkdest))
        run(command)
    else:
        print("{} doesnt exist".format(reallinkdest))
        print("its very likely the command is wrong:\n{}".format(command))
        exit(1)


def test_run():
    f = '.files_cache_files'
    if(exists(f)):
        #move(f, 'todelete')
        pass
    r = routes.Routes("real.files")
    # print(r)
    r.move_files()
    r.generate_markdown()

    sync_files()


def final_run():
    r = routes.Routes("/media/usb/web/")
    # print(routes)
    r.move_files()
    r.generate_markdown()

    sync_files()


test_run()
# final_run()

#!/usr/bin/python3
"""
routes.py

    Generate the different information and intermediate cache files so it doesnt
    have to process everything every time
"""

try:
    from slugify import slugify
except ImportError as e:
    print(e)
    print("Missing module. Please install python3-slugify")
    exit()

from pprint import pformat
from shutil import copy
from os.path import join, exists, basename, splitext
import os
import re
import json

# original files path
ORIG_BASE = "/media/usb0/web/"
ORIG_BASE = "files"
ORIG_BASE = "real.files"
# relative dest to write content
OUTPUT = join("content", "auto", "")
# relative dest pdf and gpx
STATIC = join("static", "")
FULL_STATIC = join("auto", "static", "")
# relative photos dest
PHOTOS = join("photos", "")
# relative markdown dest
PAGES = join("rutas", "")
# relative banner dest
BANNER = join(PHOTOS, "banner", "")
# absolute dests
BASE_PAGES = join(OUTPUT, PAGES, "")
BASE_STATIC = join(OUTPUT, STATIC, "")
BASE_PHOTOS = join(OUTPUT, PHOTOS, "")
BASE_BANNER = join(OUTPUT, BANNER, "")

TAGS = 'tags.txt'

# Where to copy everything once its generated
FINAL_OUTPUT = join("web", OUTPUT)

def hard_link(src, dst):
    """Tries to hard link and copy it instead where it fails"""
    try:
        os.link(src, dst)
    except OSError:
        copy(src, dst)

def sanitize_name(fpath):
    """ returns sane file names: '/á/b/c áD.dS' -> c-ad.ds"""
    fname = basename(fpath)
    split_fname = splitext(fname)
    name = slugify(split_fname[0])
    ext = slugify(split_fname[1]).lower()
    return ".".join((name, ext))

class Routes():
    pdf_re = re.compile(r".*/R(\d{1,2}).*(?:PDF|pdf)$")
    gpx_re = re.compile(r".*/R(\d{1,2}).*(?:GPX|gpx)$")
    jpg_re = re.compile(r".*/\d{1,2}R(\d{1,2}).*(?:jpg|JPG)$")
    banner_re = re.compile(r".*BANNER/Etiquetadas/.*(?:jpg|JPG)$")

    path_re = re.compile(r".*PROVINCIA DE (.*)/\d* (.*)\ (?:CC|BA)/.*")

    def __getitem__(self, item):
        return self.__routes__[item]

    def __iter__(self):
        return iter(self.__routes__)

    def __str__(self):
        return pformat(self.__routes__)

    def __init__(self, path):
        self.__routes__ = {}
        self.__files__ = {}

        self.fcache = ".files_cache_" + slugify(path)

        if(exists(self.fcache)):
            print(f"Using cache to read. {self.fcache} detected:")
            self._read_files_cache()
        else:
            print(f"No cache detected. Reading from {path}")
            self._read_files_to_cache(path)

    def _init_dir(self, path, create_ruta_dirs=True):
        """ create dir structure. Returns True if it had to create"""
        created = True

        if(exists(path)):
            print(f"{path} exist. No need to create dirs")
            created = False
        else:
            print(f"{path} doesnt exist. Creating dirs")
            os.makedirs(path)
            if(create_ruta_dirs):
                self._create_ruta_dirs(path)

        return created

    def _create_ruta_dirs(self, path):
        """Create structure of directories in <path>"""
        for prov in self.__routes__:
            prov_path = join(path, slugify(prov))
            if(not exists(prov_path)):
                os.makedirs(prov_path)
            for comar in self.__routes__[prov]:
                comar_path = join(prov_path, slugify(comar))
                if(not exists(comar_path)):
                    os.makedirs(comar_path)
                # Special case for BASE_PAGES. Dont make last ruta folder
                if(path != BASE_PAGES):
                    for ruta in self.__routes__[prov].get(comar):
                        ruta_path = join(comar_path, ruta)
                        if(not exists(ruta_path)):
                            os.makedirs(ruta_path)

    def _read_files_cache(self):
        with open(self.fcache) as f:
            temp = json.load(f)
        self.__routes__ = temp['routes']
        self.__files__ = temp['files']

    def _read_files_to_cache(self, path):
        """read files and tags from path into memory. Also writes the cache file"""
        for root, subdirs, files in os.walk(path):
            for f in files:

                def append_ruta_var(match, var_name):
                    prov, comar = self._get_prov_comar(root)
                    ruta = match.group(1).zfill(2)
                    var_path = join(root, f)
                    r = self._get_ruta(prov, comar, ruta)
                    r.update({var_name: var_path})

                def append_ruta_pic(match):
                    prov, comar = self._get_prov_comar(root)
                    ruta = match.group(1).zfill(2)
                    pic_path = join(root, f)
                    r = self._get_ruta(prov, comar, ruta)
                    pics = r.setdefault('pics', list())
                    pics.append(pic_path)

                def pdf(m):
                    append_ruta_var(m, 'pdf_orig')

                def gpx(m):
                    append_ruta_var(m, 'gpx_orig')

                def append_banner(m):
                    pic_path = join(root, f)
                    banner = self.__files__.setdefault('banner', list())
                    banner.append(pic_path)

                regexes = (
                    (self.banner_re, append_banner),
                    (self.pdf_re, pdf),
                    (self.gpx_re, gpx),
                    (self.jpg_re, append_ruta_pic),
                )

                for reg, func in regexes:
                    try:
                        match = reg.match(join(root, f))
                        if(match):
                            func(match)
                            break
                        # else:
                        #    print(f"no match for {root}/{f}")
                    except Exception:
                        print(f"Not sure how to parse this file: {f}")
                        print(f"r: {root}\ns: {subdirs}\nf: {files}\n\n")

        self._read_tags()

        temp = dict({'routes': self.__routes__, 'files': self.__files__})
        with open(self.fcache, "w") as f:
            json.dump(temp, f)

    def _read_tags(self):
        with open(TAGS) as f:
            for line in f.readlines():
                try:
                    ruta, short_name, long_name, tags = [
                        p.strip() for p in line.split(":")]
                    prov, comar, number, _ = ruta.split("/")
                    r = self._get_ruta(prov, comar, number)
                    r.update({'short': short_name})
                    r.update({'long': long_name})
                    # strip the spaces left around each comma-separated tag
                    final_tags = [t.strip() for t in tags.split(",")]
                    r.update({'tags': final_tags})
                except ValueError:
                    pass

    def _get_prov_comar(self, path):
        pathm = self.path_re.match(path)
        prov = pathm.group(1)
        comar = pathm.group(2)

        return prov, comar

    def _get_ruta(self, prov, comar, ruta):
        """creates the intermediate dicts if needed"""

        prov = slugify(prov)
        comar = slugify(comar)

        p = self.__routes__.get(prov)
        if(not p):
            self.__routes__.update({prov: {}})

        c = self.__routes__.get(prov).get(comar)
        if(not c):
            self.__routes__.get(prov).update({comar: {}})

        r = self.__routes__.get(prov).get(comar).get(ruta)
        if(not r):
            self.__routes__.get(prov).get(comar).update({ruta: {}})

        r = self.__routes__.get(prov).get(comar).get(ruta)
        return r

    def move_files(self):
        """move misc (banner) and ruta related files (not markdown) to OUTPUT"""
        self._move_ruta_files()
        # misc have to be moved after ruta files, because the folder
        # inside photos prevents ruta photos to be moved
        self._move_misc_files()

    def _move_misc_files(self):
        if (self._init_dir(BASE_BANNER, False)):
            print("moving banner...")

            for f in self.__files__['banner']:
                hard_link(f, join(BASE_BANNER, sanitize_name(f)))

    def _move_ruta_files(self):
        """move everything ruta related: static and photos(not markdown)"""
        create_static = False
        create_photos = False

        if (self._init_dir(BASE_STATIC)):
            print("moving static...")
            create_static = True

        if (self._init_dir(BASE_PHOTOS)):
            print("moving photos...")
            create_photos = True

        for prov in self.__routes__:
            for comar in self.__routes__[prov]:
                for ruta in self.__routes__[prov].get(comar):
                    r = self.__routes__[prov].get(comar).get(ruta)
                    fbase_static = join(
                        BASE_STATIC, prov, slugify(comar), ruta)
                    fbase_photos = join(
                        BASE_PHOTOS, prov, slugify(comar), ruta)

                    def move_file(orig, dest):
                        whereto = join(dest, sanitize_name(orig))
                        hard_link(orig, whereto)

                    if(create_static):
                        for fkey in ("pdf_orig", "gpx_orig"):
                            if(fkey in r):
                                move_file(r[fkey], fbase_static)

                    if(create_photos and ("pics") in r):
                        for pic in r["pics"]:
                            move_file(pic, fbase_photos)

    def generate_markdown(self):
        """Create markdown in the correct directory"""
        self._init_dir(BASE_PAGES)
        for prov in self.__routes__:
            for comar in self.__routes__[prov]:
                for ruta in self.__routes__[prov].get(comar):
                    r = self.__routes__[prov].get(comar).get(ruta)
                    pages_base = join(
                        BASE_PAGES, prov, slugify(comar))
                    fpath = join(pages_base, f"{ruta}.md")

                    photos_base = join(prov, slugify(comar), ruta)
                    static_base = join(
                        FULL_STATIC, prov, slugify(comar), ruta)

                    with open(fpath, "w") as f:
                        title = "Title: "
                        if('long' in r):
                            title += r['long']
                        else:
                            title += f"{prov} - {comar} - Ruta {ruta}"
                        f.write(title + "\n")
                        f.write(f"Path: {ruta}\n")
                        f.write("Date: 2018-01-01 00:00\n")
                        if('tags' in r):
                            f.write("Tags: {}".format(", ".join(r['tags'])))
                            f.write("\n")
                        f.write("Gallery: {photo}")
                        f.write(f"{photos_base}\n")

                        try:
                            fpath = join("/", static_base, sanitize_name(r['pdf_orig']))
                            f.write( f'Pdf: {fpath}\n')
                        except KeyError:
                            f.write('Esta ruta no tiene descripcion (pdf)\n\n')


                        try:
                            fpath = join("/", static_base, sanitize_name(r['gpx_orig']))
                            f.write(f"Gpx: {fpath}\n")
                        except KeyError:
                            f.write('Esta ruta no tiene coordenadas (gpx)\n\n')


                        if('pics' not in r):
                            f.write('Esta ruta no tiene fotos\n\n')



if __name__ == "__main__":
    routes = Routes(ORIG_BASE)
    # print(routes)
    print("done reading")
    routes.move_files()
    routes.generate_markdown()
    print("done writing")
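
To see how the file names drive everything, here is how the regular expressions from `Routes` pick apart one path. The path is invented for illustration (the real directory layout is not shown in this post); the two patterns are copied from the class above:

```python
import re

pdf_re = re.compile(r".*/R(\d{1,2}).*(?:PDF|pdf)$")
path_re = re.compile(r".*PROVINCIA DE (.*)/\d* (.*)\ (?:CC|BA)/.*")

# hypothetical example path, following the naming the patterns expect
root = "real.files/PROVINCIA DE CACERES/1 SIERRA DE GATA CC/PDF"
fname = "R7 Ruta.pdf"

# the route number comes from the file name, padded to two digits
ruta = pdf_re.match(root + "/" + fname).group(1).zfill(2)
# province and comarca come from the directory the file lives in
prov, comar = path_re.match(root).groups()

print(prov, comar, ruta)  # CACERES SIERRA DE GATA 07
```

In `routes.py` the province and comarca are additionally passed through `slugify` before being used as dictionary keys and directory names.
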

New web Design

Wed 21 November 2018 | comments?

I hope you like it. I remade the theme from scratch using CSS grid to make the website responsive. Responsive means that it adapts to the screen size, so it works both on large screens and on phones

I tried to keep the old look as much as possible. So in the worst case scenario, you won't notice any change :)

Web technologies have really come a long way. It used to be a nightmare. Nowadays, dare I say, it is a pleasure and fun to make your own design

You have no repos online!

Wed 14 February 2018 | comments?

Git is life. I manage my own git server at git.alberto.tf, but it's now set to private, open only to committers

The main reason is that, more often than not, I'm the only committer. That allows me to be carefree about using time-based commits (as opposed to feature-based commits, as god intended) when I need to. For example, to move my work from one computer to another, etc…

I'm also less careful about clumping together typos, features and fixes in the same commit

Having so many time-based commits means the repos themselves don't add much to the code I already show elsewhere. So I decided not to expose the repos by default. You are still free to use any code you find on this site

The other reason is metadata

It avoids awkward faces when I have to explain a commit at 4am. – “I thought you were sick!” :)

Get updates from this blog!

Wed 26 April 2017 | comments?

Have you ever seen this icon and wondered what it is?

You can see this referred to as RSS, Atom or just the feed of a website. They all refer to the same thing: getting updates from this site without having to create an account

It's likely you have never heard of this tech because it has been suppressed and practically eliminated from all mainstream websites, despite being simple and functional. Google had one of the best RSS readers and they closed it

The reason for this is that they want to be your only stop for information. You might not think much about it, but your attention is the most valuable asset right now… for you and for them. They can try to push information (propaganda or ads) on you like you are some kind of kid who doesn't know better

Luckily, this tech is so simple that it is likely never going to die, as long as a single reader and a single feed poster exist on the planet :)

This is an example of what a feed looks like and this is a list with all my feeds

There's a plethora of offline readers out there, both for computers and cellphones, and you only have to copy the feed link into one. Ask me if you have trouble making it work or finding one that you like

I host my own copy of tt-rss, an online reader, and it allows multiple users. If you know me, just ask me for a user and I will happily create one for you, so you don't need to host your own (an online reader lets you sync all of your devices, for example)
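
A feed is just a small XML document, which is why readers are so easy to write. As a rough sketch, this parses a made-up Atom feed (the sample entries and `example.org` URLs are invented) with nothing but the Python standard library:

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# invented sample feed; a real one would be downloaded from the feed link
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <entry>
    <title>Second post</title>
    <link href="https://example.org/second-post.html"/>
    <updated>2017-04-26T10:00:00Z</updated>
  </entry>
  <entry>
    <title>First post</title>
    <link href="https://example.org/first-post.html"/>
    <updated>2017-04-20T10:00:00Z</updated>
  </entry>
</feed>"""


def list_entries(feed_xml):
    """Return (title, link) pairs for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title")
        link = entry.find(ATOM + "link").get("href")
        entries.append((title, link))
    return entries


for title, link in list_entries(SAMPLE_FEED):
    print(title, link)
```
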

Automating the extraction of duplicated zip files

It's not that well known that a zip file does not store a directory structure inside. It stores a sequence of files, and nothing prevents those file names from being duplicated inside one archive
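
This is easy to check with Python's `zipfile` module. The sketch below builds a small archive in memory with the same name stored twice (the file name and contents are made up) and lists what ends up inside:

```python
import io
import warnings
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    with warnings.catch_warnings():
        # zipfile warns about the duplicate name but still writes it
        warnings.simplefilter("ignore")
        z.writestr("notes.txt", "first version")
        z.writestr("notes.txt", "second version")

with zipfile.ZipFile(buf) as z:
    names = z.namelist()
    # reading by ZipInfo (not by name) reaches every copy
    contents = [z.read(info).decode() for info in z.infolist()]

print(names)
print(contents)
```

Both entries survive in the archive; it is only at extraction time, when both map to the same path on disk, that one of them gets lost.
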

All the tools I've checked out silently overwrite the duplicates or make you rename them manually, which gets very tedious as soon as you have to do this a few times with lots of duplicated files

I had to bake my own solution using Python. If you know about a tool that does this, please let me know. I love to deprecate my own solutions :)

unzip_rename_dups.py

#!/usr/bin/env python3
import sys
import zipfile
from os.path import splitext, dirname, abspath
from os import rename


ZIP = sys.argv[1]
DIR = dirname(abspath(ZIP))

# how many times each file name has been seen so far
filenames = {}
extracted = 0
dups = 0

with zipfile.ZipFile(ZIP) as z:

    for info in z.infolist():
        # directory entries hold no data and must not be renamed
        if info.is_dir():
            continue

        # extract() returns the path it actually wrote to
        orig_path = z.extract(info, DIR)
        extracted += 1

        fn = info.filename

        if fn not in filenames:
            filenames[fn] = 1
        else:
            filenames[fn] += 1
            dups += 1

        # put the occurrence number before the extension: a.txt -> a1.txt
        preext, postext = splitext(orig_path)
        final_path = preext + str(filenames[fn]) + postext

        rename(orig_path, final_path)

print("{} files extracted successfully. {} duplicated files saved!".format(extracted, dups))

Automatize wildcard cert renewal

problem definition

I host one instance of Sandstorm. I'd like to use my own domain AND HTTPS

Sandstorm uses a new unguessable throw-away host name for every session as part of its security strategy, so in order to host your own instance under your own domain, you need a wildcard DNS entry and a wildcard cert for it (a cert for *.yourdomain that will be valid for all your subdomains)

I use certbot (aka Let's Encrypt) to generate my certificates. Unfortunately, they have stated that they will not issue wildcard certificates. Not now, and very likely, not in the future

Sandstorm offers a free DNS service using sandcats.io with batteries included (free wildcard cert). But this makes the whole site look like it is not running under your control when you share a link with a third party (even though that is not true). Since this is one of the main points of running my own instance, this solution is not suitable for me

For reasons that deserve their own rant, I will not buy a wildcard cert

This only left me with the option of running Sandstorm on a local port and having my Apache proxy the requests and present the right certs. I will be using the sandcats.io DNS + wildcard cert for websockets, which are virtually invisible to the end user

The certbot cert renewal is easy enough to automate, but I also need to automate the renewal of the sandcats.io cert, which lasts for 9 days

solution

A service will run weekly to renew the cert. For this, it temporarily swaps in a fake configuration that uses one of those free sandcats.io certs, so Sandstorm renews the cert. It then parses the new cert and tells Apache to use it

shortcomings

Disclaimer: This setup is not officially supported by sandstorm

The reason is that some apps don't work well due to some browsers' security policies. Just like the Sandstorm guys, I had to make a compromise. The stuff I use works for me, and I have to test anything new before I use it :)

code

updatecert.py

#!/usr/bin/env python3
import json
from subprocess import call,check_call
from glob import glob
from shutil import copy
from time import sleep
from timeout import timeout

TIMEOUT = 120

SSPATH = '/opt/sandstorm'
CONF = SSPATH + '/sandstorm.conf'
GOODCONF = SSPATH + '/sandstorm.good.conf'
CERTCONF = SSPATH + '/sandstorm.certs.conf'
CERTSPATH = SSPATH + '/var/sandcats/https/server.sandcats.io/'
APACHECERT = '/etc/apache2/tls/cert'
APACHECERTPUB = APACHECERT + '.crt'
APACHECERTKEY = APACHECERT + '.key'

RESTART_APACHE_CMD = 'systemctl restart apache2'.split()
RESTART_SS_CMD = 'systemctl restart sandstorm'.split()

@timeout(TIMEOUT, "ERROR: Cert didnt renew in {} secs".format(TIMEOUT))
def check_cert_reply(files_before):
    found = None
    print("waiting for new cert in " + CERTCONF, end="")
    while not found:
        print(".", end="", flush=True)
        sleep(5)
        files_after = set(glob(CERTSPATH + '*.response-json'))

        found = files_after - files_before
    else:
        print("")
    return found.pop()

def renew_cert():
    files_before = set(glob(CERTSPATH + '*.response-json'))
    copy(CERTCONF, CONF)
    call(RESTART_SS_CMD)
    try:
        new_cert = check_cert_reply(files_before)
    finally:
        print("Restoring sandstorm conf and restarting it")
        copy(GOODCONF, CONF)
        call(RESTART_SS_CMD)
        print("Restoring done")
    return new_cert

def parse_cert(certfile):
    with open(certfile) as f:
        certs = json.load(f)

    with open(APACHECERTPUB, 'w') as cert:

        cert.write(certs['cert'])

        ca = certs['ca']
        ca.reverse()
        for i in ca:
            cert.write('\n')
            cert.write(i)

    copy(certfile[:-len('.response-json')], APACHECERTKEY)

if __name__ == '__main__':
    new_cert = renew_cert()
    parse_cert(new_cert)
    try:
        check_call(RESTART_APACHE_CMD)
    except Exception:
        # one reason for apache to fail is trying to parse the json before it is completely written
        # try once again just in case
        print("failed to restart apache with the new cert. Trying once more")
        sleep(1)
        parse_cert(new_cert)
        call(RESTART_APACHE_CMD)
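
The `timeout` module imported by `updatecert.py` is not part of the standard library and is not shown in this post. As a rough guess at what such a decorator could look like, here is a sketch built on `signal.alarm` that accepts the same `(seconds, message)` arguments. It is an assumption, not the module actually used, and it only works on Unix and in the main thread:

```python
import signal
from functools import wraps


def timeout(seconds, message="timed out"):
    """Raise TimeoutError(message) if the decorated call runs too long."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            def handler(signum, frame):
                raise TimeoutError(message)

            # install our handler and arm the alarm for this call only
            old_handler = signal.signal(signal.SIGALRM, handler)
            signal.alarm(seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # always disarm the alarm and restore the previous handler
                signal.alarm(0)
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator
```

Saved as `timeout.py` next to the script, something like this would satisfy the `from timeout import timeout` line.
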

updatecert.service

[Unit]
Description=tries to renew ss cert
OnFailure=status-email-admin@%n.service

[Service]
Type=oneshot
ExecStart=/root/updatecert.py

updatecert.timer

[Unit]
Description=runs ss cert renewal once a week

[Timer]
Persistent=true
OnCalendar=weekly
Unit=updatecert.service

[Install]
WantedBy=default.target

Look at that nice looking FreedomBox!

Fri 22 April 2016 | comments?

I’m rebuilding my home server and decided to take a look at the FreedomBox project as the base for it

Version 0.6 was recently released and I wasn't aware of how advanced the project already is!

They have a VirtualBox image ready for some quick tests. It took me longer to download it than to start using it

Here’s a pic of what it looks like to entice you to try it :)

All this is already in Debian right now and you can turn any Debian sid installation into a FreedomBox just by installing a package

The setup generates everything private on the first run, so even the VirtualBox image can be used as the final thing

They use Plinth (Django) to integrate the applications into the web interface. More info on how to help integrate more Debian packages here

A live demo is going to be streamed this Friday and a hackathon is scheduled for this Saturday

Cheers!

Original post at Laura Arjona’s Blog on 30 October 2015. Thanks for first hosting it!
