the f*ck rants about stuff

tech

Latest posts related to tech:



  1. How to clone a server using just rsync

    In the past I needed more space on the server, so I had to upgrade it to a more expensive option, with no option of going back

    Now the basic server option is cheaper and is enough for me. Plus there were some Black Friday discounts :)

    So I decided to move the server with all my services to a cheaper option and save 75% of what I was spending, with more or less the same features

    Unfortunately, this is not supported by default and there's no one-button way to do it. Fortunately, this is very easy to do using Linux!

    People fighting over products in black friday fashion

    This is how I did it in 6 easy steps:

    Step 1

    • Reboot both machines using a live image and have a working ssh server on the target server
    • On both servers, mount the server disk on /mnt

    Step 2

    • rsync -AHXavP --numeric-ids --exclude='/dev/*' --exclude='/proc/*' --exclude='/sys/*' /mnt/ root@ip.dest.server:/mnt/
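
    A copy like this is worth rehearsing before running it for real. The sketch below only builds the rsync argument list, so it can be printed and reviewed first. The function name and the dry_run switch are my own additions, and the exclude patterns in the example are written relative to the source directory, which is how rsync anchors patterns that start with /

```python
import shlex


def build_rsync_command(src, dest, excludes=(), dry_run=False):
    """Return the rsync argv for an archive-mode copy of src to dest."""
    cmd = [
        "rsync",
        "-AHXavP",        # ACLs, hard links, xattrs, archive, verbose, progress
        "--numeric-ids",  # keep uid/gid numbers instead of mapping user names
    ]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied, change nothing
    cmd += [f"--exclude={pattern}" for pattern in excludes]
    cmd += [src, dest]
    return cmd


if __name__ == "__main__":
    cmd = build_rsync_command(
        "/mnt/",
        "root@ip.dest.server:/mnt/",  # same placeholder host as in the step above
        excludes=("/dev/*", "/proc/*", "/sys/*"),
        dry_run=True,
    )
    # Only printed here; hand the list to subprocess.run(cmd, check=True) to run it
    print(shlex.join(cmd))
```

    Once the dry run looks right, drop dry_run=True and run the same list for the real copy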

    Step 3

    • ssh into the target server. Bind-mount /proc, /dev and /sys under /mnt/ and chroot into it
    • grub-install /dev/sdb && update-grub
    • ack ip.orig.server /etc/ and change it where appropriate
    • reboot
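
    If ack is not available inside the chroot, the search for the old address can be approximated in a few lines of Python. This is only a sketch: find_occurrences is a name I made up, and it does a plain substring search, not a regex one

```python
import os


def find_occurrences(root, needle):
    """Return (path, line_number, line) for every line under root containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip sockets, FIFOs and broken symlinks
            try:
                # errors="replace" keeps binary files from raising decode errors
                with open(path, errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((path, number, line.rstrip("\n")))
            except OSError:
                continue  # unreadable file, e.g. missing permissions
    return hits
```

    Calling find_occurrences("/etc", "ip.orig.server") with the real old address lists the files and lines that still need changing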

    Step 4

    • Change DNS

    Step 5

    • ????

    Step 6

    • Profit!

    Conclusion

    A couple of hours to do the whole thing, including buying the new server, and everything seems to be working as if nothing happened. Copying directly from server to server helped with the downtime too. Ain't Linux wonderful?
    
  2. Get a nearly fresh debian install without reinstalling

    I was recently asked how to get rid of old and unused packages without having to reinstall

    Debian has the mechanisms to deal with this and more. Unfortunately for new people, it's not as automated and is a little more obscure than I would like

    Anyway, here's what I would do:

    # apt-mark showmanual
    # apt-mark auto <packages you dont recognize>
    # apt purge <packages you recognize but dont want anymore>
    # apt autoremove --purge
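
    The tedious part is going through the showmanual list by hand. A small sketch of how that could be scripted, assuming you keep your own list of packages you recognize (the keep list and the function names are hypothetical, only the apt-mark subcommands come from the commands above):

```python
import subprocess


def manual_packages():
    """Return the set of packages apt currently marks as manually installed."""
    result = subprocess.run(
        ["apt-mark", "showmanual"],
        check=True, capture_output=True, text=True,
    )
    return set(result.stdout.split())


def unrecognized(manual, keep):
    """Packages marked manual that are not in the keep list, sorted by name."""
    return sorted(set(manual) - set(keep))


def auto_command(packages):
    """Build the apt-mark call that flags packages as automatically installed."""
    return ["apt-mark", "auto", *packages]
```

    Review the output of unrecognized(manual_packages(), keep) before passing it to auto_command, since anything marked auto becomes a candidate for the next autoremove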
    
  3. Automating the creation of a static website

    tl;dr:
    python can turn tedious work into free time!
    New website about hiking in Extremadura, Spain: extremaruta.es

    extremaruta website snapshot 1

    extremaruta website snapshot 2

    PHP websites were on the rise a few years ago, mainly due to the rise of easy CMSs like Drupal and Joomla. Their main problem is that they carry a high maintenance cost compared with a static website. You have to keep them up to date and there are new exploits every other week

    I was presented with this PHP website that had been hacked a long time ago and had to be taken down, because there was no way to clean it up and there was no clean copy anywhere. The only reason they were using a PHP website was that it was “easy” upfront, but they never really thought it through and they didn't really need anything dynamic, like users

    One of the perks of static websites is that they are virtually impossible to hack, and in case they are (probably because something else has been hacked and they get affected), you can have them up again somewhere else in a matter of minutes

    So off we go to turn the original data into a website. I chose my preferred static website generator, Pelican, and then wrote a few Python scripts that mostly spew markdown (so not even a Pelican-specific generator!)

    They scan a directory with photos, .gpx and .pdf files, generate the markdown, and figure out where everything belongs and what is part of the website from the names of the files
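
    To give an idea of how much the file names carry, here is a cut-down sketch of that classification. The four patterns are copied from routes.py further down. The classify function and the example paths are made up for illustration, and unlike routes.py it matches the province/comarca pattern against the full file path instead of the directory

```python
import re

# Patterns copied from routes.py
PDF_RE = re.compile(r".*/R(\d{1,2}).*(?:PDF|pdf)$")
GPX_RE = re.compile(r".*/R(\d{1,2}).*(?:GPX|gpx)$")
JPG_RE = re.compile(r".*/\d{1,2}R(\d{1,2}).*(?:jpg|JPG)$")
PATH_RE = re.compile(r".*PROVINCIA DE (.*)/\d* (.*)\ (?:CC|BA)/.*")


def classify(path):
    """Return (kind, province, comarca, route), or None if the path is not recognized."""
    location = PATH_RE.match(path)
    if location is None:
        return None
    for kind, regex in (("pdf", PDF_RE), ("gpx", GPX_RE), ("photo", JPG_RE)):
        match = regex.match(path)
        if match:
            route = match.group(1).zfill(2)  # "1" -> "01", as routes.py does
            return kind, location.group(1), location.group(2), route
    return None


if __name__ == "__main__":
    # Hypothetical paths, invented to fit the patterns
    examples = (
        "real.files/PROVINCIA DE CACERES/01 Sierra de Gata CC/R1 ruta.pdf",
        "real.files/PROVINCIA DE BADAJOZ/02 La Siberia BA/R12 track.GPX",
        "real.files/PROVINCIA DE CACERES/01 Sierra de Gata CC/01R1 mirador.jpg",
        "real.files/notes.txt",
    )
    for example in examples:
        print(example, "->", classify(example))
```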

    The major challenge was to reduce run times, because there's almost 10 GB of data that has to be processed and it would have been very tedious to debug otherwise. Thumbnails have to be generated, watermarks added, it has to decide if something new has been added to the original data, etc… Anything that gets done has to go through 10 GB of data
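
    The main trick for keeping runs short is to do the slow scan once and reuse the result. A stripped-down sketch of that idea, using only the standard library (the real routes.py below caches a richer structure, and the function names here are invented):

```python
import json
import os


def scan(path):
    """Walk path and return a sorted list of every file found (the slow step)."""
    found = []
    for root, _subdirs, files in os.walk(path):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)


def load_or_scan(path, cache_file):
    """Return the file list for path, reusing cache_file when it already exists."""
    if os.path.exists(cache_file):
        with open(cache_file) as handle:
            return json.load(handle)
    files = scan(path)
    with open(cache_file, "w") as handle:
        json.dump(files, handle)
    return files
```

    The price is that the cache never notices new files on its own: delete the cache file to force a fresh scan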

    #!/usr/bin/python3
    """
    process.py
    
        Moves files around and triggers the different processes
        in the order they need to run, both for testing and for production
    """
    
    import sys
    import routes
    from shutil import move
    from subprocess import run
    from os.path import join, exists
    
    
    def sync_files():
        """Mirror routes.OUTPUT into routes.FINAL_OUTPUT with rsync"""
        orig = join(routes.OUTPUT, "")
        dest = join(routes.FINAL_OUTPUT, "")
        # rsync resolves a relative --link-dest against the destination dir
        linkdest = join("..", "..", orig)
        command = ("rsync", "-ah", "--delete",
                   "--link-dest={}".format(linkdest), orig, dest)
        reallinkdest = join(dest, linkdest)
        if(exists(reallinkdest)):
            #print("{} exists".format(reallinkdest))
            run(command)
        else:
            print("{} doesnt exist".format(reallinkdest))
            print("its very likely the command is wrong:\n{}".format(command))
            sys.exit(1)
    
    
    def test_run():
        f = '.files_cache_files'
        if(exists(f)):
            #move(f, 'todelete')
            pass
        r = routes.Routes("real.files")
        # print(r)
        r.move_files()
        r.generate_markdown()
    
        sync_files()
    
    
    def final_run():
        r = routes.Routes("/media/usb/web/")
        # print(routes)
        r.move_files()
        r.generate_markdown()
    
        sync_files()
    
    
    test_run()
    # final_run()
    
    #!/usr/bin/python3
    """
    routes.py
    
        Generate the different information and intermediate cache files so it doesnt
        have to process everything every time
    """
    
    try:
        from slugify import slugify
    except ImportError as e:
        print(e)
        print("Missing module. Please install python3-slugify")
        exit()
    
    from pprint import pformat
    from shutil import copy
    from os.path import join, exists, basename, splitext
    import os
    import re
    import json
    
    # original files path (earlier values kept for reference)
    # ORIG_BASE = "/media/usb0/web/"
    # ORIG_BASE = "files"
    ORIG_BASE = "real.files"
    # relative dest to write content
    OUTPUT = join("content", "auto", "")
    # relative dest pdf and gpx
    STATIC = join("static", "")
    FULL_STATIC = join("auto", "static", "")
    # relative photos dest
    PHOTOS = join("photos", "")
    # relative markdown dest
    PAGES = join("rutas", "")
    # relative banner dest
    BANNER = join(PHOTOS, "banner", "")
    # absolute dests
    BASE_PAGES = join(OUTPUT, PAGES, "")
    BASE_STATIC = join(OUTPUT, STATIC, "")
    BASE_PHOTOS = join(OUTPUT, PHOTOS, "")
    BASE_BANNER = join(OUTPUT, BANNER, "")
    
    TAGS = 'tags.txt'
    
    # Where to copy everything once its generated
    FINAL_OUTPUT = join("web", OUTPUT)
    
    def hard_link(src, dst):
        """Tries to hard link and copy it instead where it fails"""
        try:
            os.link(src, dst)
        except OSError:
            copy(src, dst)
    
    def sanitize_name(fpath):
        """ returns sane file names: '/á/b/c áD.dS' -> c-ad.ds"""
        fname = basename(fpath)
        split_fname = splitext(fname)
        name = slugify(split_fname[0])
        ext = slugify(split_fname[1]).lower()
        return ".".join((name, ext))
    
    class Routes():
        pdf_re = re.compile(r".*/R(\d{1,2}).*(?:PDF|pdf)$")
        gpx_re = re.compile(r".*/R(\d{1,2}).*(?:GPX|gpx)$")
        jpg_re = re.compile(r".*/\d{1,2}R(\d{1,2}).*(?:jpg|JPG)$")
        banner_re = re.compile(r".*BANNER/Etiquetadas/.*(?:jpg|JPG)$")
    
        path_re = re.compile(r".*PROVINCIA DE (.*)/\d* (.*)\ (?:CC|BA)/.*")
    
        def __getitem__(self, item):
            return self.__routes__[item]
    
        def __iter__(self):
            return iter(self.__routes__)
    
        def __str__(self):
            return pformat(self.__routes__)
    
        def __init__(self, path):
            self.__routes__ = {}
            self.__files__ = {}
    
            self.fcache = ".files_cache_" + slugify(path)
    
            if(exists(self.fcache)):
                print(f"Using cache to read. {self.fcache} detected:")
                self._read_files_cache()
            else:
                print(f"No cache detected. Reading from {path}")
                self._read_files_to_cache(path)
    
        def _init_dir(self, path, create_ruta_dirs=True):
            """ create dir structure. Returns True if it had to create"""
            created = True
    
            if(exists(path)):
                print(f"{path} exist. No need to create dirs")
                created = False
            else:
                print(f"{path} doesnt exist. Creating dirs")
                os.makedirs(path)
                if(create_ruta_dirs):
                    self._create_ruta_dirs(path)
    
            return created
    
        def _create_ruta_dirs(self, path):
            """Create structure of directories in <path>"""
            for prov in self.__routes__:
                prov_path = join(path, slugify(prov))
                if(not exists(prov_path)):
                    os.makedirs(prov_path)
                for comar in self.__routes__[prov]:
                    comar_path = join(prov_path, slugify(comar))
                    if(not exists(comar_path)):
                        os.makedirs(comar_path)
                    # Special case for BASE_PAGES. Dont make last ruta folder
                    if(path != BASE_PAGES):
                        for ruta in self.__routes__[prov].get(comar):
                            ruta_path = join(comar_path, ruta)
                            if(not exists(ruta_path)):
                                os.makedirs(ruta_path)
    
        def _read_files_cache(self):
            with open(self.fcache) as f:
                temp = json.load(f)
            self.__routes__ = temp['routes']
            self.__files__ = temp['files']
    
        def _read_files_to_cache(self, path):
            """read files and tags from path into memory. Also writes the cache file"""
            for root, subdirs, files in os.walk(path):
                for f in files:
    
                    def append_ruta_var(match, var_name):
                        prov, comar = self._get_prov_comar(root)
                        ruta = match.group(1).zfill(2)
                        var_path = join(root, f)
                        r = self._get_ruta(prov, comar, ruta)
                        r.update({var_name: var_path})
    
                    def append_ruta_pic(match):
                        prov, comar = self._get_prov_comar(root)
                        ruta = match.group(1).zfill(2)
                        pic_path = join(root, f)
                        r = self._get_ruta(prov, comar, ruta)
                        pics = r.setdefault('pics', list())
                        pics.append(pic_path)
    
                    def pdf(m):
                        append_ruta_var(m, 'pdf_orig')
    
                    def gpx(m):
                        append_ruta_var(m, 'gpx_orig')
    
                    def append_banner(m):
                        pic_path = join(root, f)
                        banner = self.__files__.setdefault('banner', list())
                        banner.append(pic_path)
    
                    regexes = (
                        (self.banner_re, append_banner),
                        (self.pdf_re, pdf),
                        (self.gpx_re, gpx),
                        (self.jpg_re, append_ruta_pic),
                    )
    
                    for reg, func in regexes:
                        try:
                            match = reg.match(join(root, f))
                            if(match):
                                func(match)
                                break
                            # else:
                            #    print(f"no match for {root}/{f}")
                        except Exception:
                            print(f"Not sure how to parse this file: {f}")
                            print(f"r: {root}\ns: {subdirs}\nf: {files}\n\n")
    
            self._read_tags()
    
            temp = dict({'routes': self.__routes__, 'files': self.__files__})
            with open(self.fcache, "w") as f:
                json.dump(temp, f)
    
        def _read_tags(self):
            with open(TAGS) as f:
                for line in f.readlines():
                    try:
                        ruta, short_name, long_name, tags = [
                            p.strip() for p in line.split(":")]
                        prov, comar, number, _ = ruta.split("/")
                        r = self._get_ruta(prov, comar, number)
                        r.update({'short': short_name})
                        r.update({'long': long_name})
                        final_tags = list()
                        for t in tags.split(","):
                            final_tags.append(t)
                        r.update({'tags': final_tags})
                    except ValueError:
                        pass
    
        def _get_prov_comar(self, path):
            pathm = self.path_re.match(path)
            prov = pathm.group(1)
            comar = pathm.group(2)
    
            return prov, comar
    
        def _get_ruta(self, prov, comar, ruta):
            """creates the intermediate dicts if needed"""
    
            prov = slugify(prov)
            comar = slugify(comar)
    
            p = self.__routes__.get(prov)
            if(not p):
                self.__routes__.update({prov: {}})
    
            c = self.__routes__.get(prov).get(comar)
            if(not c):
                self.__routes__.get(prov).update({comar: {}})
    
            r = self.__routes__.get(prov).get(comar).get(ruta)
            if(not r):
                self.__routes__.get(prov).get(comar).update({ruta: {}})
    
            r = self.__routes__.get(prov).get(comar).get(ruta)
            return r
    
        def move_files(self):
            """move misc (banner) and ruta related files (not markdown)
            from dir to OUTPUT"""
            self._move_ruta_files()
            # misc have to be moved after ruta files, because the folder
            # inside photos prevents ruta photos to be moved
            self._move_misc_files()
    
        def _move_misc_files(self):
            if (self._init_dir(BASE_BANNER, False)):
                print("moving banner...")
    
                for f in self.__files__['banner']:
                    hard_link(f, join(BASE_BANNER, sanitize_name(f)))
    
        def _move_ruta_files(self):
            """move everything ruta related: static and photos(not markdown)"""
            create_static = False
            create_photos = False
    
            if (self._init_dir(BASE_STATIC)):
                print("moving static...")
                create_static = True
    
            if (self._init_dir(BASE_PHOTOS)):
                print("moving photos...")
                create_photos = True
    
            for prov in self.__routes__:
                for comar in self.__routes__[prov]:
                    for ruta in self.__routes__[prov].get(comar):
                        r = self.__routes__[prov].get(comar).get(ruta)
                        fbase_static = join(
                            BASE_STATIC, prov, slugify(comar), ruta)
                        fbase_photos = join(
                            BASE_PHOTOS, prov, slugify(comar), ruta)
    
                        def move_file(orig, dest):
                            whereto = join(dest, sanitize_name(orig))
                            hard_link(orig, whereto)
    
                        if(create_static):
                            for fkey in ("pdf_orig", "gpx_orig"):
                                if(fkey in r):
                                    move_file(r[fkey], fbase_static)
    
                        if(create_photos and ("pics") in r):
                            for pic in r["pics"]:
                                move_file(pic, fbase_photos)
    
        def generate_markdown(self):
            """Create markdown in the correct directory"""
            self._init_dir(BASE_PAGES)
            for prov in self.__routes__:
                for comar in self.__routes__[prov]:
                    for ruta in self.__routes__[prov].get(comar):
                        r = self.__routes__[prov].get(comar).get(ruta)
                        pages_base = join(
                            BASE_PAGES, prov, slugify(comar))
                        fpath = join(pages_base, f"{ruta}.md")
    
                        photos_base = join(prov, slugify(comar), ruta)
                        static_base = join(
                            FULL_STATIC, prov, slugify(comar), ruta)
    
                        with open(fpath, "w") as f:
                            title = "Title: "
                            if('long' in r):
                                title += r['long']
                            else:
                                title += f"{prov} - {comar} - Ruta {ruta}"
                            f.write(title + "\n")
                            f.write(f"Path: {ruta}\n")
                            f.write("Date: 2018-01-01 00:00\n")
                            if('tags' in r):
                                f.write("Tags: {}".format(", ".join(r['tags'])))
                                f.write("\n")
                            f.write("Gallery: {photo}")
                            f.write(f"{photos_base}\n")
    
                            try:
                                fpath = join("/", static_base, sanitize_name(r['pdf_orig']))
                                f.write( f'Pdf: {fpath}\n')
                            except KeyError:
                                f.write('Esta ruta no tiene descripcion (pdf)\n\n')
    
    
                            try:
                                fpath = join("/", static_base, sanitize_name(r['gpx_orig']))
                                f.write(f"Gpx: {fpath}\n")
                            except KeyError:
                                f.write('Esta ruta no tiene coordenadas (gpx)\n\n')
    
    
                            if('pics' not in r):
                                f.write('Esta ruta no tiene fotos\n\n')
    
    
    
    if __name__ == "__main__":
        routes = Routes(ORIG_BASE)
        # print(routes)
        print("done reading")
        routes.move_files()
        routes.generate_markdown()
        print("done writing")
    
  4. Destructive git behaviour

    fun with git

    I destroyed all the work I had done in a project for the last 2 months

    tl;dr:
    Git doesn't consider the files in .gitignore important and will happily replace them

    I'm pretty careless with my local git commands

    I've been trained by git to be this careless. Unless I use --force on a command, git will always alert me if I'm about to do something destructive. Even then, worst case scenario, you can use git reflog to go back in time after a bad merge or to reach something not easily accessible with a normal git flow

    What happened?

    I had a symlink to a folder in my master branch. I branched to do some work and decided to replace the symlink with the actual folder to untangle some other mess, and added it to .gitignore to stop git complaining about it

    Then I happily worked on it for 2 months

    I was ready to merge it, so I made a final commit and checked out master

    So far, pretty normal git flow… right?

    But wait, something was wrong. My folder was missing!

    Wait, what?! what happened!

    The folder existed as a symlink on master, so git happily replaced my folder with a now broken symlink

    It seems git doesn't consider files under .gitignore important

    You can see for yourself and reproduce this behaviour by typing the following commands. It doesn't matter if the link target doesn't exist:

    [~/tmp]
    $ mkdir gitdestroy/
    
    [~/tmp]
    $ cd gitdestroy/
    
    [~/tmp/gitdestroy]
    $ cat > file1
    hi, im file1
    
    [~/tmp/gitdestroy]
    $ ln -s nofile link
    
    [~/tmp/gitdestroy]
    $ ll
    total 48K
    drwxr-xr-x. 26 alberto alberto  36K Jan 29 15:18 ..
    -rw-r--r--   1 alberto alberto   13 Jan 29 15:19 file1
    lrwxrwxrwx   1 alberto alberto    6 Jan 29 15:19 link -> nofile
    drwxr-xr-x   2 alberto alberto 4.0K Jan 29 15:19 .
    
    [~/tmp/gitdestroy]
    $ git init
    Initialized empty Git repository in /home/alberto/tmp/gitdestroy/.git/
    
    [~/tmp/gitdestroy (master #%)]
    $ git add -A
    
    [~/tmp/gitdestroy (master +)]
    $ git status
    On branch master
    
    No commits yet
    
    Changes to be committed:
      (use "git rm --cached <file>..." to unstage)
    
        new file:   file1
        new file:   link
    
    
    [~/tmp/gitdestroy (master +)]
    $ git commit -m "link on repo"
    [master (root-commit) 5001c61] link on repo
     2 files changed, 2 insertions(+)
     create mode 100644 file1
     create mode 120000 link
    
    [~/tmp/gitdestroy (master)]
    $ git checkout -b branchwithoutlink
    Switched to a new branch 'branchwithoutlink'
    
    [~/tmp/gitdestroy (branchwithoutlink)]
    $ git rm link 
    rm 'link'
    
    [~/tmp/gitdestroy (branchwithoutlink +)]
    $ mkdir link
    
    [~/tmp/gitdestroy (branchwithoutlink +)]
    $ cat >link/file2
    hi im file2
    
    [~/tmp/gitdestroy (branchwithoutlink +%)]
    $ cat > .gitignore
    link
    
    [~/tmp/gitdestroy (branchwithoutlink +%)]
    $ git status
    On branch branchwithoutlink
    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)
    
        deleted:    link
    
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
    
        .gitignore
    
    
    [~/tmp/gitdestroy (branchwithoutlink +%)]
    $ git add -A
    
    [~/tmp/gitdestroy (branchwithoutlink +)]
    $ git commit -m "replace link with folder"
    
    [branchwithoutlink 2cfb06c] replace link with folder
     2 files changed, 1 insertion(+), 1 deletion(-)
     create mode 100644 .gitignore
     delete mode 120000 link
    
    [~/tmp/gitdestroy (branchwithoutlink)]
    $ ll
    total 60K
    drwxr-xr-x. 26 alberto alberto  36K Jan 29 15:18 ..
    -rw-r--r--   1 alberto alberto   13 Jan 29 15:19 file1
    drwxr-xr-x   2 alberto alberto 4.0K Jan 29 15:21 link
    drwxr-xr-x   4 alberto alberto 4.0K Jan 29 15:22 .
    -rw-r--r--   1 alberto alberto    5 Jan 29 15:22 .gitignore
    drwxr-xr-x   8 alberto alberto 4.0K Jan 29 15:22 .git
    
    [~/tmp/gitdestroy (branchwithoutlink)]
    $ git checkout master
    Switched to branch 'master'                                        <--- NO ERROR???
    
    [~/tmp/gitdestroy (master)]
    $ ll
    total 52K
    drwxr-xr-x. 26 alberto alberto  36K Jan 29 15:18 ..
    -rw-r--r--   1 alberto alberto   13 Jan 29 15:19 file1
    lrwxrwxrwx   1 alberto alberto    6 Jan 29 15:22 link -> nofile    <--- WHAT
    drwxr-xr-x   8 alberto alberto 4.0K Jan 29 15:22 .git
    drwxr-xr-x   3 alberto alberto 4.0K Jan 29 15:22 .
    
    [~/tmp/gitdestroy (master)]
    $ git checkout branchwithoutlink 
    Switched to branch 'branchwithoutlink'
    
    [~/tmp/gitdestroy (branchwithoutlink)]
    $ ll
    total 56K
    drwxr-xr-x. 26 alberto alberto  36K Jan 29 15:18 ..
    -rw-r--r--   1 alberto alberto   13 Jan 29 15:19 file1
    -rw-r--r--   1 alberto alberto    5 Jan 29 15:23 .gitignore
    drwxr-xr-x   8 alberto alberto 4.0K Jan 29 15:23 .git
    drwxr-xr-x   3 alberto alberto 4.0K Jan 29 15:23 .
    

    Aftermath

    I analyzed what git was doing underneath, hoping to gain some insight into how to recover these files. It seems git calls unlinkat(2) on every file and finally rmdir(2) on the folder

    By contrast, rm(1) just uses unlinkat(2) on every file and folder
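
    For reference, the same order of operations looks roughly like this in Python: unlink every file, then remove the emptied directories from the bottom up. This is only an illustration of the sequence, not what git or rm literally run, and remove_tree is a made-up name

```python
import os


def remove_tree(path):
    """Delete path by unlinking every file, then removing the emptied directories."""
    # topdown=False visits the deepest directories first
    for root, dirs, files in os.walk(path, topdown=False):
        for name in files:
            os.unlink(os.path.join(root, name))
        for name in dirs:
            full = os.path.join(root, name)
            if os.path.islink(full):
                os.unlink(full)  # a symlink to a directory is removed like a file
            else:
                os.rmdir(full)   # already empty, its contents were handled earlier
    os.rmdir(path)
```

    Either way, nothing goes through a trash folder: once the last directory entry is gone, only undelete tools can help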

    Not sure what difference this makes, but knowing it was quite useless. I tried some EXT undelete tools to recover the missing files, but everything was gone

    Actually, I was able to undelete some files I had removed 3 years ago that I didn't need :/

    Future

    This directory was under git as well and remotely hosted. But my last push was 2 months ago. I will be more careful in the future

    Recently there's been some discussion in git about something that could prevent this behaviour. They are introducing the concept of “precious ignored” files

    But for me the damage was done

    This was unexpected behaviour for me. Maybe it was also for you. Be safe out there!
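
    Until something like that lands, a check before switching branches is one way to stay safe. The sketch below is my own idea, not a git feature: it lists the ignored files in the working tree and flags the ones sitting on a path that the target branch tracks, which is exactly the situation that bit me. It only covers that case, the function names are invented, and file names with unusual characters would need the -z variants of these git commands

```python
import subprocess


def git_lines(*args):
    """Run a git command in the current repository and return its output lines."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.splitlines()


def paths_at_risk(ignored, tracked):
    """Return ignored paths that collide with a path tracked on the target branch."""
    tracked = set(tracked)
    at_risk = []
    for path in ignored:
        parts = path.rstrip("/").split("/")
        # The ignored path itself, or any directory above it, may be tracked
        prefixes = {"/".join(parts[:i]) for i in range(1, len(parts) + 1)}
        if prefixes & tracked:
            at_risk.append(path)
    return at_risk


def check_before_checkout(branch):
    """List ignored files that a checkout of branch could replace."""
    ignored = git_lines("ls-files", "--others", "--ignored", "--exclude-standard")
    tracked = git_lines("ls-tree", "-r", "--name-only", branch)
    return paths_at_risk(ignored, tracked)
```

    In the example repository above, check_before_checkout("master") run from branchwithoutlink should report link/file2, because master tracks link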

  5. New web Design

    I hope you like it. I remade the theme from scratch using CSS grid to make the web responsive. Responsive means that it adapts to the screen size, so it works both on large screens and on phones

    I tried to keep the old look as much as possible. So in the worst case scenario, you won't notice any change :)

    Web technologies have really come a long way. It used to be a nightmare. Nowadays, dare I say, it is a pleasure and fun to make your own design

  6. Gmail mangles .gpg files

    Why?

    I don't know

    If you change bytes in a .gpg somebody is bound to notice. Right?

    I'm using a 3rd party to send a .gpg to a Gmail account and the checksums before and after simply don't match
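
    If you want to check this on your own attachments, comparing digests of the file you sent and the file that arrived is enough. A minimal sketch, assuming SHA-256 (any checksum tool works just as well, and the function names are made up):

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(sent_path, received_path):
    """True when both files are byte-for-byte identical according to SHA-256."""
    return sha256_of(sent_path) == sha256_of(received_path)
```

    Reading in binary mode matters here: a text-mode read could hide exactly the kind of byte changes you are looking for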

    I don't really want to assume evilness, since modifying bytes in attachments seems pretty sketchy

    Maybe I'm doing something wrong

    The fact that the checksums are okay if I send it to my personal server using the same 3rd party mail provider is a little suspicious tho

    I've been told that

    [...], email and gmail are different things.  So I wouldn't be surprised if they are not 100% compatible ;-)
    

    Funny, or is it? Google has most of my email because it has all of yours. They have a lot of leverage to define the email experience

    In the end, just renaming the .gpg to something else fixed it (What??)

    And while we are still ranting about Google, let's finish with a pet peeve of mine

    I hate that they virtually remove the concept of domain or email address. Not that they are just the f*cking anchor point of security

    Security and, you know, knowing where the f*ck you are going or who you are trying to get in contact with

    Instead they hide this info as much as possible in labels so you learn to trust them, instead of learning to trust something as simple and as ubiquitous as a domain or an email

  7. Virus, Qubes-OS and Debian

    computer problems that people attribute to viruses don't overlap with real problems caused by viruses

    This is the virus Venn diagram. It's pretty accurate and many people, including people who get along with technology, are oblivious to it. Voluntarily installing crap on your computer by installing random programs you just googled hardly counts as a virus

    Sometimes they overlap tho. These are what I call “trawling viruses”. By taking some very old exploit that should hardly work on anybody and spamming it, you can still get lots of people who never update. In this case you don't care about anything, you just try to get a quick profit and you don't really care if you slow down the target machine

    But by and large, viruses try to be as invisible as possible, do their business and go undetected for as long as possible. If they can make an optimization to your system, like patching the hole they got in through, they will

    Using Debian is one way to protect yourself… but it still falls short because it still uses a very old authorization model

    Authorization model in computers is old

    It's no secret that the authorization model in computers is really old

    Qubes OS is a system that tries to mitigate that problem quite successfully. Qubes OS 4.0 rc1 has been released recently. I'm currently testing it on my mediabox, and will probably use it on my main machine soon

    Holger gave a talk a few weeks ago named “Using qubes os from the pov of a debian developer”. In DebConf fashion, you can watch it online
