All hail AWK

It took me a while to get around to teaching myself awk, but since learning it some years ago I have not looked back. There is quite a lot that you can do with it. A full program (Python, Ruby, etc.) might sometimes be more robust, but there is something very satisfying about perfecting an awk 'oneliner'. (I may stretch some people's definition of oneliner.) While I say 'awk' in this article, I use gawk, as it has some nice things added.

regex is also great

If you haven't taken the time to learn regex, and some of its more advanced features, you are missing out. Its flexibility in handling different matches is amazing, and once you start using capture groups you will wonder why no one told you sooner.

Minor variations on a theme

You will want to note that regex varies a little between implementations (though nowhere near as badly as markdown has fragmented). This mostly consists of newer implementations having a few more features bolted on, which yet newer implementations then follow; you will just miss them in older programs like sed. One example is what is referred to as Perl regex, which adds some nice escape codes like \d, \D, \w, \W, \s, and \S.

Here \d is a quick way to specify a digit, though you can always use the old school way instead: [0-9]. With the other added escape codes the old school way gets more tedious, so being able to use \w for a word character is very convenient, as is \W for any non-word character.
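To make the capture group idea concrete, here is a minimal sketch using sed -E. The log line format and the banned_ip function name are made up for this example, and it sticks to the old school [0-9] class so it should work with any sed that supports -E (GNU and BSD sed both do). With a Perl-style engine, [0-9] could be written \d instead.

```shell
# Print the address from a line shaped like "Ban <address> ...".
# The line format and function name are assumptions for illustration.
banned_ip() {
    # ([0-9.]+) is capture group 1: a run of digits and dots.
    # \1 in the replacement keeps only what that group matched.
    printf '%s\n' "$1" | sed -E 's/^Ban ([0-9.]+) .*$/\1/'
}

banned_ip 'Ban 203.0.113.7 after 5 tries'   # prints 203.0.113.7
```

Everything outside the parentheses is matched and then thrown away, which is what makes capture groups so handy for pulling one field out of a noisy line.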

Today's sample

Niro, a friend on the interwebz, was asking for a convenient way of listing all IPs currently blocked by fail2ban, with each unique IP listed only once.

My initial Google-fu turned up:

fail2ban-client status | grep "Jail list:" | sed "s/ //g" | awk '{split($2,a,",");for(i in a) system("fail2ban-client status " a[i])}' | grep "Status\|IP list"

This did list each jail's banned IPs... though it also printed cruft for jails without current bans, left cruft around the IPs, and didn't limit output to unique IPs. It also failed to use awk to its full potential. *points at the grep to sed to awk to... grep*

awk remix

So, let's turn that awk up to 11:

fail2ban-client status | awk '/Jail list/ {match($0, /^.*\:\s+(.*)/, m1); split(m1[1], s, ", "); for(i in s){while("fail2ban-client status " s[i] |& getline var){match(var, /IP list\:\s+(.*)/, m2); if(m2[1] != ""){h=h m2[1]}}}; gsub(/ /, "\n", h); print h | "sort -u"}'

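NOTE: This needs gawk for the fancy three-argument match(), which saves capture groups into an array. (The \s escape and the |& operator are gawk extras too.)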

Here is the same thing broken up into a more readable format:

fail2ban-client status |                                            ## Get raw list of jails
    awk '/Jail list/ {                                              ## Get single relevant line
        match($0, /^.*\:\s+(.*)/, m1);                              ## Grab just the jails
        split(m1[1], s, ", ");                                      ## Make jails into usable format
        for(i in s) {                                               ## Iterate over jails
            while("fail2ban-client status " s[i] |& getline var) {  ## Get raw list of IPs for each jail
                match(var, /IP list\:\s+(.*)/, m2);                 ## Grab just the IPs for each jail
                if(m2[1] != "") {                                   ## Weed out non matches
                    h=h m2[1]                                       ## Append and merge IPs to list
                }
            }
        };
        gsub(/ /, "\n", h);                                         ## Separate IPs with newline
        print h | "sort -u"                                         ## sort, uniq, and print IPs one per line
    }'
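
To try the parsing logic without a live fail2ban, the sketch below runs it against canned text. The sample only approximates what fail2ban-client status <jail> prints (the exact layout varies by version), the addresses come from documentation ranges, and sample_status and unique_ips are names invented here. It uses gawk's three-argument match() as above, but swaps sort -u for a seen[] array and splits each jail's list on its own, so output keeps first-seen order and lists from different jails cannot run together.

```shell
# Canned stand-in for several `fail2ban-client status <jail>` calls.
# Layout is approximate and the addresses are documentation examples.
sample_status() {
    cat <<'EOF'
Status for the jail: sshd
   |- Currently banned: 2
   `- Banned IP list:   203.0.113.7 198.51.100.23
Status for the jail: nginx-http-auth
   |- Currently banned: 0
   `- Banned IP list:
Status for the jail: postfix
   |- Currently banned: 1
   `- Banned IP list:   203.0.113.7
EOF
}

# Print every banned IP once, in first-seen order (requires gawk).
unique_ips() {
    gawk 'match($0, /IP list:\s+(.*)/, m) {   # m[1] holds the capture group
        n = split(m[1], ips, " ")              # " " splits on runs of blanks
        for (i = 1; i <= n; i++)
            if (!seen[ips[i]]++)               # true only the first time
                print ips[i]
    }'
}

sample_status | unique_ips
```

With the sample above this prints 203.0.113.7 and then 198.51.100.23, one per line. The jail with no bans contributes nothing, and the repeat from the third jail is dropped.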


You should learn awk and regex. :)

capture groups

- demure