Jeanmichel Cote

Reputation: 531

Calling a bash builtin function with a parameter within awk

I have this command, which outputs two columns separated by ⎟. The first column is the number of occurrences and the second is the IP address. The output is sorted by ascending number of occurrences.

awk '{ips[$1]++} END {for (ip in ips) { printf "%5s %-1s %-3s\n", ips[ip], "⎟", ip}}' "${ACCESSLOG}" | sort -nk1

19 ⎟ 76.20.221.34
19 ⎟ 76.9.214.2
22 ⎟ 105.152.107.118
26 ⎟ 24.185.179.32
26 ⎟ 42.117.198.229
26 ⎟ 83.216.242.69

etc.

Now I would like to add a third column. In the bash shell, if you run, for instance:

host 72.80.99.43

you'll get:

43.99.80.72.in-addr.arpa domain name pointer pool-72-80-99-43.nycmny.fios.verizon.net.

So for every IP in the list, I want to show its associated host in the third column, and I want to do that from within awk: call host from awk and pass it ip as the parameter. Ideally, it would skip all the standard text and show only the hostname, like so: nycmny.fios.verizon.net.

So my final command would look like this:

awk '{ips[$1]++} END {for (ip in ips) { printf "%5s %-1s %-3s %20s\n", ips[ip], "⎟", ip, system( "host " ip )}}' "${ACCESSLOG}" | sort -nk1

Thanks

Upvotes: 0

Views: 127

Answers (1)

Ed Morton

Reputation: 203324

You wouldn't use system(), since you want to combine the shell command's output with your awk output and system() only returns the command's exit status. Instead, you'd build the command as a string and read its result into a variable with getline, e.g.:

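awk '{ips[$1]++}    # count hits per IP (first field of each log line)
END {
    for (ip in ips) {
        cmd = "host " ip
        # read the first line of the command output into host;
        # getline returns 0 at end of input and -1 on error
        if ( (cmd | getline host) <= 0 ) {
            host = "N/A"
        }
        close(cmd)    # close the pipe before the next lookup
        printf "%5s %-1s %-3s %20s\n", ips[ip], "⎟", ip, host
    }
}' "${ACCESSLOG}" | sort -nk1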
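
To see the difference between the two approaches, here is a small sketch that uses echo as a stand-in for host (an assumption, so that it runs without a DNS lookup). With system(), the command writes straight to stdout and awk only gets its exit status; with cmd | getline, a line of the command's output ends up in a variable. The function name demo_system_vs_getline is made up for this illustration:

```shell
# demo_system_vs_getline is a hypothetical name used only for this sketch.
demo_system_vs_getline() {
    awk 'BEGIN {
        cmd = "echo hello"          # stand-in for: "host " ip
        ret = system(cmd)           # command prints to stdout; ret is its exit status
        if ((cmd | getline out) > 0) {
            # out now holds the first line the command printed
            printf "status=%s captured=%s\n", ret, out
        }
        close(cmd)
    }'
}

demo_system_vs_getline
```

That is why the printf in the question would show a number rather than a hostname in the fourth column.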

I assume you can figure out how to use *sub() to get just the part of the host output you care about.
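
As a rough sketch of that last step, assuming the line read from host always has the form shown in the question ("... domain name pointer NAME"), a single sub() can strip everything up to the name. The helper name extract_ptr and the fixed sample input are only for illustration:

```shell
# extract_ptr is a hypothetical helper name, not part of awk or host.
# It reads lines of host output on stdin and prints only the pointer name.
extract_ptr() {
    awk '{
        name = $0
        # sub() returns 1 if it replaced something, 0 otherwise
        if (sub(/^.*domain name pointer /, "", name))
            print name
        else
            print "N/A"
    }'
}

# Sample line copied from the question, so no DNS lookup is needed here
echo "43.99.80.72.in-addr.arpa domain name pointer pool-72-80-99-43.nycmny.fios.verizon.net." | extract_ptr
```

Inside the END loop, the same sub() call would go right after the getline, applied to the host variable before the printf. Trimming the name further (for example, dropping the leading pool-... label) would need another sub() whose pattern depends on how your providers name their hosts.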

Upvotes: 2
