Ruben Andre Sviggum

Reputation: 53

AWK: How to exactly match and print multiple words between two keywords in the same line

Consider a text file named "nett" with the following content:

admin@(none):/tmp/home/root# cat nett
BSSID: 00:22:07:29:D4:23 RSSI: -71 dBm Band: 2.4GHz Channel: 1 802.11: b/g/n SSID: Inteno_24  noise: -70
BSSID: 00:19:77:12:97:94 RSSI: -54 dBm Band: 2.4GHz Channel: 1 802.11: b/g/n SSID: AK-Gjester  noise: -70
BSSID: 00:19:77:12:97:95 RSSI: -55 dBm Band: 2.4GHz Channel: 1 802.11: b/g/n SSID: AK-Ansatt  noise: -70
BSSID: 02:26:16:B2:37:AD RSSI: -73 dBm Band: 2.4GHz Channel: 6 802.11: b/g SSID: Trimble Service (5132555899)  noise: -87
BSSID: FA:8F:CA:88:F9:8E RSSI: -45 dBm Band: 2.4GHz Channel: 6 802.11: b/g/n SSID: Chromecast6286  noise: -87
BSSID: 00:22:07:3F:67:6B RSSI: -86 dBm Band: 2.4GHz Channel: 13 802.11: b/g/n SSID: Inteno-676C  noise: -87

I am trying to print formatted data from this file to the terminal using awk. This is part of a much longer script. The following script illustrates the problem I need to solve:

#!/bin/sh
awk ' \
    {for(i=1;i<=NF;i++)if($i~/SSID:/)printf "%s%s", "BSSID: " $(i+1)} \
    {for(i=1;i<=NF;i++)if($i~/Channel:/)printf "%s%s\n", "Kanal: " $(i+1)}' nett

The second line, the one with Channel:, works correctly: awk loops through the line one word at a time searching for the word "Channel:" and then prints the next word with some custom text. A line may have a variable number of columns, so targeting a specific column will not always work in this case.

However, the real problem is the first line. There are two problems here that need addressing:

1: Since the line contains both the word "BSSID" and the word "SSID", the search pattern needs to be exact. Currently both "BSSID" and "SSID" are matched.

2: The text that I need to print may be more than one word, as in the fourth line:

BSSID: 02:26:16:B2:37:AD RSSI: -73 dBm Band: 2.4GHz Channel: 6 802.11: b/g SSID: Trimble Service (5132555899)  noise: -87

Here I need awk to locate multiple words between SSID: and noise: and print all words.

In the current state of the script I get the output:

BSSID: 02:26:16:B2:37:ADBSSID: TrimbleKanal: 6

Since awk will process the rest of the output correctly, a pure awk solution would be much appreciated. Please note that the output deliberately lacks spacing in the right places, to keep the problem as compact and visible as possible.

Best regards!

Upvotes: 1

Views: 1761

Answers (6)

Juan Diego Godoy Robles

Reputation: 14955

Use gawk capture groups:

$ gawk 'match($0,/^BSSID:\s+(\S+).*Channel:\s+(\S+).*SSID:\s+(.*noise:\s+\S+)/,a)\
  {print a[1],a[3],a[2]}' nett

Column by column:

$ gawk 'match($0,/BSSID:\s+(\S+)/,a){printf "%s ", a[1]}
    match($0,/\s+SSID:\s+(.*noise:\s+\S+)/,a){printf "%s ", a[1]}
    match($0,/\s+Channel:\s+(\S+)/,a){printf "%s\n", a[1]}' nett

Note that \s+SSID:\s+(.*noise:\s+\S+) captures everything from the SSID value through the noise value, so a multi-word SSID is kept whole. The leading \s+ keeps the pattern from matching BSSID:.

Results

00:22:07:29:D4:23 Inteno_24  noise: -70 1
00:19:77:12:97:94 AK-Gjester  noise: -70 1
00:19:77:12:97:95 AK-Ansatt  noise: -70 1
02:26:16:B2:37:AD Trimble Service (5132555899)  noise: -87 6
FA:8F:CA:88:F9:8E Chromecast6286  noise: -87 6
00:22:07:3F:67:6B Inteno-676C  noise: -87 13

Check gawk documentation.
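
Where gawk is not installed, a roughly equivalent extraction can be sketched with the POSIX two-argument match() plus RSTART and RLENGTH. The names extract_fields and field below are made up for this illustration, and the sketch assumes every label is a single word followed by a colon and a space:

```shell
#!/bin/sh
# Sketch only: extract_fields and field are hypothetical names, not part of the answer above.
extract_fields() {
    awk '
    # Return the text after " key: " up to the next " label: " or the end of the line.
    function field(key,    s, rest) {
        s = " " $0                                  # leading space so the first label matches too
        if (!match(s, " " key ": +")) return ""
        rest = substr(s, RSTART + RLENGTH)          # everything after the label
        if (match(rest, / +[^ ]+: /))               # cut at the next label, if any
            rest = substr(rest, 1, RSTART - 1)
        sub(/ +$/, "", rest)                        # trim trailing spaces
        return rest
    }
    { printf "BSSID: %s SSID: %s Kanal: %s\n", field("BSSID"), field("SSID"), field("Channel") }
    ' "$@"
}

printf '%s\n' 'BSSID: 02:26:16:B2:37:AD RSSI: -73 dBm Band: 2.4GHz Channel: 6 802.11: b/g SSID: Trimble Service (5132555899)  noise: -87' |
    extract_fields
```

Because the search string starts with a space, "SSID: " cannot match inside "BSSID: ". An SSID that itself contains a colon followed by a space would be cut short by this sketch.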

Upvotes: 3

user4401178

Reputation:

You can process the line in a single loop by assigning each value to a variable. This method also adapts to other situations: if, for instance, the order of the fields is rearranged, it will still handle the line, just as your multiple loops did.

#!/bin/sh
awk ' \
    {for(i=1;i<=NF;i++) {
    if($i~/Channel:/)chan="Kanal: "$(i+1)
    if($i~/^SSID:/)ssid="SSID: "$(i+1)
    if($i~/^BSSID:/)bssid="BSSID: "$(i+1)
    if($i~/^noise:/)noise="noise: "$(i+1)
    if($i~/^RSSI:/)rssid="RSSI: "$(i+1)
    if ((length(chan)>1) && (length(ssid)>1) && (length(bssid)>1) && (length(noise)>1) && (length(rssid)>1)) { 
     printf "%-30s %-24s %-10s dBm, %s dBm, %s\n",ssid,bssid,rssid,noise,chan
     chan="";ssid="";bssid="";noise="";rssid=""
    }
} }' "${1}"  
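
The ^ anchors in the loop above are what keep SSID: from also matching BSSID:. A minimal sketch of the difference, assuming a made-up helper named count_matches that counts how many fields of a line match a given pattern:

```shell
#!/bin/sh
# Sketch only: count_matches is a hypothetical helper used to compare patterns.
count_matches() {
    # $1 is the regex tested against every whitespace-separated field of each input line
    awk -v re="$1" '{ n = 0; for (i = 1; i <= NF; i++) if ($i ~ re) n++; print n }'
}

line='BSSID: 00:22:07:29:D4:23 RSSI: -71 dBm Channel: 1 SSID: Inteno_24  noise: -70'
printf '%s\n' "$line" | count_matches 'SSID:'     # unanchored: BSSID: and SSID: both match, prints 2
printf '%s\n' "$line" | count_matches '^SSID:$'   # anchored: only the SSID: field matches, prints 1
```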

FYI: I took your test data and cat'd it to itself 20k times, then ran a test comparing your solution with multiple loops against the above. The result was:

cp awk_test.txt awk_test_tmp.txt; for i in {1..20000}; do cat awk_test_tmp.txt >> awk_test.txt; done

$> time selected_solution.awk awk_test.txt 1>/dev/random

real    0m5.944s
user    0m5.500s
sys     0m0.440s

$> time my.awk awk_test.txt 1>/dev/random

real    0m4.733s
user    0m4.382s
sys     0m0.347s

Upvotes: 0

Ed Morton

Reputation: 203985

Given your posted answer, here is the sensible way to implement the awk part from your question (and the shell print '\n' after it in your answer):

$ cat tst.sh
awk '
BEGIN { FS=": +" }
{
    for (i=1;i<=NF;i++) {
        value = $i; sub(/ +[^ ]+$/,"",value)
        n2v[name] = value
        name = $i; sub(/.* /,"",name)
    }
    printf "SSID: %-32s",    n2v["SSID"]
    printf "BSSID: %-20s",   n2v["BSSID"]
    printf "RSSI: %s, ",     n2v["RSSI"]
    printf "noise: %s dBm ", n2v["noise"]
    printf "Kanal: %-2s\n",  n2v["Channel"]
}
END { print "" }
' nett

$ ./tst.sh
SSID: Inteno_24                       BSSID: 00:22:07:29:D4:23   RSSI: -71 dBm, noise: -70 dBm Kanal: 1
SSID: AK-Gjester                      BSSID: 00:19:77:12:97:94   RSSI: -54 dBm, noise: -70 dBm Kanal: 1
SSID: AK-Ansatt                       BSSID: 00:19:77:12:97:95   RSSI: -55 dBm, noise: -70 dBm Kanal: 1
SSID: Trimble Service (5132555899)    BSSID: 02:26:16:B2:37:AD   RSSI: -73 dBm, noise: -87 dBm Kanal: 6
SSID: Chromecast6286                  BSSID: FA:8F:CA:88:F9:8E   RSSI: -45 dBm, noise: -87 dBm Kanal: 6
SSID: Inteno-676C                     BSSID: 00:22:07:3F:67:6B   RSSI: -86 dBm, noise: -87 dBm Kanal: 13

$

The above will work with any awk in any OS.
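
To see how FS=": +" carves up a line, the following sketch prints each name/value pair that the loop above would store in n2v. The function name show_pairs and the shortened sample line are only for illustration:

```shell
#!/bin/sh
# Sketch only: show_pairs is a hypothetical name; the loop mirrors the one in the answer above.
show_pairs() {
    awk '
    BEGIN { FS = ": +" }                       # split on a colon followed by spaces
    {
        name = ""
        for (i = 1; i <= NF; i++) {
            value = $i
            sub(/ +[^ ]+$/, "", value)         # drop the trailing word, which is the next label
            if (name != "") printf "%s=[%s]\n", name, value
            name = $i
            sub(/.* /, "", name)               # the last word of this field is the next label
        }
    }' "$@"
}

printf '%s\n' 'BSSID: 02:26:16:B2:37:AD Channel: 6 SSID: Trimble Service (5132555899)  noise: -87' |
    show_pairs
```

Each field produced by this separator holds the value of the previous label followed by the name of the next one, which is why a multi-word SSID survives intact.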

Upvotes: 4

Ruben Andre Sviggum

Reputation: 53

Here is what I ended up with (variables, filenames and output strings are in Norwegian). This particular solution can be found by searching for "# Presenter nabonettverk".

#!/bin/sh

# Delete any leftover files from earlier runs
if [ -e linje ] ; then rm linje ; fi
if [ -e temp_assoc ] ; then rm temp_assoc ; fi
if [ -e temp_assoc_dhcp ] ; then rm temp_assoc_dhcp ; fi
if [ -e temp_assoc_static ] ; then rm temp_assoc_static ; fi
if [ -e bssid ] ; then rm bssid ; fi
if [ -e noise ] ; then rm noise ; fi
if [ -e temp_mac ] ; then rm temp_mac ; fi
if [ -e temp_mac2 ] ; then rm temp_mac2 ; fi
if [ -e kabel_dhcp ] ; then rm kabel_dhcp ; fi
if [ -e kabel_statisk ] ; then rm kabel_statisk ; fi

# Present line data
clear
cat /etc/banner | grep STY
printf '%-72s\n' "============================== LINJE ============================="
adsl info --stats | grep -m 1 -B 1 -i "Bearer:" > linje
adsl info --stats | grep -i mode -m 1 >> linje
adsl info --stats | grep -w -A 3 -i down >> linje
adsl info --stats | grep -m 1 -i hec >> linje
adsl info --stats | grep -A 2 -i "total time" >> linje
adsl info --stats | grep -A 2 -i since >> linje
cat linje
rm linje

# Present connections
printf '\n%91s\n' "=================================== TILKOBLINGER =========================================="
printf '%-13s\t%-6s%-6s\t%-6s%-6s\t%-6s%-6s\t%-6s%-6s\t%-6s%-6s\n' \
    "Grensesnitt:" \
    "WLAN:" \
    "$(if [ "$(cat /sys/class/net/wl0/operstate)" == "unknown" ] ; then printf '%s' "aktiv" ; fi)" \
    "LAN1:" \
    "$(if [ "$(cat /sys/class/net/eth4/operstate)" == "up" ] ; then printf '%s' "aktiv" ; else printf '%s' "av" ; fi)" \
    "LAN2:" \
    "$(if [ "$(cat /sys/class/net/eth3/operstate)" == "up" ] ; then printf '%s' "aktiv" ; else printf '%s' "av" ; fi)" \
    "LAN3:" \
    "$(if [ "$(cat /sys/class/net/eth2/operstate)" == "up" ] ; then printf '%s' "aktiv" ; else printf '%s' "av" ; fi)" \
    "LAN4:" \
    "$(if [ "$(cat /sys/class/net/eth1/operstate)" == "up" ] ; then printf '%s' "aktiv" ; else printf '%s' "av" ; fi)"

# Find the number and type of devices
total=$(brctl showmacs br-lan | grep -v 00:22:07 | grep -v 02:22:07 | tail -n +2 | wc -l)
wlan=$(wlctl assoclistinfo | tail -n +3 | awk '{print $2}' | wc -l)
kabel=$(( $total-$wlan ))

# Present devices
printf '%-30s\t%-8s%-3s\t%-6s%-3s\t%-8s%-3s\n\n' "Aktive nettverkstilkoblinger:" "Totalt:" "$total" "WLAN:" "$wlan" "Kablet:" "$kabel"
total=

# If wireless is enabled
if [ "$(cat /sys/class/net/wl0/operstate)" == "unknown" ]
    then

    # Get the MAC address of every wireless device
    wlctl assoclistinfo | tail -n +3 | awk '{print $2}' > temp_assoc
    printf '%s\n' "================================================= WIRELESS =================================================="

    # Antenna rate
    wlctl rate > wlrate 

    # Get the modem's SSID, background noise and channel. Present the result.
    wlctl status | grep -B 1 -i mode | awk 'NR%2{printf $0" ";next;}1' | awk -F'"' '{print $2}' >> modem_ssid
    wlctl status | grep -B 1 -i mode | awk 'NR%2{printf $1" ";next;}1' | awk {'print " Bakgrunnstoy(noise): " $11"dBm, kanal " $14'} >> modem_ssid2
    printf '%-6s%-32s%-18s%-9s%s\n' "SSID: " "$(cat modem_ssid)" "Antennehastighet: " "$(cat wlrate)" "$(cat modem_ssid2)"
    rm modem_ssid modem_ssid2 wlrate

    # Get other wireless networks and write them to bssid (file), and the background noise to noise (file), if any exist
    if [ -n "$(wlctl scanresults_summary)" ]
        then
        wlctl scanresults_summary >> bssid
        wlctl scanresults | grep -B 1 -i mode | sed '/^--$/d' | awk 'NR%2{printf $1" ";next;}1' | awk '{for(i=1;i<=NF;i++)if($i~/noise:/)print $(i+1)}' >> noise
        else
        printf '\n'
    # /if [ -n "$(wlctl scanresults_summary)" ]
    fi  

    # List neighbouring networks. If bssid (file) exists, then noise (file) exists too
    if [ -e bssid ]
        then
        printf '%s\n' "Andre nettverk ----------------------------------------------------------------------------------------------"

        # Merge bssid (file) and noise (file) into nett (file)
        exec 6<noise
        while read -r line ; do
            read -r f2line <&6
            echo $line " noise: "$f2line >> nett
        done < bssid 
        exec 6<&-
        rm bssid noise

        # Presenter nabonettverk (present neighbouring networks)
        awk ' \
            {for(i=1;i<=NF;i++)if($i~/^SSID:/){ s=index($0," SSID");e=index($0," noise"); printf "SSID%-34s", substr($0,s+5,e-(s+6))}} \
            {for(i=1;i<=NF;i++)if($i~/BSSID:/)printf "BSSID: %-20s", $(i+1)} \
            {for(i=1;i<=NF;i++)if($i~/RSSI:/)printf "RSSI: %s dBm, ", $(i+1)} \
            {for(i=1;i<=NF;i++)if($i~/noise:/)printf "noise: %s dBm ", $(i+1)} \
            {for(i=1;i<=NF;i++)if($i~/Channel:/)printf "%-7s%-2s\n", "Kanal: ", $(i+1)}' nett
        printf '\n'
        rm nett
    # /if [ -e bssid ]
    fi 

    # Sort wifi devices into a DHCP list and a static list
    while read enhet; do
    if [ -z "$(grep -i $enhet /tmp/dhcp.leases)" ]
        then
                printf '%s\n' "$enhet" >> temp_assoc_static
        else
                printf '%s\n' "$enhet" >> temp_assoc_dhcp
    fi
    done < temp_assoc
    rm temp_assoc

    # Wireless machines found in the DHCP table are presented as dynamically configured clients
    if [ -e temp_assoc_dhcp ]
        then
        printf '%s\n' "Klienter (DHCP)----------------------------------------------------------------------------------------------"
        while read enhet; do
            printf '%-6s%-32s%-5s%-19s%-4s%-16s%-20s%-4s%-3s\n' \
                "Navn: " \
                "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $4'})" \
                "MAC: " \
                "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $2'})" \
                "IP: " \
                "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $3'})" \
                "Signalstyrke(RSSI):" \
                "$(wlctl rssi $enhet)" \
                "dBm"
        done < temp_assoc_dhcp
        rm temp_assoc_dhcp
    fi

    # Wireless machines not found in the DHCP table are presented as statically configured clients
    if [ -e temp_assoc_static ]
        then
        printf '%s\n' "Klienter (Statisk) ------------------------------------------------------------------------------------------"
        while read enhet; do
        printf '%-38s%-5s%-19s%-4s%-16s%-20s%-4s%-3s\n' \
            "Statisk (navn ikke synlig)" \
            "MAC:" \
            "$enhet"  \
            "IP:" \
            "$(grep -i $enhet /proc/net/arp | awk '{print $1}')" \
            "Signalstyrke(RSSI):" \
            "$(wlctl rssi $enhet)" \
            "dBm"
        done < temp_assoc_static
        rm temp_assoc_static
    fi
# /if [ "$(cat /sys/class/net/wl0/operstate)" == "unknown" ]
fi

# If at least one wired machine exists
if [ "$kabel" -gt 0 ]
    then

    # Get all connected MAC addresses
    brctl showmacs br-lan | grep -v 00:22:07 | grep -v 02:22:07 | tail -n +2 | awk '{print $2}' > temp_mac

    # If there is at least one wireless machine
    if [ "$wlan" -gt 0 ]
        then
        wlctl assoclistinfo | tail -n +3 | awk '{print $2}' > temp_assoc

        # Remove all wireless MACs from the general list
        while read enhet ; do
        if [ -z "$(grep -i $enhet temp_assoc)" ]
            then
            printf '%s\n' "$enhet" >> temp_mac2 
        fi
        done < temp_mac
        rm temp_mac
        rm temp_assoc
    # /if [ "$wlan" -gt 0 ]
    fi

    # If there are no wireless clients, use the original MAC list
    if ! [ -e temp_mac2 ]
        then
        mv temp_mac temp_mac2
    fi

    # Sort wired devices into a DHCP list and a static list
    while read enhet; do
    if [ -z "$(grep -i $enhet /tmp/dhcp.leases)" ]
        then
                printf '%s\n' "$enhet" >> kabel_statisk
        else
                printf '%s\n' "$enhet" >> kabel_dhcp
    fi
    done < temp_mac2
    rm temp_mac2

    # We now have a list of dynamically assigned clients, statically configured clients, or both
    printf '%-110s\n' "==================================================  KABEL ==================================================="

    # Present wired dynamic clients
    if [ -e kabel_dhcp ]
        then
        printf '%s\n' "Klienter (DHCP)----------------------------------------------------------------------------------------------"
        while read enhet ; do
        printf '%-6s%-32s%-5s%-19s%-4s%-16s\n' \
            "Navn: " \
            "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $4'})" \
            "MAC: " \
            "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $2'})" \
            "IP: " \
            "$(cat /tmp/dhcp.leases | grep -i $enhet | awk {'printf $3'})"
        done < kabel_dhcp
        rm kabel_dhcp
    # /if [ -e kabel_dhcp ]
    fi

    # Present wired static clients
    if [ -e kabel_statisk ]
        then
        printf '%s\n' "Klienter (Statisk) ------------------------------------------------------------------------------------------"
        while read enhet ; do
        printf '%-38s%-5s%-19s%-4s%-16s\n' \
            "Statisk (navn ikke synlig)" \
            "MAC:" \
            "$enhet" \
            "IP:" \
            "$(grep -i $enhet /proc/net/arp | awk '{print $1}')"
        done < kabel_statisk
        rm kabel_statisk
    # /if [ -e kabel_statisk ]
    fi
# /if [ "$kabel" -gt 0 ]
fi

# Cleanup
kabel=
wlan=
printf '\n\n%s\n\n' "WIFI nettverkslisten kan oppdateres manuelt med 'wlctl scan', merk at dette tar 15 sekunder."

Output:

IOP Version: DG150-WU7P2U_STY2.4.11RC1-150127_1608
============================== LINJE =============================
Max:    Upstream rate = 1277 Kbps, Downstream rate = 15388 Kbps
Bearer: 0, Upstream rate = 669 Kbps, Downstream rate = 7198 Kbps
Mode:                   ADSL2+ Annex B
                Down            Up
SNR (dB):        9.4             19.2
Attn(dB):        40.5            23.2
Pwr(dBm):        19.5            13.1
HEC:            1655            0
Total time = 49 days 16 hours 25 sec
FEC:            10300984                716
CRC:            1238            0
Since Link time = 4 days 14 hours 31 min 17 sec
FEC:            824368          212
CRC:            141             0

=================================== TILKOBLINGER ==========================================
Grensesnitt:    WLAN: aktiv     LAN1: av        LAN2: aktiv     LAN3: aktiv     LAN4: aktiv
Aktive nettverkstilkoblinger:   Totalt: 3       WLAN: 0         Kablet: 3

================================================= WIRELESS ==================================================
SSID: Inteno-859A                     Antennehastighet: 144 Mbps  Bakgrunnstoy(noise): -80dBm, kanal 1
Andre nettverk ----------------------------------------------------------------------------------------------
SSID: Liverpool                       BSSID: 00:21:29:0C:6E:B8   RSSI: -61 dBm, noise: -78 dBm Kanal: 6
SSID: Liverpool1                      BSSID: B8:A3:86:55:C2:9C   RSSI: -62 dBm, noise: -78 dBm Kanal: 11
SSID: Kjetil sin Chromecast           BSSID: FA:8F:CA:76:98:6A   RSSI: -64 dBm, noise: -78 dBm Kanal: 11

==================================================  KABEL ===================================================
Klienter (DHCP)----------------------------------------------------------------------------------------------
Navn: *                               MAC: 00:21:29:0c:6e:b7  IP: 192.168.1.129
Navn: DIR-655                         MAC: b8:a3:86:55:c2:9d  IP: 192.168.1.113
Klienter (Statisk) ------------------------------------------------------------------------------------------
Statisk (navn ikke synlig)            MAC: 00:10:f3:18:c2:9f  IP: 192.168.1.20


WIFI nettverkslisten kan oppdateres manuelt med 'wlctl scan', merk at dette tar 15 sekunder.

Upvotes: 1

cms

Reputation: 5982

You can use some character index maths to extract the words between SSID and noise in your line 1 problem, and match SSID and BSSID separately. Ugly but it works :-/

#!/bin/sh
awk ' \
{for(i=1;i<=NF;i++)if($i~/BSSID:/)printf "BSSID%s", $(i+1)}
{for(i=1;i<=NF;i++)if($i~/^SSID:/){ s=index($0," SSID");e=index($0," noise"); printf "SSID:%s", substr($0,s+5,e-(s+6))}} \
{for(i=1;i<=NF;i++)if($i~/Channel:/)printf "%s%s\n", "Kanal: ", $(i+1)}' nett

On your sample input file this gives me the following output:

BSSID:: Inteno_24Kanal: 1
BSSID00:19:77:12:97:94BSSID:: AK-GjesterKanal: 1
BSSID00:19:77:12:97:95BSSID:: AK-AnsattKanal: 1
BSSID02:26:16:B2:37:ADBSSID:: Trimble Service (5132555899)Kanal: 6
BSSIDFA:8F:CA:88:F9:8EBSSID:: Chromecast6286Kanal: 6
BSSID00:22:07:3F:67:6BBSSID:: Inteno-676CKanal: 13
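
The index arithmetic can be made a little easier to follow by searching for the full labels, spaces included, so that the offset is simply the label length. A sketch under that assumption, with ssid_between as a made-up name:

```shell
#!/bin/sh
# Sketch only: ssid_between is a hypothetical name; it prints the text between " SSID: " and " noise: ".
ssid_between() {
    awk '{
        s = index($0, " SSID: ")               # position of the space before SSID:, never inside BSSID:
        e = index($0, " noise: ")              # position of the space before noise:
        if (s == 0 || e <= s) next             # skip lines that lack either label
        val = substr($0, s + 7, e - (s + 7))   # 7 is the length of " SSID: "
        sub(/ +$/, "", val)                    # trim the extra space left before noise:
        print val
    }' "$@"
}

printf '%s\n' 'BSSID: 02:26:16:B2:37:AD RSSI: -73 dBm Band: 2.4GHz Channel: 6 802.11: b/g SSID: Trimble Service (5132555899)  noise: -87' |
    ssid_between
```

Since the search string begins with a space, the BSSID: at the start of the line is never picked up by mistake.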

Upvotes: 2

bian

Reputation: 1456

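Using gawk (the \< word boundary and the array argument to match() are gawk extensions; note that only the first word of a multi-word SSID is captured):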

awk 'function w(m){match($0,"\\<"m": ([^ ]*) ",a);return a[1]}{print w("BSSID"),w("SSID"),w("Channel")}' file
00:22:07:29:D4:23 Inteno_24 1
00:19:77:12:97:94 AK-Gjester 1
00:19:77:12:97:95 AK-Ansatt 1
02:26:16:B2:37:AD Trimble 6
FA:8F:CA:88:F9:8E Chromecast6286 6
00:22:07:3F:67:6B Inteno-676C 13

Upvotes: 1
