nullByteMe

Reputation: 6391

How can I specify a row in awk in for loop?

I'm using the following awk command:

my_command | awk -F "[[:space:]]{2,}+" 'NR>1 {print $2}' | egrep "^[[:alnum:]]"

which successfully returns my data like this:

fileName1
file Name 1
file Nameone
f i l e Name 1

So as you can see some file names have spaces. This is fine as I'm just trying to echo the file name (nothing special). The problem is calling that specific row within a loop. I'm trying to do it this way:

i=1
for num in $rows
do
  fileName=$(my_command | awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]])"
  echo "$num $fileName"
  $((i++))
done

But my output is always null

I've also tried using awk -v record=$i and then printing $record but I get the below results.

f i l e Name 1

EDIT

Sorry for the confusion: rows is a variable that lists ids like this 11 12 13, and each of those ids ties to a file name. The output of my command without any parsing looks like this:

     id      File Info      OS
     11      File Name1     OS1
     12      Fi leNa me2    OS2
     13      FileName 3     OS3

I can only use the id field to run the command that I need, but I want to use the File Info field to tell the user which file the command is being executed against.

Upvotes: 1

Views: 3160

Answers (4)

konsolebox

Reputation: 75458

I think your $i does not expand as expected. You should quote your arguments this way:

  fileName=$(my_command | awk -F "[[:space:]]{2,}+" "NR==$i {print \$2}" | egrep "^[[:alnum:]]")

And you forgot the other ).
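To make the quoting difference concrete, here is a minimal sketch. The my_command function is a hypothetical stand-in that prints a header and two rows, and the field separator is written as '  +' (two or more spaces) purely so the example runs on awk implementations without interval support; neither detail comes from the question.

```shell
# Hypothetical stand-in for my_command: a header line plus two data rows,
# with columns separated by two spaces.
my_command() {
    printf '%s\n' 'id  File Info  OS' '11  File Name1  OS1' '12  Fi leNa me2  OS2'
}

i=2

# Single quotes: the shell leaves $i alone, so awk sees the field reference $i.
# The awk variable i is unset (treated as 0), so this compares NR with $0.
single=$(my_command | awk -F '  +' 'NR==$i {print $2}')

# Double quotes: the shell substitutes the value of i before awk starts;
# \$2 is escaped so that it still reaches awk as a field reference.
double=$(my_command | awk -F '  +' "NR==$i {print \$2}")

printf 'single: [%s]\n' "$single"
printf 'double: [%s]\n' "$double"
```

With this stub the single-quoted form matches nothing, while the double-quoted form prints the file name from the second line.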

EDIT

As an update for your requirement, you could pass the rows to a single awk command instead of running it repeatedly inside a loop:

#!/bin/bash

ROWS=(11 12)

function my_command {
    # This function just emulates my_command and should be removed later.
    echo "     id      File Info      OS
     11      File Name1     OS1
     12      Fi leNa me2    OS2
     13      FileName 3     OS3"
}

awk -- '
    BEGIN {
        input = ARGV[1]
        while (getline line < input) {
            sub(/^ +/, "", line)
            split(line, a, /   +/)
            for (i = 2; i < ARGC; ++i) {
                if (a[1] == ARGV[i]) {
                    printf "%s %s\n", a[1], a[2]
                    break
                }
            }
        }
        exit
    }
' <(my_command) "${ROWS[@]}"

That awk command could be condensed to one line as:

awk -- 'BEGIN { input = ARGV[1]; while (getline line < input) { sub(/^ +/, "", line); split(line, a, /   +/); for (i = 2; i < ARGC; ++i) { if (a[1] == ARGV[i]) { printf "%s %s\n", a[1], a[2]; break; }; }; }; exit; }' <(my_command) "${ROWS[@]}"

Or better yet just use Bash instead as a whole:

#!/bin/bash

# extglob must be enabled for the +( ) pattern used in the substitution below
shopt -s extglob

ROWS=(11 12)

while IFS=$' ' read -r LINE; do
    IFS='|' read -ra FIELDS <<< "${LINE//  +( )/|}"
    for R in "${ROWS[@]}"; do
        if [[ ${FIELDS[0]} == "$R" ]]; then
            echo "${R} ${FIELDS[1]}"
            break
        fi
    done
done < <(my_command)

It should give an output like:

11 File Name1
12 Fi leNa me2
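As a variation on the same idea, the wanted ids can be handed to awk as one string and turned into a lookup table, which avoids the inner loop over ARGV. This is only a sketch under assumptions: my_command is stubbed with the sample table from the question, and the names ids, list, want and col are made up for the example.

```shell
# Hypothetical stand-in for my_command, reproducing the sample table.
my_command() {
    printf '%s\n' \
        '     id      File Info      OS' \
        '     11      File Name1     OS1' \
        '     12      Fi leNa me2    OS2' \
        '     13      FileName 3     OS3'
}

rows='11 12'

result=$(my_command | awk -v ids="$rows" '
    BEGIN {
        # Build a set of wanted ids from the space-separated list.
        n = split(ids, list, " ")
        for (k = 1; k <= n; k++) want[list[k]] = 1
    }
    {
        sub(/^ +/, "")           # drop the leading indentation
        split($0, col, /  +/)    # columns are separated by two or more spaces
        if (col[1] in want) print col[1], col[2]
    }
')

printf '%s\n' "$result"
```

The header line is skipped automatically because "id" is not in the set of wanted ids.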

Upvotes: 2

Mark Reed

Reputation: 95242

It's pretty inefficient to rerun my_command (and awk) every time through the loop just to extract one line from its output. Especially when all you're doing is printing out part of each line in order. (I'm assuming that my_command really is exactly the same command and produces the same output every time through your loop.)

If that's the case, this one-liner should do the trick:

paste -d' ' <(printf '%s\n' $rows) <(my_command | 
  awk -F '[[:space:]]{2,}+' '($2 ~ /^[[:alnum:]]/) {print $2}')

Upvotes: 1

Ed Morton

Reputation: 203209

As you already heard, you need to populate an awk variable from your shell variable to be able to use the desired value within the awk script, so this:

awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]]"

should be this:

awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"

Also, though, you don't need awk AND grep, since awk can do anything grep can do, so you can change this part of your script:

awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"

to this:

awk -v i="$i" -F "[[:space:]]{2,}+" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'

and you don't need a + after a numeric range so you can change {2,}+ to just {2,}:

awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'

Most importantly, though, instead of invoking awk once for every invocation of my_command, you can just invoke it once for all of them, i.e. instead of this (assuming this does what you want):

i=1
for num in $rows
do
  fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}')
  echo "$num $fileName"
  i=$((i + 1))
done

you can do something more like this:

for num in $rows
do
  my_command
done |
awk -F '[[:space:]]{2,}' '$2~/^[[:alnum:]]/{print NR, $2}'

I say "something like" because you don't tell us what "my_command", "rows" or "num" are so I can't be precise but hopefully you see the pattern. If you give us more info we can provide a better answer.
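Given the table layout shown in the question's edit, one possible shape for the whole task is to run my_command once, let awk emit id and file name separated by a tab, and read those pairs in a shell loop. This is a sketch, not a definitive solution: my_command is a hypothetical stub, the printf line stands in for whatever command really needs the id, and the separator is written as '  +' for portability.

```shell
# Hypothetical stand-in for my_command.
my_command() {
    printf '%s\n' \
        'id  File Info  OS' \
        '11  File Name1  OS1' \
        '12  Fi leNa me2  OS2' \
        '13  FileName 3  OS3'
}

rows='11 13'
tab=$(printf '\t')

report=$(
    my_command |
    awk -v ids="$rows" -F '  +' '
        BEGIN { n = split(ids, list, " "); for (k = 1; k <= n; k++) want[list[k]] = 1 }
        ($1 in want) { print $1 "\t" $2 }   # emit "id<TAB>file name"
    ' |
    while IFS="$tab" read -r id name; do
        # Placeholder for the real per-id command; here we only notify the user.
        printf 'Running against "%s" (id %s)\n' "$name" "$id"
    done
)

printf '%s\n' "$report"
```

Splitting on a tab keeps file names with embedded spaces intact, which splitting on whitespace would not.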

Upvotes: 1

Barmar

Reputation: 780724

Shell variables aren't expanded inside single-quoted strings. Use the -v option to set an awk variable to the shell variable:

fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]")

This method avoids having to escape all the $ characters in the awk script, as required in konsolebox's answer.
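One detail worth spelling out: inside the awk script the variable is written without a dollar sign. In awk, $n means "field number n", which may be why printing $record in the question gave an unexpected result. A small sketch, with my_command as a hypothetical stub, n as an illustrative variable name, and '  +' used as the separator for portability:

```shell
# Hypothetical stand-in for my_command.
my_command() {
    printf '%s\n' 'id  File Info  OS' '11  File Name1  OS1' '12  Fi leNa me2  OS2'
}

i=3

# n used as a plain variable: selects the line whose number equals 3.
by_row=$(my_command | awk -v n="$i" -F '  +' 'NR == n {print $2}')

# $n used as a field reference: prints field number 3 of the second line.
by_field=$(my_command | awk -v n="$i" -F '  +' 'NR == 2 {print $n}')

printf 'by_row:   %s\n' "$by_row"
printf 'by_field: %s\n' "$by_field"
```

With this stub, by_row holds the file name from line 3 and by_field holds the OS column from line 2.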

Upvotes: 2
