awk: extract data from a column by name rather than position

Question

I have a text file that is comma delimited. The first line is a list of field names, and subsequent lines contain data. I'll get new versions of the file, and I want to extract all the values from a particular column by name rather than by column number. (I.e. the column I want may be in different positions in different versions of the file.)

For example, here are two files:

foo,bar,interesting,junk
1,2,gold,ramjet
2,25,diamonds,superfluous

and

foo,bar,baz,interesting,junk,morejunk
5,3,smurf,platinum,garbage,scrap
6,2.5,mushroom,sodium,liverwurst,eew

I'd like a single script that will go through multiple files, extracting the minerals in the "interesting" column. :-)

What I've got so far is something that works on ONE file, but I know that awk is more elegant than this. How do I clean this up and make it work on multiple files at once?

BEGIN {
    FS=",";
}

NR == 1 {
    for(i=1; i<=NF; i++) {
        if($i=="interesting") {
            col=i;
        }
    }
}

NR > 1 {
  print $col;
}

ghoti · Accepted Answer

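You're pretty darn close already. Just use FNR instead of NR. FNR ("File NR") is the record number within the current file, so it resets to 1 at the start of each input file, whereas NR keeps counting across all of them.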
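To see the two counters side by side, here is a small sketch that prints both for every input line. The scratch files a.csv and b.csv and the helper name show_counters are made up for illustration.

```shell
# Two tiny sample files in a scratch directory (names are illustrative).
tmp=$(mktemp -d)
printf 'foo,interesting\n1,gold\n' > "$tmp/a.csv"
printf 'interesting,foo\nplatinum,5\n' > "$tmp/b.csv"

# Print both record counters in front of every input line.
show_counters() {
  awk '{ print "NR=" NR, "FNR=" FNR, $0 }' "$@"
}

show_counters "$tmp/a.csv" "$tmp/b.csv"
# NR=1 FNR=1 foo,interesting
# NR=2 FNR=2 1,gold
# NR=3 FNR=1 interesting,foo
# NR=4 FNR=2 platinum,5
```

So FNR==1 matches the header line of every file, not just the first one. Applied to your script: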

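#!/usr/bin/awk -f

BEGIN { FS="," }

# Header line of each file: stop with col at the index of "interesting".
# "next" also skips the print rule below, so the header is not printed.
FNR==1 {
  for (col=1;col<=NF;col++)
    if ($col=="interesting")
      next
}

# Data lines: print the field at that index.
{ print $col }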

Or if you like:

#!/usr/bin/awk -f

BEGIN { FS="," }

FNR==1 { for (col=1;$col!="interesting";col++); next }

{ print $col }

Or if you prefer one-liners:

$ awk -F, -v txt="interesting" 'FNR==1{for(c=1;$c!=txt;c++);next} {print $c}' file1 file2

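Of course, be careful that the specified column actually exists. The loops in the last two versions have no upper bound, and a field past NF is just an empty string that never matches, so a missing column means an endless loop. You can probably figure out the extra condition that saves you from that risk.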
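One way to add that extra condition, as a sketch rather than the only answer: bound the loop with c <= NF and decide what happens when the header is absent. The wrapper name extract_col, the warning text, and the choice to skip such a file are assumptions made for illustration. Writing to "/dev/stderr" is widely supported, but treat it as an assumption about your awk.

```shell
# extract_col NAME FILE...
# Print the values of the column headed NAME from each comma-separated FILE.
extract_col() {
  name=$1
  shift
  awk -F, -v txt="$name" '
    FNR == 1 {
      # c <= NF bounds the search, so a missing header cannot loop forever.
      for (c = 1; c <= NF && $c != txt; c++);
      if (c > NF) {
        # Assumed policy: warn, then skip the rest of this file (c = 0).
        print FILENAME ": no column named " txt > "/dev/stderr"
        c = 0
      }
      next
    }
    c { print $c }   # c is 0 (false) for files that lack the column
  ' "$@"
}

# Demo with the two files from the question, written to a scratch directory.
dir=$(mktemp -d)
printf 'foo,bar,interesting,junk\n1,2,gold,ramjet\n2,25,diamonds,superfluous\n' > "$dir/file1"
printf 'foo,bar,baz,interesting,junk,morejunk\n5,3,smurf,platinum,garbage,scrap\n6,2.5,mushroom,sodium,liverwurst,eew\n' > "$dir/file2"

extract_col interesting "$dir/file1" "$dir/file2"
# gold
# diamonds
# platinum
# sodium
```

Resetting c to 0 instead of exiting means one bad file does not stop the others from being processed; exiting with an error would be an equally reasonable choice.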

Note that in awk, you only need a semicolon after a command if another command follows it on the same line. Thus, you would do this:

command1; command2

But you can drop the semicolon if you separate commands with newlines:

command1
command2
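As a concrete sketch of the same rule, here is one action written both ways. The function names and the sample input line are illustrative only.

```shell
# Two statements on one line: the semicolon separates them.
one_line() {
  echo '1,2,gold,ramjet' | awk -F, '{ n = $2; print n, $3 }'
}

# The same two statements on separate lines: no semicolon needed.
multi_line() {
  echo '1,2,gold,ramjet' | awk -F, '{
    n = $2
    print n, $3
  }'
}

one_line     # prints: 2 gold
multi_line   # prints: 2 gold
```

A trailing semicolon, as in the question's FS=",";, is harmless. It is just not required.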
