Search and print specific columns from tab delimited file?

Question

I can use awk to print the nth column from a file, and the cut command can do something similar, but I need the columns to be selected by name. For example:

col1 col2 col3 col4
2 5 3 1
6 4 7 1 
3 6 5 9
7 9 7 8

and if I give a list of column names as input, e.g. col1, col3 (it is going to be a long list of column names, so it would help if the input could be an array)

the output would be

col1 col3
2 3
6 7 
3 5
7 7

Does anyone know how I might do this in bash?

John1024 · Accepted Answer

$ awk -v s="col1 col3" 'BEGIN{split(s,v," ");for (i=1;i<=length(v);i++)a[v[i]]=1} NR==1{split($0,b,"\t")} {for (i=1;i<=NF;i++)if (b[i] in a)printf "%s\t",$i;print""}' file
col1    col3
2       3
6       7
3       5
7       7

How it works

-v s="col1 col3"

Define an awk variable s containing a space-separated list of the columns that you want to keep.

BEGIN{split(s,v," ");for (i=1;i<=length(v);i++)a[v[i]]=1}

Create an associative array a whose keys are the column names listed in the string s. Each value is set to 1.

NR==1{split($0,b,"\t")}

On the first line, split the header on tabs and save the column names in the array b, indexed by column number.

for (i=1;i<=NF;i++)if (b[i] in a)printf "%s\t",$i;print""

For each column i, if the column name b[i] is a key in array a, print the column followed by a tab.

To finish, print "" prints a newline.
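
Since the question asks for the column names to come from a bash array, here is a minimal sketch of one way to wrap the same idea in a function. The function name select_columns, the sample data, and the temporary file are assumptions made for illustration and are not part of the answer above. The sketch also differs from the one-liner in three ways: it sets the field separator to a tab explicitly, it uses the return value of split instead of length(v) (which not every awk implementation accepts for arrays), and it joins the output fields with tabs so that no trailing tab is printed. It assumes the column names themselves contain no spaces.

```shell
#!/usr/bin/env bash

# Hypothetical helper: select_columns FILE NAME [NAME...]
# Prints the named columns of a tab-delimited FILE, in file order.
select_columns() {
    local file=$1
    shift
    local IFS=' '   # makes "$*" join the remaining arguments with spaces
    awk -F'\t' -v OFS='\t' -v s="$*" '
        # Build a lookup table of the wanted column names.
        BEGIN { n = split(s, v, " "); for (i = 1; i <= n; i++) want[v[i]] = 1 }
        # On the header line, remember the positions of the wanted columns.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i in want) keep[++k] = i }
        # On every line (header included), print only the remembered positions.
        {
            line = ""; sep = ""
            for (j = 1; j <= k; j++) { line = line sep $(keep[j]); sep = OFS }
            print line
        }
    ' "$file"
}

# Example data, written to a temporary file so the sketch is self-contained.
sample=$(mktemp)
printf 'col1\tcol2\tcol3\tcol4\n2\t5\t3\t1\n6\t4\t7\t1\n' > "$sample"

# The list of column names lives in a bash array.
cols=(col1 col3)
select_columns "$sample" "${cols[@]}"
```

Because the header positions are recorded once, the per-line work is a loop over the kept columns only, which may matter when the list of names is long.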
