Reputation: 351
Is there a better way (such as a one liner in AWK) where I can get the column number in a table with headings from a column name? I want to be able to process a column independent of what the column number actually is (such as when another column is added the script will not need to change).
For example, given the following table in "table.tsv":
ID Value Target Not Used
1 5 9 11
2 4 8 12
3 6 7 10
I can do a sort on the "Target" column using:
#!/bin/bash
(IFS=$'\t'; read -r; printf "%s\n" "$REPLY"; i=0; for col in $REPLY; do
((++i))
[ "$col" == "Target" ] && break
done; sort -t$'\t' "-k$i,${i}n") < table.tsv
Is there a way to do it without the for loop (or at least clean it up a little)?
The expected output of the given script is:
ID Value Target Not Used
3 6 7 10
2 4 8 12
1 5 9 11
However, I was trying to give an example of what I was trying to do. I want to pass/filter my table through several programs so the headings and all columns should be preserved: just have processing occur at each step. In pseudo code, what I would like to do is:
print headings from stdin
i=$(magic to determine column position given "Target")
sort -t$'\t' "-k$i,${i}n" # or whatever processing is required on that column
Upvotes: 7
Views: 13576
Reputation: 67507
another alternative with a lot of pipes
$ head -1 table | tr -s ' ' '\n' | nl -nln | grep "Target" | cut -f1
extract first row, transpose, number lines, find column name, extract number
Or, awk
to the rescue!
$ awk -v RS='\t' '/Target/{print NR; exit}' file.tsv
3
Upvotes: 11
Reputation: 785581
Here is an awk alternative:
awk -F '\t' -v col='Target' 'NR==1{for (i=1; i<=NF; i++) if ($i == col){c=i; break}}
{print $c}' file
EDIT: To print column number only:
awk -F '\t' -v col='Target' 'NR==1{for (i=1; i<=NF; i++) if ($i==col) {print i;exit}}' file
3
Upvotes: 6
Reputation: 204015
$ awk -v name='Target' '{for (i=1;i<=NF;i++) if ($i==name) print i; exit}' file
3
Upvotes: 3