Reputation: 5021

How to parse only selected column values using awk

I have a sample flat file which contains the following block:

test my array which array is better array huh got it?

INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN

I also have an array (ksh_arr2), which was defined like this:

ksh_arr2=$(awk '{if(NR==1){for(i=1;i<=NF;i++){if($i~/^arr/){print i}}}}' testUnix.txt)

and contains the following integers:

3 5 8

Now I want to print only the column values at those positions, i.e. the third, fifth and eighth. I also want the output from the 2nd line onwards. So I tried the following:

awk '{for(i=1;i<=NF;i++){if(NR >=1 && i=${ksh_arr2[i]}) do print$i ; done}}' testUnix.txt

but it is not printing the desired output. What am I missing? Please help.

Upvotes: 0

Answers (3)

John1024

Reputation: 113814

Since no sample output is shown, I don't know if this output is what you want. It is the output one gets from the code provided with the minimal changes required to get it to run:

$ awk -v k='3 5 8' 'BEGIN{split(k,a," ");} {for(i=1;i<=length(a);i++){print $a[i]}}' testUnix.txt 
array
array
array



SA
AUS
ARZ

The above code prints out the selected columns in the same order supplied by the variable k.

Notes

The awk code never defined ksh_arr2. I presume that the value of this array was meant to be passed in from the shell. That is done here by using the -v option to set the awk variable k to the value of ksh_arr2.
It is not possible to pass an array into awk directly. It is possible to pass in a string, as above, and then convert it to an array using the split function. Above, the string k is converted to the awk array a.
awk syntax is different from shell syntax. For instance, awk loops do not use the shell's do ... done form, and shell expansions such as ${ksh_arr2[i]} are not evaluated inside a single-quoted awk program.

Details

-v k='3 5 8'

This defines an awk variable k. To do this programmatically, replace 3 5 8 with a string or array from the shell.

BEGIN{split(k,a," ");}

This converts the space-separated values in variable k into an array named a.

for(i=1;i<=length(a);i++){print $a[i]}

This prints out each column in array a in order.
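
To illustrate the "programmatically" part, here is a minimal sketch, assuming a POSIX shell. The variable names cols and result and the temporary file are my own, not from the question. It computes the column numbers from the first line, joins them into one space-separated string, and hands that string to awk with -v:

```shell
# Recreate the two sample lines from the question in a temporary file
# (the file and the names cols/result are illustrative assumptions).
file=$(mktemp)
printf '%s\n' 'test my array which array is better array huh got it?' \
              'INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN' > "$file"

# Collect the numbers of the first-line columns that start with "arr".
# tr joins awk's one-per-line output into a single space-separated string.
cols=$(awk 'NR==1 {for (i=1; i<=NF; i++) if ($i ~ /^arr/) print i}' "$file" | tr '\n' ' ')

# Pass the string in with -v, split it into the awk array a, and print
# the selected columns of every line, one value per output line.
result=$(awk -v k="$cols" 'BEGIN {n = split(k, a, " ")}
                           {for (i=1; i<=n; i++) print $a[i]}' "$file")
printf '%s\n' "$result"
rm -f "$file"
```

This version loops up to the count returned by split rather than length(a), since length on an array is not available in every awk.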

Alternate Output

If you want to keep the output from each line on a single line:

$ awk -v k='3 5 8' 'BEGIN{split(k,a," ");} {for(i=1;i<length(a);i++) printf "%s ",$a[i]; print $a[length(a)]}' testUnix.txt 
array array array

SA AUS ARZ
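
If the column numbers are only ever needed inside awk, the shell variable may not be necessary at all. The following is a sketch of a single-pass alternative, not what the question's code does. It assumes the first line is a header that should not be printed, and the array name cols and the temporary file are my own:

```shell
# Recreate the two sample lines from the question in a temporary file
# (illustrative only).
file=$(mktemp)
printf '%s\n' 'test my array which array is better array huh got it?' \
              'INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN' > "$file"

result=$(awk '
  NR == 1 {                      # first line: remember the matching column numbers
    for (i = 1; i <= NF; i++)
      if ($i ~ /^arr/) cols[++n] = i
    next                         # do not print anything for the first line
  }
  {                              # later lines: print the remembered columns
    line = ""
    for (j = 1; j <= n; j++)
      line = line (j > 1 ? " " : "") $cols[j]
    print line
  }
' "$file")
printf '%s\n' "$result"
rm -f "$file"
```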

Upvotes: 1

user3442743

Reputation:

How I would approach it:

awk -vA="${ksh_arr2[*]}" 'BEGIN{split(A,B," ")}{for(i in B)print $B[i]}' file

Explanation

 -vA="${ksh_arr2[*]}"       -  Sets the awk variable A to the expanded ksh array

 BEGIN{split(A,B," ")}      -  Splits the expanded string on spaces
                               (effectively recreating the array in awk)

 {for(i in B)print $B[i]}   -  For each index in the new array, prints the field
                               whose number is stored at that index

Edit

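If you want to preserve the order of the fields when printing, then this would be better, because the order in which for(i in B) visits the indices is not guaranteed. Note that i has to be reset for every line, otherwise only the first line produces output:

awk -vA="${ksh_arr2[*]}" 'BEGIN{split(A,B," ")}{i=0;while(++i<=length(B))print $B[i]}' file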

Upvotes: 2

Jdamian

Reputation: 3115

awk 'NR>=1 { print $3 " " $5 " " $8 }' testUnix.txt
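
Note that NR>=1 is true for every line, so the command above also prints the first line. Since the question asks for output from the 2nd line onwards, a variant along these lines may be closer to what is wanted. This is a sketch under that assumption, and the temporary file is only there to make it self-contained:

```shell
# Recreate the two sample lines from the question in a temporary file
# (illustrative only).
file=$(mktemp)
printf '%s\n' 'test my array which array is better array huh got it?' \
              'INDIA USA SA NZ AUS ARG ARM ARZ GER BRA SPN' > "$file"

# NR > 1 skips the first line; the commas in print join the fields
# with the output field separator, which is a single space by default.
result=$(awk 'NR > 1 { print $3, $5, $8 }' "$file")
printf '%s\n' "$result"
rm -f "$file"
```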