Mike A.

Reputation: 398

Extracting columns from multiple files into a single output file from the command line

Say I have a tab-delimited data file with 10 columns. With awk, it's easy to extract column 7, for example, and output that into a separate file. (See this question, for example.)

What if I have 5 such data files, and I would like to extract column 7 from each of them and make a new file with 5 data columns, one for the column 7 of each input file? Can this be done from the command line with awk and other commands?

Or should I just write up a Python script to handle it?

Upvotes: 1

Views: 1293

Answers (2)

nu11p01n73R

Reputation: 26667

awk '{a[FNR] = a[FNR] " " $7} END{for(i=1; i<=FNR; i++) print a[i]}' file1 file2 file3 file4 file5

The array a accumulates, for each line number, the column-7 values from every file.

FNR is the number of records read so far from the current input file; it is reset at the beginning of each file, so lines at the same position in different files share the same index.

END{for(i=1; i<=FNR; i++) print a[i]} prints the contents of a after all files have been read. In the END block FNR holds the line count of the last file, and since FNR starts at 1 the loop runs from 1 to FNR (this assumes all files have the same number of lines).
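A quick check of this approach on toy data (file names f1 and f2 are placeholders; note that since FNR starts at 1, the END loop should run from 1 to FNR):

```shell
# Two assumed demo files, 7 tab-separated columns each;
# column 7 holds A/B and X/Y respectively.
printf '1\t2\t3\t4\t5\t6\tA\n1\t2\t3\t4\t5\t6\tB\n' > f1
printf '1\t2\t3\t4\t5\t6\tX\n1\t2\t3\t4\t5\t6\tY\n' > f2

# Collect column 7 of every file, keyed by line number.
awk '{a[FNR] = a[FNR] " " $7} END{for(i=1; i<=FNR; i++) print a[i]}' f1 f2
# prints:
#  A X
#  B Y
```

Each output line starts with a stray leading space, because the separator is prepended even before the first value; the conditional OFS in the other answer avoids that.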

Upvotes: 1

Etan Reisner

Reputation: 80921

If the data is small enough to store it all in memory, then this should work:

awk '{out[FNR]=out[FNR] (out[FNR]?OFS:"") $7; max=(FNR>max)?FNR:max} END {for (i=1; i<=max; i++) {print out[i]}}' file1 file2 file3 file4 file5

If it isn't, then you would need something fancier that can seek around in the file streams or read single lines from multiple files in lockstep (a shell loop with N calls to read could do this).
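The shell-loop idea can be sketched like this for two files (file names f1 and f2 are placeholders; extend with one descriptor per extra file):

```shell
# Assumed demo files; any tab-delimited files of equal length would do.
printf '1\t2\t3\t4\t5\t6\tA\n1\t2\t3\t4\t5\t6\tB\n' > f1
printf '1\t2\t3\t4\t5\t6\tX\n1\t2\t3\t4\t5\t6\tY\n' > f2

# Read one line at a time from each file on its own descriptor, so only
# one line per file is ever held in memory, then cut out column 7.
while read -r line1 <&3 && read -r line2 <&4; do
    printf '%s\t%s\n' "$(printf '%s' "$line1" | cut -f7)" \
                      "$(printf '%s' "$line2" | cut -f7)"
done 3<f1 4<f2
```

This prints A with X, then B with Y, tab-separated, and never holds more than one line per file in memory.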

Upvotes: 0
