Reputation: 1059
File Example
I have between 3 and 10 files with:
- different number of columns
- same number of rows
- inconsistent spacing (sometimes a single space, sometimes tabs, sometimes several spaces) **within** the same file, as in the example below
> 0 55.4 9.556E+09 33
> 1 1.3 5.345E+03 1
> ........
> 33 134.4 5.345E+04 932
>
........
I need to get column (say) 1 from file1, column 3 from file2, column 7 from file3 and column 1 from file4 and combine them into a single file, side by side.
Trial 1: not working
paste <(cut -d[see below] -f1 file1) <(cut -d[see below] -f3 file2) [...]
where the delimiter was ' ' or empty.
Trial 2: working with 2 files but not with many files
awk '{ a1=$1;b1=$4; getline <"D2/file1.txt"; print a1,$1,b1,$4 }' D1/file1.txt >D3/file1.txt
Now the more general question:
How can I extract different columns from many different files?
Upvotes: 6
Views: 33289
Reputation: 67929
In your `paste`/`cut` attempt, replace `cut` with `awk`:
$ paste <(awk '{print $1}' file1 ) <(awk '{print $3}' file2 ) <(awk '{print $7}' file3) <(awk '{print $1}' file4)
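If the list of files grows, the repeated awk calls can be wrapped in a small helper. This is only a minimal sketch; the function name `col` is made up for illustration and is not a standard tool:

```shell
# col N FILE: print whitespace-separated field N of every line in FILE.
# "col" is a hypothetical helper name, not part of awk or paste.
col() {
    awk -v n="$1" '{ print $n }' "$2"
}
```

With that defined, the command above becomes `paste <(col 1 file1) <(col 3 file2) <(col 7 file3) <(col 1 file4)`. The `<(...)` process substitution needs bash, zsh or ksh, and paste joins the columns with a tab unless you pass `-d`.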
Upvotes: 21
Reputation: 54592
Assuming each of your files has the same number of rows, here's one way using GNU awk. Run like:
awk -f script.awk file1.txt file2.txt file3.txt file4.txt
Contents of script.awk:
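# For each input file, remember the wanted column, indexed by row number (FNR).
FILENAME == ARGV[1] { one[FNR]=$1 }
FILENAME == ARGV[2] { two[FNR]=$3 }
FILENAME == ARGV[3] { three[FNR]=$7 }
FILENAME == ARGV[4] { four[FNR]=$1 }
END {
    # length(one) on an array works in GNU awk; it gives the number of rows read.
    for (i=1; i<=length(one); i++) {
        print one[i], two[i], three[i], four[i]
    }
}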
Note:
By default, awk separates columns on whitespace: any run of spaces and tab characters counts as a single separator. This makes awk ideal for files with inconsistent spacing. You can also expand the above code to include more files if you wish.
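One way to extend the script above to any number of files is to pass the column number together with each file name. The following is a sketch under a few assumptions: the `FILE:COLUMN` argument syntax and the name `pick_columns` are invented here, every file is non-empty, and all files have the same number of rows.

```shell
# pick_columns FILE:COL [FILE:COL ...]
# Print column COL of each FILE side by side, separated by a single space.
# The name and the FILE:COL syntax are hypothetical, chosen for this example.
pick_columns() {
    awk '
    BEGIN {
        if (ARGC < 2) {
            print "usage: pick_columns FILE:COL ..." > "/dev/stderr"
            exit 2
        }
        for (i = 1; i < ARGC; i++) {
            # Split each argument at its trailing ":<number>".
            pos = match(ARGV[i], /:[0-9]+$/)
            if (!pos) {
                print "bad argument: " ARGV[i] > "/dev/stderr"
                exit 2
            }
            col[i] = substr(ARGV[i], pos + 1) + 0
            ARGV[i] = substr(ARGV[i], 1, pos - 1)   # leave only the file name
        }
        nfiles = ARGC - 1
    }
    FNR == 1 { f++ }                 # a new input file starts (assumes no empty files)
    {
        cell[f, FNR] = $(col[f])     # wanted field of this file, keyed by file and row
        if (FNR > rows) rows = FNR
    }
    END {
        for (r = 1; r <= rows; r++) {
            line = cell[1, r]
            for (k = 2; k <= nfiles; k++) line = line OFS cell[k, r]
            print line
        }
    }' "$@"
}
```

For the files in the question this would be called as `pick_columns file1:1 file2:3 file3:7 file4:1 > combined.txt`. Everything is held in memory until the end, which is fine for small tables but may not suit very large files.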
Upvotes: 8
Reputation:
The combination of `cut` and `paste` should work:
$ cat f1
foo
bar
baz
$ cat f2
1 2 3
4 5 6
7 8 9
$ cat f3
a b c d
e f g h
i j k l
$ paste -d' ' <(cut -f1 f1) <(cut -d' ' -f2 f2) <(cut -d' ' -f3 f3)
foo 2 c
bar 5 g
baz 8 k
Edit: This works with tabs, too:
$ cat f4
a b c d
e f g h
i j k l
$ paste -d' ' <(cut -f1 f1) <(cut -d' ' -f2 f2) <(cut -f3 f4)
foo 2 c
bar 5 g
baz 8 k
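One caveat for files like those in the question: cut treats every single delimiter character as a field boundary, so two consecutive spaces produce an empty field, and a tab is not a delimiter when `-d' '` is given. If you want to stay with cut, the blanks can be normalised first. A rough sketch, with `cut_col` as a made-up helper name:

```shell
# cut_col N FILE: print field N of FILE after squeezing all blanks.
# "cut_col" is a hypothetical helper name used only for illustration.
cut_col() {
    tr '\t' ' ' < "$2" |   # turn tabs into spaces
        tr -s ' ' |        # squeeze runs of spaces into a single space
        sed 's/^ //' |     # drop a leading space left over from indentation
        cut -d' ' -f"$1"
}
```

It can then be used inside paste in the same way as above, for example `paste -d' ' <(cut_col 1 f1) <(cut_col 2 f2)`.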
Upvotes: 1