Reputation: 149
For my file which looks like this:
AABBCC 10 5 CCAABB 100
BBCCAA 4 2 AABBCC 50
CCAABB 16 8 BBCCAA 20
... I am trying to sort columns 4 and 5, by matching column 4 to column 1.
Ideally this would return:
AABBCC 10 5 AABBCC 50
BBCCAA 4 2 BBCCAA 20
CCAABB 16 8 CCAABB 100
I have tried using sort, however as far as I'm aware it doesn't have a utility to sort within files.
Any help would be greatly appreciated!
Upvotes: 0
Views: 1322
Reputation: 198436
join -t $'\t' -o 1.1,1.2,1.3,2.1,2.2 <(cut -f1-3 file.tsv | sort -k 1,1) <(cut -f4- file.tsv | sort -k 1,1) | sort
Cut the original file, then join on the first field of both. We need to specify the full join fields in -o
to preserve the first column, or join
will eat it.
Upvotes: 2
Reputation: 92874
awk
solution:
awk 'NR==FNR{ a[$4]=$5; next }$1 in a{ print $1,$2,$3,$1,a[$1] }' file1 OFS="\t" file1
The output:
AABBCC 10 5 AABBCC 50
BBCCAA 4 2 BBCCAA 20
CCAABB 16 8 CCAABB 100
You may pipe to sort
at the end: ... | sort
Upvotes: 3
Reputation: 88776
With bash and GNU paste:
With temporary files for illustration:
cut -f 1-3 file | sort > file_1to3
cut -f 4-5 file | sort > file_4to5
paste -d '\t' file_1to3 file_4to5
Without temporary files:
paste -d '\t' <(cut -f 1-3 file | sort) <(cut -f 4-5 file | sort)
Output:
AABBCC 10 5 AABBCC 50 BBCCAA 4 2 BBCCAA 20 CCAABB 16 8 CCAABB 100
Upvotes: 2