Reputation: 358
I need to figure out how to reorder, in bash, a text file produced by another script. It consists of many lines that follow a fixed three-line scheme. Here are 12 of them:
>NODE_3
nucleotide_cov: 170.3683
GC_CONT: 37.00
>NODE_18
nucleotide_cov: 168.8670
GC_CONT: 37.00
>NODE_23
nucleotide_cov: 178.0648
GC_CONT: 35.00
>NODE_41
nucleotide_cov: 174.4054
GC_CONT: 36.00
The output needed is:
GC_CONT: 37.00 nucleotide_cov: 170.3683 >NODE_3
GC_CONT: 37.00 nucleotide_cov: 168.8670 >NODE_18
GC_CONT: 35.00 nucleotide_cov: 178.0648 >NODE_23
GC_CONT: 36.00 nucleotide_cov: 174.4054 >NODE_41
where the columns are separated by tabs ("\t") and GC_CONT has to be the first column. awk solutions preferred.
EDIT
I'll try to be clearer. Here's the output I get by using
awk 'NR%3{printf "%s ",$0;next;}1' input.txt
>NODE_3 nucleotide_cov: 170.3683 GC_CONT: 37.00
>NODE_18 nucleotide_cov: 168.8670 GC_CONT: 37.00
>NODE_23 nucleotide_cov: 178.0648 GC_CONT: 35.00
>NODE_41 nucleotide_cov: 174.4054 GC_CONT: 36.00
Good, but I need to reorder the fields so that "GC_CONT:" comes at the beginning of every line.
Upvotes: 1
Views: 986
Reputation: 58488
This might work for you (GNU sed):
sed -r ':a;N;/\n>/!s/(.*)\n(.*)/\2\t\1/;ta;P;D' file
This collects the lines that make up a record, moving each newly appended line to the front and replacing the newlines with tabs.
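The same loop can be spread over several lines with comments. This is only a sketch, assuming GNU sed; the temporary file and the variable names `input` and `result` are illustrative and not part of the original command:

```shell
# Two sample records from the question, written to a temporary file.
input=$(mktemp)
printf '%s\n' '>NODE_3' 'nucleotide_cov: 170.3683' 'GC_CONT: 37.00' \
              '>NODE_18' 'nucleotide_cov: 168.8670' 'GC_CONT: 37.00' > "$input"

result=$(sed -r '
  :a
  # append the next input line to the pattern space
  N
  # unless the appended line starts a new record, move it to the front
  /\n>/!s/(.*)\n(.*)/\2\t\1/
  # if a swap happened, loop back and collect another line
  ta
  # otherwise print the finished record (text up to the first newline)
  P
  # delete it and restart the script with the new header line
  D
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

At end of input, `N` finds no further line; GNU sed then prints the pattern space and exits, which is how the last record gets written.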
Upvotes: 1
Reputation: 19325
Just for information, since the question is tagged sed:
sed -r '/^>/{N;N;s/(.*)\n(.*)\n(.*)/\3\t\2\t\1/g}'
Upvotes: 2
Reputation: 92884
Choose the one you like.
A simple awk one-liner:
awk '/^>/{ n=NR; r=$0; next }{ r=$0 OFS r; if (NR-n==2) print r }' OFS='\t' input.txt
Or an awk solution that relies on the strict order of lines:
awk '/^>/{ r1=$0; n=NR }
n{ if (NR == n+1) r2=$0; else if (NR == n+2) print $0, r2, r1 }' OFS='\t' input.txt
The output:
GC_CONT: 37.00 nucleotide_cov: 170.3683 >NODE_3
GC_CONT: 37.00 nucleotide_cov: 168.8670 >NODE_18
GC_CONT: 35.00 nucleotide_cov: 178.0648 >NODE_23
GC_CONT: 36.00 nucleotide_cov: 174.4054 >NODE_41
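If the one-liner is hard to read, here is the first command spread over several lines with comments. It is a sketch only: the sample data is fed through a here-document instead of input.txt, and the variable name `out` is just for illustration.

```shell
out=$(awk -v OFS='\t' '
  # A header line starts a new record: remember its line number and text.
  /^>/ { n = NR; r = $0; next }
  # Any other line is prepended to the record built so far.
  {
    r = $0 OFS r
    # Two lines after the header the record is complete, so print it.
    if (NR - n == 2) print r
  }
' <<'EOF'
>NODE_3
nucleotide_cov: 170.3683
GC_CONT: 37.00
>NODE_23
nucleotide_cov: 178.0648
GC_CONT: 35.00
EOF
)

printf '%s\n' "$out"
```

Because each line is prepended rather than appended, the last line of a record (GC_CONT) ends up first and the header ends up last, which is the order asked for.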
Upvotes: 2
Reputation: 6073
Try this awk script:
/^>/ {node=$0}
/^nucl/ {nucl=$0}
/^GC/ {print $0 "\t" nucl "\t" node}
Or, from the command line:
awk '/^>/{node=$0} /^nucl/{nucl=$0} /^GC/{print $0 "\t" nucl "\t" node}' input_file
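To run the script form, it can be saved to a file and passed to awk with `-f`. A minimal sketch, where `reorder.awk`, `input_file`, and the scratch directory are illustrative names only:

```shell
# Work in a scratch directory so nothing existing is overwritten.
dir=$(mktemp -d)

cat > "$dir/reorder.awk" <<'EOF'
/^>/    { node = $0 }                       # remember the header line
/^nucl/ { nucl = $0 }                       # remember the coverage line
/^GC/   { print $0 "\t" nucl "\t" node }    # the GC line completes the record
EOF

# One sample record from the question.
printf '%s\n' '>NODE_41' 'nucleotide_cov: 174.4054' 'GC_CONT: 36.00' > "$dir/input_file"

line=$(awk -f "$dir/reorder.awk" "$dir/input_file")
printf '%s\n' "$line"
rm -rf "$dir"
```

The rules key on line prefixes rather than on line position, so the header and coverage lines can come in either order, but the GC_CONT line must still be the last line of each record, since that is where the print happens.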
Upvotes: 2