Reputation: 21
I have a file with 5 columns that looks like this:
15642 G A.aa,, 0.77501 107
15643 G A.a,.A, 0.7570 17
15644 C t.TtTt,.T, 0.7501 10
I'm trying to reduce the 3rd column (a run of A/a or T/t characters mixed with punctuation) to just "A" or "T". Desired output:
15642 G A 0.77501 107
15643 G A 0.7570 17
15644 C T 0.7501 10
I've tried various awk methods without success. I'd sincerely appreciate any help. Thanks!
Upvotes: 1
Views: 44
Reputation: 58478
This might work for you (GNU sed):
sed -ri 's/(\S)\S*/\U\1/3' file
This replaces the third field with its first character, converted to uppercase. Note that -i edits the file in place.
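To preview the result before letting -i overwrite anything, the same substitution can be run without -i. This is a minimal sketch, assuming GNU sed (for \S and \U); the temporary file and the `out` variable are only there for illustration.

```shell
# Build a throwaway sample file from two of the question's lines.
tmp=$(mktemp)
printf '%s\n' '15642 G A.aa,, 0.77501 107' '15644 C t.TtTt,.T, 0.7501 10' > "$tmp"

# Without -i, sed writes to stdout and leaves the file untouched.
# The trailing 3 selects the third match of (\S)\S*, i.e. the third field;
# \U uppercases the captured first character (a GNU extension).
out=$(sed -r 's/(\S)\S*/\U\1/3' "$tmp")
printf '%s\n' "$out"

rm -f "$tmp"
```

Once the output looks right, adding -i back applies the change to the file itself.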
Upvotes: 0
Reputation: 204164
There are many possibilities, including:
$ awk '{sub(/\..*/,"",$3)} 1' file
15642 G A 0.77501 107
15643 G A 0.7570 17
15644 C t 0.7501 10
or
$ awk '{$3=substr($3,1,1)} 1' file
15642 G A 0.77501 107
15643 G A 0.7570 17
15644 C t 0.7501 10
or
$ awk '{$3=toupper(substr($3,1,1))} 1' file
15642 G A 0.77501 107
15643 G A 0.7570 17
15644 C T 0.7501 10
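One thing to keep in mind with all of the above: assigning to $3 makes awk rebuild the record using OFS, which is a single space by default. The sample in the question looks space-separated, so that is harmless there. If the real file were tab-separated (an assumption, the question does not say), a sketch like the following would keep the tabs:

```shell
# Hypothetical tab-separated version of two of the sample lines.
out=$(printf '15642\tG\tA.aa,,\t0.77501\t107\n15644\tC\tt.TtTt,.T,\t0.7501\t10\n' |
  awk 'BEGIN { FS = OFS = "\t" }              # split and rejoin on tabs
       { $3 = toupper(substr($3, 1, 1)) }      # keep first char of field 3, uppercased
       1')                                     # print every (rebuilt) record
printf '%s\n' "$out"
```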
Upvotes: 1
Reputation: 133650
The following awk command may help:
awk '$3~/[Aa]/{$3="A"} $3~/[Tt]/{$3="T"} 1' Input_file
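Note that /[Aa]/ and /[Tt]/ match anywhere in the field, so a field that merely contains one of those letters is rewritten too. If only the leading character should decide, the patterns can be anchored with ^. This is a sketch; the third input line is made up to show the difference and is not from the question.

```shell
# The last input line is hypothetical: field 3 starts with "g" but contains "T" and "a".
out=$(printf '%s\n' \
    '15642 G A.aa,, 0.77501 107' \
    '15644 C t.TtTt,.T, 0.7501 10' \
    '15645 A g.Ta, 0.5 3' |
  awk '$3 ~ /^[Aa]/ { $3 = "A" }   # field 3 begins with A or a
       $3 ~ /^[Tt]/ { $3 = "T" }   # field 3 begins with T or t
       1')                         # print every record, changed or not
printf '%s\n' "$out"
```

With the anchors, the made-up third line passes through unchanged, whereas the unanchored patterns would turn its third field into "A".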
Upvotes: 1