Reputation: 31
I have a file like this:
1 CC AAA
1 Na AAA
1 Na AAA
1 Na AAA
1 Na AAA
1 CC BBB
1 Na BBB
1 Na BBB
1 xa BBB
1 CC CCC
1 Na CCC
1 da CCC
I would like to remove column 2 and replace it with "01" for AAA, "02" for BBB, and so on for the entire file. The final output should look like this:
1 01 AAA
1 01 AAA
1 01 AAA
1 01 AAA
1 01 AAA
1 02 BBB
1 02 BBB
1 02 BBB
1 02 BBB
1 03 CCC
1 03 CCC
1 03 CCC
I don't have any clue how to make this work. Please help me if possible. A new value starts at every CC; that is, the change from AAA to BBB can be tracked only by the CC in the 2nd column.
Upvotes: 0
Views: 82
Reputation: 212198
Seems like you want:
awk '$2=="CC" { a+=1 } {$2=sprintf("%02d",a)} 1' input
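For readers who want the one-liner unpacked, here is a rough sketch of the same idea with comments. The function name `renumber_by_marker` is just an illustrative label, not something from the question, and it assumes every group begins with a row whose second column is exactly `CC`.

```shell
# Hypothetical wrapper (name chosen for illustration only).
# Assumes each group starts with a row whose 2nd column is "CC".
renumber_by_marker() {
  awk '
    $2 == "CC" { n++ }                   # a CC row opens a new group
    { $2 = sprintf("%02d", n); print }   # overwrite column 2 with the zero-padded counter
  ' "$@"
}

# Reads stdin when no file name is given
printf '%s\n' '1 CC AAA' '1 Na AAA' '1 CC BBB' '1 xa BBB' | renumber_by_marker
```

Assigning to `$2` makes awk rebuild the line with the output field separator, which is a single space by default.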
Upvotes: 0
Reputation: 54392
Here's one way using awk:
awk '$3 != r { ++i } { $2 = sprintf ("%02d", i) } { r = $3 }1' OFS="\t" file
I've set the OFS to a tab-char, but you can choose what you like. Results:
1 01 AAA
1 01 AAA
1 01 AAA
1 01 AAA
1 01 AAA
1 02 BBB
1 02 BBB
1 02 BBB
1 02 BBB
1 03 CCC
1 03 CCC
1 03 CCC
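The same approach spelled out with comments, as a sketch. Note that it keys on a change in column 3 rather than on the CC marker, so two adjacent groups that happened to share the same third column would get the same number. The function name `renumber_by_group` and the variable `prev` are made up for this example.

```shell
# Hypothetical wrapper (name chosen for illustration only).
# Starts a new number whenever column 3 differs from the previous row.
renumber_by_group() {
  awk -v OFS='\t' '
    $3 != prev { n++ }                   # third column changed: next group
    { $2 = sprintf("%02d", n)            # rewrite column 2; the row is rebuilt with OFS
      prev = $3
      print }
  ' "$@"
}

printf '%s\n' '1 CC AAA' '1 Na AAA' '1 CC BBB' | renumber_by_group
```

Swap the `OFS` value for a space or any other separator if tabs are not wanted.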
Upvotes: 1
Reputation: 77065
One way of doing it in awk:
awk '$3!=a&&NF{a=$3;x=sprintf("%02d",++x);print $1,x,$3;next}$3==a&&NF{print $1,x,$3;next }1' inputFile
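The one-liner is dense, so here is a longer sketch of what it appears to do: blank lines (where `NF` is zero) are passed through unchanged, and non-blank lines are printed as column 1, the group number, and column 3. The function name `renumber_skip_blank` and the variable names are invented for this example, and like the one-liner it would drop any columns beyond the third.

```shell
# Hypothetical wrapper (name chosen for illustration only).
renumber_skip_blank() {
  awk '
    !NF { print; next }                    # blank line: print as-is, do not count it
    $3 != prev { prev = $3; n++ }          # third column changed: next group number
    { printf "%s %02d %s\n", $1, n, $3 }   # emit column 1, padded counter, column 3
  ' "$@"
}

printf '%s\n' '1 CC AAA' '' '1 Na AAA' '1 CC BBB' | renumber_skip_blank
```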
Upvotes: 2