Reputation: 31

Replace column by comparing with the other column

I have a file like this

1      CC     AAA   

1      Na    AAA

1      Na    AAA

1      Na    AAA

1      Na    AAA

1      CC    BBB

1     Na    BBB

1     Na    BBB

1     xa    BBB

1     CC    CCC

1     Na    CCC

1     da    CCC

I would like to replace the contents of column 2 with "01" for AAA, "02" for BBB, and so on for the entire file. The output should look like this:

1     01    AAA 

1     01    AAA

1     01    AAA

1     01    AAA

1     01    AAA

1     02    BBB

1     02    BBB

1     02    BBB

1     02    BBB

1     03    CCC

1     03    CCC

1     03    CCC

I don't have any idea how to make this work. Please help if possible. A new group starts at every CC; that is, the change from AAA to BBB can be tracked only by the CC in the 2nd column.

Upvotes: 0

Answers (3)

William Pursell

Reputation: 212198

Seems like you want:

awk '$2=="CC" { a+=1 } {$2=sprintf("%02d",a)} 1' input
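For readers who want to see the moving parts, here is a spelled-out sketch of the same idea: bump a counter whenever column 2 is "CC", then overwrite column 2 with the zero-padded counter. The sample data, the temporary file, and the variable names (`input`, `out`, `n`) are assumptions made for illustration, and the blank-line guard is an addition that is not in the one-liner above.

```shell
# Hypothetical sample file, for illustration only.
input=$(mktemp)
printf '%s\n' '1 CC AAA' '1 Na AAA' '1 CC BBB' '1 xa BBB' '1 CC CCC' > "$input"

out=$(awk '
  NF == 0 { print; next }        # leave blank lines untouched (added guard)
  $2 == "CC" { n++ }             # a "CC" marker starts the next group
  { $2 = sprintf("%02d", n) }    # overwrite column 2 with the zero-padded group number
  { print }                      # assigning to $2 rebuilds the record with OFS (one space by default)
' "$input")

printf '%s\n' "$out"
rm -f "$input"
```

Because assigning to a field rebuilds the record, the original column spacing is collapsed to a single space unless OFS is set to something else.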

Upvotes: 0

Steve

Reputation: 54392

Here's one way using awk:

awk '$3 != r { ++i } { $2 = sprintf ("%02d", i) } { r = $3 }1' OFS="\t" file

I've set the OFS to a tab-char, but you can choose what you like. Results:

1   01  AAA
1   01  AAA
1   01  AAA
1   01  AAA
1   01  AAA
1   02  BBB
1   02  BBB
1   02  BBB
1   02  BBB
1   03  CCC
1   03  CCC
1   03  CCC

Upvotes: 1

jaypal singh

Reputation: 77065

One way of doing it in awk:

awk '$3!=a&&NF{a=$3;x=sprintf("%02d",++x);print $1,x,$3;next}$3==a&&NF{print $1,x,$3;next }1' inputFile
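A more readable sketch of the same approach, assuming groups are detected by a change in column 3 and that blank lines should pass through unchanged. The sample data and the names `input`, `out`, `prev`, and `n` are made up for illustration.

```shell
# Hypothetical sample file with blank lines between some records.
input=$(mktemp)
printf '%s\n' '1 CC AAA' '' '1 Na AAA' '' '1 CC BBB' '1 xa BBB' > "$input"

out=$(awk '
  NF == 0 { print; next }                 # pass blank lines through unchanged
  $3 != prev { n++; prev = $3 }           # column 3 changed, so a new group begins
  { print $1, sprintf("%02d", n), $3 }    # emit the three columns with the group number
' "$input")

printf '%s\n' "$out"
rm -f "$input"
```

Note that keying on column 3 merges two adjacent groups that happen to share the same name, whereas keying on "CC" in column 2 would keep them apart.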

Upvotes: 2
