Reputation: 1221
My question is a variation of the question asked here. I have a data frame with duplicate (repeating) values in Column2, as follows:
df <- read.table(text='Column1 Column2
1 A
2 B
3 C
4 B
5 B
6 A
7 C
8 D ', header=TRUE)
The duplicate values do not follow any particular order. I want to rename the duplicated values so that they can be told apart; any naming scheme is fine. Values that are unique (such as 'D' above) should remain unchanged. For example, the transformed column could look like this:
Column1 Column2
1 A1
2 B2
3 C1
4 B3
5 B4
6 A2
7 C2
8 D
Or like this:
Column1 Column2
1 Ax
2 Bx
3 Cx
4 By
5 Bz
6 Ay
7 Cy
8 D
where x, y and z are any digits or literals (even A.x or A_x are OK).
I tried the following. It does rename the duplicated values, but it replaces the unique values with numbers:
n <- transform(df, Column.new = ifelse(duplicated(Column2) | duplicated(Column2, fromLast = TRUE), paste(Column2, seq_along(Column2), sep = ""), Column2))
The result is:
Column1 Column2 Column.new
1 1 A A1
2 2 B B2
3 3 C C3
4 4 B B4
5 5 B B5
6 6 A A6
7 7 C C7
8 8 D 4
The value 'D' (last row) should have stayed as it is instead of being replaced by '4' in 'Column.new'.
I shall be grateful for a solution.
Upvotes: 2
Views: 2672
Reputation: 24945
Using dplyr:
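library(dplyr)
df %>%
  group_by(Column2) %>%
  # number rows within each group; a group with a single row keeps its value
  mutate(new2 = if (n() > 1) paste0(Column2, row_number()) else as.character(Column2)) %>%
  ungroup()  # drop the grouping so later steps operate on the whole data frame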
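If you would rather not load a package, here is a rough base R sketch of the same idea. The stray '4' in the question's output is most likely what `ifelse()` returns when Column2 is a factor (the `read.table` default before R 4.0): it yields the integer level code rather than the label. The sketch assumes Column2 is read as character, and the helper names `idx` and `dup` are mine, not from the question.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps Column2 as character
df <- read.table(text = 'Column1 Column2
1 A
2 B
3 C
4 B
5 B
6 A
7 C
8 D', header = TRUE, stringsAsFactors = FALSE)

# Running counter within each distinct value of Column2 (1, 2, ... per group)
idx <- ave(seq_along(df$Column2), df$Column2, FUN = seq_along)

# TRUE for every value that occurs more than once
dup <- duplicated(df$Column2) | duplicated(df$Column2, fromLast = TRUE)

# Suffix only the repeated values; unique values are left untouched
df$Column.new <- ifelse(dup, paste0(df$Column2, idx), df$Column2)
df
```

If Column2 is already a factor in your data, wrapping it in `as.character()` before the `ifelse()` call should avoid the level-code problem.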
Upvotes: 3