Reputation: 187
I have a dataframe that's similar to what's below:
num <- c(1, 2, 3, 4)
name <- c("A", "B", "C", "A")
df <- cbind(num, name)
I'm looking to essentially turn this into:
num <- c(1, 2, 3, 4)
name <- c("A1", "B", "C", "A2")
df <- cbind(num, name)
How would I do this automatically, since my actual data is much larger?
Upvotes: 0
Views: 61
Reputation: 226871
It might be worth considering the built-in make.unique(), although it doesn't do exactly what the OP wants: it leaves the first occurrence of a duplicated value unlabeled, so that it can be run multiple times in succession. A little extra care is also required when name is a factor (the default for character columns in data.frame() before R 4.0.0), hence the as.character() call:
df <- data.frame(num = c(1, 2, 3, 4),
name = c("A", "B", "C", "A"))
df <- transform(df, name=factor(make.unique(
as.character(name),sep="")))
## num name
## 1 1 A
## 2 2 B
## 3 3 C
## 4 4 A1
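As a quick check of the "run multiple times" point, here is a minimal sketch. The vector x is only an illustrative stand-in for the name column; a second pass over names that are already unique should leave them unchanged.

```r
x <- c("A", "B", "C", "A")

# First pass: only the second "A" gets a suffix
once <- make.unique(x, sep = "")
once
## [1] "A"  "B"  "C"  "A1"

# Second pass: nothing is left to disambiguate, so the result is unchanged
twice <- make.unique(once, sep = "")
identical(once, twice)
## [1] TRUE
```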
Upvotes: 1
Reputation: 672
Puginablanket,
See below for two solutions, one using the plyr package and the other using base R's by and do.call functions.
eg <- data.frame(num = c(1, 2, 3, 4, 5),
name = c("A", "B", "C", "A", "B"),
stringsAsFactors = FALSE)
do.call(rbind, by(eg, eg$name, function(x) {
x$name2 <- paste0(x$name, 1:nrow(x))
x
}))
plyr::ddply(eg, "name", function(x) {
x$name2 <- paste0(x$name, 1:nrow(x))
x
})
Depending on your application, it might make sense to create a separate column which tracks this duplication (so that you're not using string parsing at a later step to pull it back apart).
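One way to keep that duplication counter in its own column is sketched below with base R's ave(). The column names occurrence and name2 are just illustrative. Unlike the by/do.call and ddply versions, which return rows grouped by name, this keeps the original row order, and like them it numbers every row, so a name that occurs once still gets a 1.

```r
eg <- data.frame(num = c(1, 2, 3, 4, 5),
                 name = c("A", "B", "C", "A", "B"),
                 stringsAsFactors = FALSE)

# Running count of each name, computed in the original row order
eg$occurrence <- ave(seq_along(eg$name), eg$name, FUN = seq_along)

# Build the combined label from the two columns only when it is needed
eg$name2 <- paste0(eg$name, eg$occurrence)
eg$name2
## [1] "A1" "B1" "C1" "A2" "B2"
```

With the counter stored separately, later steps can filter or sort on occurrence directly instead of parsing the suffix back out of the string.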
Upvotes: 1
Reputation: 35324
Here's a one-line solution, assuming you really do have a data.frame rather than a matrix (a matrix is what is returned by your cbind() command):
df <- data.frame(num = 1:4, name = c('A', 'B', 'C', 'A'))
transform(df, name = paste0(name, ave(c(name), name, FUN = function(x) if (length(x) > 1) seq_along(x) else '')))
## num name
## 1 1 A1
## 2 2 B
## 3 3 C
## 4 4 A2
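The matrix-versus-data.frame distinction can be checked directly. This is a small sketch using the vectors from the question; the name m is only for illustration.

```r
num <- c(1, 2, 3, 4)
name <- c("A", "B", "C", "A")

m <- cbind(num, name)        # mixing numeric and character coerces everything
is.matrix(m)                 # TRUE
typeof(m)                    # "character"

df <- data.frame(num, name)  # each column keeps its own type
is.numeric(df$num)           # TRUE
```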
Upvotes: 0
Reputation: 2413
I converted your matrix to a data frame:
df <- data.frame(num, name)
# Number each occurrence of a name, then collect the names that are duplicated
ext <- as.numeric(ave(as.character(df$name) , df$name,
FUN=function(x) cumsum(duplicated(x))+1))
nms <- df$name[ext > 1]
# Add the suffixed names to the data as a new column
df$newname <- ifelse( df$name %in% nms, paste0(df$name, ext), as.character(df$name))
Upvotes: 0