Reputation: 3427
I have a data frame like this:
df <- data.frame(groupx=c("k1","k1","k2","k4","k3","k2"),x1=rep(1,6),x2=rep(2,6),
x3=rep(3,6),y1=rep(4,6),x12=rep(5,6))
For each row whose groupx value duplicates an earlier one, I want to modify several related columns by prefixing the value with "a".
I'm currently doing it like this, and I'm quite sure it's not the most efficient method:
df[duplicated(df$groupx),"x1"]=paste0("a",df[duplicated(df$groupx),"x1"])
df[duplicated(df$groupx),"x2"]=paste0("a",df[duplicated(df$groupx),"x2"])
df[duplicated(df$groupx),"x3"]=paste0("a",df[duplicated(df$groupx),"x3"])
The desired output has "a" in front of the values in the corresponding rows of columns x1, x2 and x3, but not in the other columns.
Any recommendations? Thanks.
Edit: Sorry for the misunderstanding. The groupx values are not related to the column names; the earlier example was a coincidence.
Upvotes: 0
Views: 56
Reputation: 887223
xCols <- intersect(df$groupx, colnames(df))
df[, xCols] <- lapply(df[, xCols], function(x) {
  indx <- duplicated(df$groupx)
  x[indx] <- paste0("a", x[indx])
  x
})
df
#  groupx x1 x2 x3 y1 x12
#1     x1  1  2  3  4   5
#2     x1 a1 a2 a3  4   5
#3     x2  1  2  3  4   5
#4     x4  1  2  3  4   5
#5     x3  1  2  3  4   5
#6     x2 a1 a2 a3  4   5
Or, going through a matrix:
m1 <- as.matrix(df[,xCols])
indx <- duplicated(df[,1])[row(df[,xCols])]
m1[indx] <- paste0("a", m1[indx])
df[,xCols] <- m1
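Note that both versions above pick the columns with intersect(df$groupx, colnames(df)), which only works when the group labels happen to match column names. With the data as it stands after the question's edit (k1, k2, ...), that intersect is empty. A minimal sketch of the same lapply idea with the columns named by hand; the names cols and dup are just illustrative, not from the original posts:

```r
df <- data.frame(groupx = c("k1", "k1", "k2", "k4", "k3", "k2"),
                 x1 = rep(1, 6), x2 = rep(2, 6), x3 = rep(3, 6),
                 y1 = rep(4, 6), x12 = rep(5, 6))

# Assumption: the columns to modify are listed explicitly.
cols <- c("x1", "x2", "x3")

# TRUE for every row whose groupx value has already appeared above it.
dup <- duplicated(df$groupx)

df[cols] <- lapply(df[cols], function(x) {
  x <- as.character(x)           # the column has to become character anyway
  x[dup] <- paste0("a", x[dup])  # prefix only the duplicated rows
  x
})
df
```

Here rows 2 and 6 are the duplicates, so only those rows of x1, x2 and x3 get the prefix, while y1 and x12 stay numeric.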
Upvotes: 2
Reputation: 92292
I'm assuming that you want to update only the columns that appear in df$groupx, so here's a possible solution:
indx <- grep(paste0("^", paste(unique(df$groupx), collapse = "$|^"), "$"), names(df))
df[duplicated(df$groupx), indx] <- paste0("a", as.matrix(df[duplicated(df$groupx), indx]))
df
#   groupx x1 x2 x3 y1 x12
# 1     x1  1  2  3  4   5
# 2     x1 a1 a2 a3  4   5
# 3     x2  1  2  3  4   5
# 4     x4  1  2  3  4   5
# 5     x3  1  2  3  4   5
# 6     x2 a1 a2 a3  4   5
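To see why the pattern is anchored with ^ and $, here is a small sketch using the group labels and column names from the output above. The variable names groups, cols and pattern are only for illustration. Without the anchors, "x1" would also match the column x12:

```r
# Group labels and column names as they appear in the output above.
groups <- c("x1", "x1", "x2", "x4", "x3", "x2")
cols   <- c("groupx", "x1", "x2", "x3", "y1", "x12")

# Anchored alternation: each label must match a whole column name.
pattern <- paste0("^", paste(unique(groups), collapse = "$|^"), "$")
pattern
# "^x1$|^x2$|^x4$|^x3$"

anchored <- grep(pattern, cols)
anchored
# 2 3 4

# Unanchored version for comparison: "x1" also hits "x12".
loose <- grep(paste(unique(groups), collapse = "|"), cols)
loose
# 2 3 4 6

# An exact-match alternative that avoids regular expressions entirely.
exact <- which(cols %in% unique(groups))
exact
# 2 3 4
```

The %in% form is probably the safer choice if the labels could ever contain regex metacharacters such as "." or "(".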
Upvotes: 2