I would like to mutate one of the columns of a data frame depending on whether certain conditions are matched. I have looked around but couldn't find a neat solution for this so far (see use-mutate-to-create-new-column-label-with-conditions).
Here is the simple data frame that I used:
gr = rep(seq(1,2),each=3)
clas=c("A_1","A_2","A_3","A_4","A_5","A_6")
df <- data.frame(gr,clas)
> df
gr clas
1 1 A_1
2 1 A_2
3 1 A_3
4 2 A_4
5 2 A_5
6 2 A_6
I would like to change A_4, A_5 and A_6 to B_1, B_2 and B_3.
So I tried:
match <- paste('_',seq(4,6),sep='')
df%>%
mutate(clas=ifelse(clas %in% match,paste('B',seq(1,3),sep='_'),clas))
gr clas
1 1 1
2 1 2
3 1 3
4 2 4
5 2 5
6 2 6
and a second try with grepl:
df%>%
mutate(clas=ifelse(clas==grepl(paste(match,collapse='|'),clas),paste('B',seq(1,3),sep='_'),clas))
gr clas
1 1 1
2 1 2
3 1 3
4 2 4
5 2 5
6 2 6
In both cases the A's are gone as well :) The expected result is:
gr clas
1 1 A_1
2 1 A_2
3 1 A_3
4 2 B_1
5 2 B_2
6 2 B_3
Thanks!
EDIT: I realized that this is easier to do when the clas column follows a regular letter pattern that LETTERS can rebuild. But if we have data like this, and no gr column, how do we do that?
clas
1 CD_1
2 X.2_2
3 K$2_3
4 12k3_4
5 .A_5
6 xy_6
The expected output is:
clas
1 CD_1
2 X.2_2
3 K$2_3
4 12kB_4
5 .B_5
6 xB_6
I guess I was looking for a solution like that.
Upvotes: 0
Views: 764
Reputation: 79228
I will use base R, specifically to solve the problem in the EDIT.
First ensure your column is in character form. I called the table above B. Then, for rows 4 to 6, replace the character just before the underscore with "B":
B[,1]=as.character(B[,1])
B[4:6,1]=sapply(B$clas[4:6],function(i) {substr(i,nchar(i)-2,nchar(i)-2)<-"B";i})
B
clas
1 CD_1
2 X.2_2
3 K$2_3
4 12kB_4
5 .B_5
6 xB_6
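A vectorized alternative, offered as a sketch rather than a definitive rule: assuming the rows to change are exactly those whose suffix is _4, _5 or _6 (an assumption read off the expected output, not something the question states), sub() can replace the single character before the underscore in one call. The data frame B is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example table from the question's EDIT (named B, as above)
B <- data.frame(clas = c("CD_1", "X.2_2", "K$2_3", "12k3_4", ".A_5", "xy_6"),
                stringsAsFactors = FALSE)

# "."          matches the one character before the underscore
# "(_[4-6])$"  captures a trailing _4, _5 or _6 (assumed selection rule)
# "B\\1"       writes "B" followed by the captured suffix
B$clas <- sub(".(_[4-6])$", "B\\1", B$clas)
B
```

Rows that do not end in _4, _5 or _6 do not match the pattern, so sub() returns them unchanged.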
Upvotes: 1
Reputation: 323236
Here is a dplyr solution:
df%>%group_by(gr)%>%dplyr::mutate(clas=paste0(toupper(letters[gr]),"_",row_number()))
#you can change toupper(letters[gr]) to LETTERS[gr]
# A tibble: 6 x 2
# Groups: gr [2]
gr clas
<int> <chr>
1 1 A_1
2 1 A_2
3 1 A_3
4 2 B_1
5 2 B_2
6 2 B_3
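As a side note on why the attempts in the question printed 1 to 6, here is a sketch in base R. It assumes an R version before 4.0, where data.frame() converted strings to factors by default (mimicked below with stringsAsFactors = TRUE). The names codes, clas_chr, hit and fixed are illustrative only.

```r
# Rebuild the question's data with clas stored as a factor
df <- data.frame(gr = rep(1:2, each = 3),
                 clas = paste0("A_", 1:6),
                 stringsAsFactors = TRUE)

# %in% compares whole strings, so "_4" never equals "A_4" and the test is
# FALSE everywhere; ifelse() then returns the factor's integer codes
codes <- ifelse(df$clas %in% c("_4", "_5", "_6"), "B", df$clas)

# Convert to character first and match on a pattern instead
clas_chr <- as.character(df$clas)
hit <- grepl("_[4-6]$", clas_chr)

# Overwrite only the matching rows with B_1, B_2, ...
fixed <- clas_chr
fixed[hit] <- paste("B", seq_len(sum(hit)), sep = "_")
fixed
```

Unlike the grouped solution above, this variant selects rows by the pattern in clas and does not use gr.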
Upvotes: 1
Reputation: 38500
Here's a base R solution that relies on df$gr:
paste(LETTERS[df$gr], ave(df$gr, df$gr, FUN=seq_along), sep="_")
[1] "A_1" "A_2" "A_3" "B_1" "B_2" "B_3"
LETTERS holds the Latin capital letters, so LETTERS[1] is "A" and indexing it with df$gr gives "A" for group 1 and "B" for group 2. seq_along builds a running count, and ave applies it within each group so the count restarts at 1 whenever gr changes. The letter and the count are then pasted together with "_" as the separator.
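The pieces of that one-liner can be inspected separately. This is only a breakdown of the same call; the intermediate names prefix and counter are introduced here for illustration.

```r
# Same grouping column as in the question
df <- data.frame(gr = rep(1:2, each = 3))

# Letter per row, chosen by group number
prefix <- LETTERS[df$gr]

# Running count that restarts within each value of gr
counter <- ave(df$gr, df$gr, FUN = seq_along)

# Combine letter and count with "_" between them
result <- paste(prefix, counter, sep = "_")
result
```

Because the letter comes from LETTERS[df$gr], this approach assumes gr holds integers between 1 and 26.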
Upvotes: 1