Alexander

Reputation: 4635

Mutate a column so that values matching defined strings are replaced

I would like to mutate one of the columns of a data frame depending on whether certain conditions are matched. I have looked around but couldn't find a neat solution so far. A related question: use-mutate-to-create-new-column-label-with-conditions

Here is the simple data frame that I used:

gr = rep(seq(1,2),each=3)
clas=c("A_1","A_2","A_3","A_4","A_5","A_6")

df <- data.frame(gr,clas)

> df
  gr clas
1  1  A_1
2  1  A_2
3  1  A_3
4  2  A_4
5  2  A_5
6  2  A_6

I would like to change A_4, A_5 and A_6 to B_1, B_2 and B_3.

So I tried:

match <- paste('_',seq(4,6),sep='')
 df%>%
  mutate(clas=ifelse(clas %in% match,paste('B',seq(1,3),sep='_'),clas))

       gr clas
    1  1    1
    2  1    2
    3  1    3
    4  2    4
    5  2    5
    6  2    6

and a second try with grepl:

df%>%
mutate(clas=ifelse(clas==grepl(paste(match,collapse='|'),clas),paste('B',seq(1,3),sep='_'),clas))

   gr clas
1  1    1
2  1    2
3  1    3
4  2    4
5  2    5
6  2    6

In both cases the A's are gone as well :) The expected result is:

   gr clas
1  1  A_1
2  1  A_2
3  1  A_3
4  2  B_1
5  2  B_2
6  2  B_3

Thanks!

EDIT: I realized that this is easier to do when the clas column contains LETTERS. But what if we have data like this, with no gr column? How do we do that?

    clas
1   CD_1
2  X.2_2
3  K$2_3
4 12k3_4
5   .A_5
6   xy_6

The expected output is:

    clas
1   CD_1
2  X.2_2
3  K$2_3
4 12kB_4
5   .B_5
6   xB_6

I guess I was looking for a solution like that.

Upvotes: 0

Views: 764

Answers (3)

Onyambu

Reputation: 79228

I will use base R, specifically to solve this problem.

First make sure the column is of type character. I called the table above B:

  B[,1]=as.character(B[,1])
  B[4:6,1]=sapply(B$clas[4:6],function(i) {substr(i,nchar(i)-2,nchar(i)-2)<-"B";i})
  B
     clas
 1   CD_1
 2  X.2_2
 3  K$2_3
 4 12kB_4
 5   .B_5
 6   xB_6
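
As an alternative to the positional substr approach above, the same replacement can be sketched with a regular expression. This is only a sketch under assumptions: the rows to change are picked by their numeric suffix (4 to 6), and the character to replace is the one directly before the last underscore. The names suffix and idx are introduced here for illustration only.

```r
# Rebuild the example table from the question as a character column
clas <- c("CD_1", "X.2_2", "K$2_3", "12k3_4", ".A_5", "xy_6")
B <- data.frame(clas, stringsAsFactors = FALSE)

# Assumption: rows are selected by the number after the last underscore
suffix <- as.integer(sub(".*_", "", B$clas))
idx <- suffix %in% 4:6

# Replace the single character in front of the last underscore with "B"
B$clas[idx] <- sub(".(_[^_]*)$", "B\\1", B$clas[idx])
B
```

Unlike the substr version, this does not depend on the suffix being exactly one character long.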

Upvotes: 1

BENY

Reputation: 323236

Here is a dplyr solution:

df%>%group_by(gr)%>%dplyr::mutate(clas=paste0(toupper(letters[gr]),"_",row_number()))
#you can change toupper(letters[gr]) to LETTERS[gr]

# A tibble: 6 x 2
# Groups:   gr [2]
     gr  clas
  <int> <chr>
1     1   A_1
2     1   A_2
3     1   A_3
4     2   B_1
5     2   B_2
6     2   B_3
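
As a side note on why the ifelse attempts in the question printed 1 to 6: this is most likely because clas was a factor (the data.frame default before R 4.0) and ifelse drops the factor attributes, leaving the integer codes. The sketch below reproduces that under the assumption stringsAsFactors = TRUE and shows one possible fix in base R; the names codes, old, hit and new are illustrative only.

```r
gr <- rep(1:2, each = 3)
clas <- c("A_1", "A_2", "A_3", "A_4", "A_5", "A_6")
# Assumption: mimic the pre-4.0 default so that clas becomes a factor
df <- data.frame(gr, clas, stringsAsFactors = TRUE)

# "_4" etc. never equal a whole value such as "A_4", so the test is all FALSE
# and every element comes from the factor, which ifelse reduces to its codes
codes <- ifelse(df$clas %in% c("_4", "_5", "_6"), "B", df$clas)

# One possible fix: work on character values and match the suffix with grepl
old <- as.character(df$clas)
hit <- grepl("_[4-6]$", old)
new <- old
new[hit] <- paste("B", seq_len(sum(hit)), sep = "_")
df$clas <- new
df
```

The same idea should carry over to mutate by wrapping clas in as.character before calling ifelse.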

Upvotes: 1

lmo

Reputation: 38500

Here's a base R solution that relies on df$gr:

paste(LETTERS[df$gr], ave(df$gr, df$gr, FUN=seq_along), sep="_")
[1] "A_1" "A_2" "A_3" "B_1" "B_2" "B_3"

LETTERS holds the Latin capital letters, so LETTERS[1] is "A" and LETTERS[df$gr] gives "A" or "B" for each row. ave applies seq_along within each group of df$gr, which produces a running count that resets at the start of every group. paste then joins the letter and the count with "_" as the separator.
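
To make the two building blocks visible, here is a small self-contained sketch that computes them separately before pasting. The intermediate names running and result are introduced here for illustration only.

```r
gr <- rep(1:2, each = 3)

# Running count within each group: seq_along restarts for every value of gr
running <- ave(gr, gr, FUN = seq_along)

# Letter chosen by group number, joined to the running count
result <- paste(LETTERS[gr], running, sep = "_")
result
```

Because the letter is looked up by the value of gr, this relies on gr holding small positive integers.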

Upvotes: 1
