Zombraz

Reputation: 373

Mutate applies only the first row's result to every row

I'm trying to remove a substring from the values of a column. My data looks something like this:

LBL       Var1      Var2
name1       1        12
name1_A     1        13
name1_B     2        10
name2       1        11
name2_A     2        10
name2_B     3        9

I have already written a function that works on a single string, but when I use it in mutate on the data frame, every row gets the result for the first row instead of its own. What am I doing wrong?

This is the function that I have created so far:

remExt <- function(x){
  y <- str_split_fixed(x,"_",2)
  return(y[1])
}

I have tried the function with a single string and it works perfectly:

string1 <- "Yes_No"

res <- remExt(string1)
print(res)

[1] "Yes"

I call mutate like this:

df %>% mutate(newLBL = remExt(df$LBL))

And I get the following result:

LBL       Var1      Var2   newLBL
name1       1        12    name1
name1_A     1        13    name1
name1_B     2        10    name1
name2       1        11    name1
name2_A     2        10    name1
name2_B     3        9     name1

My expected result is:

LBL       Var1      Var2   newLBL
name1       1        12    name1
name1_A     1        13    name1
name1_B     2        10    name1
name2       1        11    name2
name2_A     2        10    name2
name2_B     3        9     name2

But I just can't seem to get it to work. Any ideas?

Upvotes: 0

Views: 686

Answers (1)

astrofunkswag

Reputation: 2688

str_split_fixed returns a matrix with one row per input string, so y[1] is only its first element, which mutate then recycles down every row. You want the whole first column, y[,1]:

library(dplyr)
library(stringr)

remExt <- function(x){
  y <- str_split_fixed(x,"_",2)
  # first column: one value per element of x
  return(y[,1])
}

df %>% mutate(newLBL = remExt(LBL))
      LBL Var1 Var2 newLBL
1   name1    1   12  name1
2 name1_A    1   13  name1
3 name1_B    2   10  name1
4   name2    1   11  name2
5 name2_A    2   10  name2
6 name2_B    3    9  name2

Also note that inside mutate you don't need to extract the column with $. Refer to it by name (LBL rather than df$LBL).
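
To see the difference between the two kinds of indexing outside of mutate, here is a minimal sketch on a small made-up vector (the vector x and the variable names are only for illustration, they are not from the question's data):

```r
library(stringr)

# Hypothetical sample labels, similar in shape to the LBL column
x <- c("name1", "name1_A", "name2_B")

# str_split_fixed() returns a character matrix:
# one row per input string, one column per piece (2 here)
y <- str_split_fixed(x, "_", 2)

# Single-index subsetting treats the matrix as one long vector,
# so this is only the element in row 1, column 1
first_element <- y[1]

# An empty row index selects the whole first column,
# giving one value per input string
first_column <- y[, 1]

print(first_element)
print(first_column)
```

Here first_element has length 1 while first_column has length 3. A length-1 result is what mutate repeats for every row, which matches the output in the question.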

EDIT:

A simpler implementation would be to use str_remove or str_replace with a pattern that matches the underscore and everything after it. The following two are equivalent:

df %>% mutate(newLBL = str_replace(LBL,"_.*",""))

df %>% mutate(newLBL = str_remove(LBL,"_.*"))
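
One thing worth checking with this approach is how much of the suffix the pattern covers. The sample data only has single-letter suffixes, so the sketch below assumes a made-up label with a longer one (labels and the variable names are hypothetical) and compares a pattern that takes one character after the underscore with one that takes everything after it:

```r
library(stringr)

# Hypothetical labels; the last one has a multi-character suffix
labels <- c("name1", "name1_A", "name2_long")

# "_." matches the underscore plus exactly one following character
one_char <- str_remove(labels, "_.")

# "_.*" matches the underscore plus everything after it
whole_suffix <- str_remove(labels, "_.*")

print(one_char)
print(whole_suffix)
```

With single-letter suffixes both patterns give the same result. They differ only on the third label, where the one-character pattern leaves part of the suffix behind.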

Upvotes: 2
