I'm trying to remove a substring from the values of a column. My data looks something like this:
LBL Var1 Var2
name1 1 12
name1_A 1 13
name1_B 2 10
name2 1 11
name2_A 2 10
name2_B 3 9
I have already created a function that works on a single string, but when I use it in mutate on the data frame, every row shows the result for the first row instead of its own result. What am I doing wrong?
This is the function that I have created so far:
remExt <- function(x){
y <- str_split_fixed(x,"_",2)
return(y[1])
}
I have tried the function with a single string and it works perfectly:
string1 <- "Yes_No"
res <- remExt(string1)
print(res)
[1] "Yes"
I apply it in mutate with the following call:
df %>% mutate(newLBL = remExt(df$LBL))
And I get the following result:
LBL Var1 Var2 newLBL
name1 1 12 name1
name1_A 1 13 name1
name1_B 2 10 name1
name2 1 11 name1
name2_A 2 10 name1
name2_B 3 9 name1
My expected result is:
LBL Var1 Var2 newLBL
name1 1 12 name1
name1_A 1 13 name1
name1_B 2 10 name1
name2 1 11 name2
name2_A 2 10 name2
name2_B 3 9 name2
But I just can't seem to get it to work. Any ideas?
Reputation: 2688
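You're only taking the first element of y. str_split_fixed returns a character matrix with one row per input string, so y[1] is a single value that mutate then recycles across all rows. You want the first column, y[,1]: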
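library(dplyr)
library(stringr)

remExt <- function(x){
  # str_split_fixed returns a matrix with one row per element of x
  y <- str_split_fixed(x, "_", 2)
  # keep the first column: the part before the first "_"
  return(y[, 1])
}
df %>% mutate(newLBL = remExt(LBL))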
LBL Var1 Var2 newLBL
1 name1 1 12 name1
2 name1_A 1 13 name1
3 name1_B 2 10 name1
4 name2 1 11 name2
5 name2_A 2 10 name2
6 name2_B 3 9 name2
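To see the difference between the two kinds of indexing, here is a minimal sketch on a plain character vector. The vector lbl and the variable names are made up for illustration; only str_split_fixed comes from the code above.

```r
library(stringr)

# Hypothetical labels, mirroring the LBL column from the question
lbl <- c("name1", "name1_A", "name2_B")

# One row per input string, one column per piece (at most 2 pieces here)
y <- str_split_fixed(lbl, "_", 2)

first_element <- y[1]    # a single value: row 1, column 1
first_column  <- y[, 1]  # one value per input string

print(first_element)
print(first_column)
```

Because first_element has length 1, mutate recycles it to every row, which is why the original function appeared to return only the first row's result.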
Also note that inside mutate you don't need to extract the column with $; you can refer to LBL directly.
EDIT:
A simpler implementation would be to use str_remove or str_replace with the pattern "_.*", which matches the first underscore and everything after it. The following two calls are equivalent:
df %>% mutate(newLBL = str_replace(LBL, "_.*", ""))
df %>% mutate(newLBL = str_remove(LBL, "_.*"))
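As a quick check of how the regular expression behaves, the sketch below runs on a plain vector instead of the data frame. The vector x is hypothetical, and the base R sub call is an alternative I am adding for comparison, not something from the question. A pattern ending in a single "." removes only one character after the underscore, which is enough for suffixes like _A but not for longer ones; ending it in ".*" removes everything after the underscore.

```r
library(stringr)

# Hypothetical inputs: no suffix, a one-character suffix, a longer suffix
x <- c("name1", "name1_A", "Yes_No")

# "_." matches the underscore plus exactly one following character
one_char <- str_remove(x, "_.")

# "_.*" matches the underscore plus everything after it
to_end <- str_remove(x, "_.*")

# Base R equivalent of the second call, without stringr
base_r <- sub("_.*", "", x)

print(one_char)
print(to_end)
print(base_r)
```

All three functions are vectorised, so they can be used directly inside mutate without a wrapper function.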