Reputation: 1179
I have been looking for, but have not found, a way to apply a single if condition to many columns in dplyr.
I have this code (it works):
library(dplyr)
library(magrittr)
data("PlantGrowth")
PlantGrowth %>% mutate (
a=if_else(group=="ctrl", weight*2, weight*100),
b=if_else(group=="ctrl", weight*1.5, weight/100),
c=if_else(group=="ctrl", weight*4, weight*100),
d=if_else(group=="ctrl", weight*5, weight/1000)
)
I would like to avoid repeating the condition, with something like this:
PlantGrowth %>% mutate_if_foo (
group=="ctrl",{
a=weight*2,
b=weight*1.5,
c=weight*4,
d=weight*5
}
)%>% mutate_if_foo (
group!="ctrl",{
a=weight*100,
b=weight/100,
c=weight*100,
d=weight/1000
}
)
I've found many answers about mutate_if, mutate_all, mutate_at and case_when, but they don't answer my question.
A dplyr / tidyverse solution, please.
Thanks in advance
EDIT
Following @Rohit_das's idea about functions, I tried:
mtcars %>% ( function(df) {
if (df$am==1){
df%>% mutate(
a=df$mpg*3,
b=df$cyl*10)
}else{
df%>% mutate(
a=df$disp*300,
d=df$cyl*1000)
}
})
but I get this warning message:
In if (df$am == 1) { :
the condition has length > 1
and only the first element will be used
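For context, the warning appears because `if` expects a single TRUE/FALSE, while `df$am == 1` holds one value per row. Below is a minimal sketch of a vectorized alternative, assuming the goal is only to write the condition once. The helper column name `is_manual` is made up for illustration, and the columns that one branch does not define are filled with NA, which is an assumption about the intended result.

```r
library(dplyr)

# `if` needs one TRUE/FALSE, but this comparison returns a value for every row
cond <- mtcars$am == 1
length(cond)

# Store the condition once in a helper column (hypothetical name `is_manual`),
# reuse it in vectorized if_else() calls, then drop the helper
result <- mtcars %>%
  mutate(
    is_manual = am == 1,
    a = if_else(is_manual, mpg * 3, disp * 300),
    b = if_else(is_manual, cyl * 10, NA_real_),   # NA where the branch sets no b
    d = if_else(is_manual, NA_real_, cyl * 1000)  # NA where the branch sets no d
  ) %>%
  select(-is_manual)
```

The condition is still evaluated row by row, but it is written only once.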
Upvotes: 3
Views: 132
Reputation: 1179
I think I've found an answer. I tested it on mtcars, but I haven't tested it on my real code yet.
Please comment if you think the concept is wrong.
The filter conditions have to be mutually exclusive, otherwise I will get duplicate rows.
library(dplyr)
library(magrittr)
library(tibble) # only if necessary to preserve rownames
mtcars %>% (function(df) {
  # Keep the row names and the original row order so both can be restored
  df <- df %>%
    tibble::rownames_to_column() %>%
    tibble::rowid_to_column()
  rbind(
    df %>%
      dplyr::filter(am == 1) %>%
      dplyr::mutate(a = mpg * 3, b = cyl * 10, d = NA),
    df %>%
      dplyr::filter(am != 1) %>%
      dplyr::mutate(a = disp * 3, d = cyl * 100, b = NA)
  )
}) %>% arrange(rowid)
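The point about mutually exclusive filters can be checked with a quick sketch. It assumes the same am == 1 / am != 1 split on mtcars; the second condition, mpg > 20, is only a made-up example of a filter that overlaps the first.

```r
library(dplyr)

manual    <- mtcars %>% filter(am == 1)
automatic <- mtcars %>% filter(am != 1)

# Exclusive and exhaustive conditions: every row lands in exactly one piece
# (note that filter() drops rows where the condition is NA; mtcars has none)
nrow(manual) + nrow(automatic) == nrow(mtcars)

# Overlapping conditions: rows matching both filters would appear twice
# after the pieces are bound together
overlap <- mtcars %>% filter(am == 1, mpg > 20)
nrow(overlap) > 0
```

If the first check is FALSE or the second is TRUE for your real conditions, the bound result will not have one row per original row.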
Upvotes: 0
Reputation: 1950
I think I found a neat solution with purrr. It takes a data frame of inputs and dynamically names the new columns a to d, using a new set of inputs for each column. The first column uses x = 2, y = 100 and z = "a", the second column uses the next row, and so on. The nice thing about functional programming like this is that it is very easy to scale up.
library(tidyverse)
iterate <- tibble(x = c(2, 1.5, 4, 5),
y = c(100, 1/100, 100, 1/1000),
z = c("a", "b", "c", "d"))
fun <- function(x, y, z) {
PlantGrowth %>%
mutate(!!z := if_else(group == "ctrl", weight * x, weight * y)) %>%
select(3)
}
PlantGrowth %>%
bind_cols(
pmap_dfc(iterate, fun)
) %>%
as_tibble
Which gives you the same df:
# A tibble: 30 x 6
weight group a b c d
<dbl> <fct> <dbl> <dbl> <dbl> <dbl>
1 4.17 ctrl 8.34 6.26 16.7 20.8
2 5.58 ctrl 11.2 8.37 22.3 27.9
3 5.18 ctrl 10.4 7.77 20.7 25.9
4 6.11 ctrl 12.2 9.17 24.4 30.6
5 4.5 ctrl 9 6.75 18 22.5
Upvotes: 0
Reputation: 2032
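Not sure I understand the issue here. If you just want to reduce the verbosity of the code, then just create a custom function:
# group and weight are passed in explicitly: a function defined outside
# mutate() cannot see the data frame's columns on its own
customif = function(group, weight, x, y) {
if_else(group=="ctrl", weight*x, weight*y)
}
Then you can call this function in your mutate as:
PlantGrowth %>% mutate (
a=customif(group, weight, 2, 100),
b=customif(group, weight, 1.5, 1/100),
c=customif(group, weight, 4, 100),
d=customif(group, weight, 5, 1/1000)
)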
Upvotes: 1