I have a nested dataframe like this:
df <- mpg %>% group_by(manufacturer) %>% nest()
And I have to apply this function to clean some values:
func <- function(drv, cty) {
  # indices of front-wheel-drive rows with city mileage below 20
  values <- which(drv == "f" & cty < 20)
  cty[values] <- 0
  cty
}
But only for these manufacturers:
manufacturers_vector <- c("audi", "chevrolet", "jeep")
Is there any way to apply the function only when the manufacturer column in my df matches a value in manufacturers_vector?
We filter the data and then use map to loop over the list column 'data':
library(dplyr)
library(purrr)
library(tidyr)    # nest(), used to build df
library(ggplot2)  # provides the mpg dataset
df2 <- df %>%
  filter(manufacturer %in% manufacturers_vector) %>%
  mutate(out = map(data, ~ func(.x$drv, .x$cty)))
-output
df2
# A tibble: 3 x 3
# Groups: manufacturer [3]
# manufacturer data out
# <chr> <list> <list>
#1 audi <tibble [18 × 10]> <dbl [18]>
#2 chevrolet <tibble [19 × 10]> <dbl [19]>
#3 jeep <tibble [8 × 10]> <dbl [8]>
-out column output
df2$out
#[[1]]
# [1] 0 21 20 21 0 0 0 18 16 20 19 15 17 17 15 15 17 16
#[[2]]
# [1] 14 11 14 13 12 16 15 16 15 15 14 11 11 14 0 22 0 0 0
#[[3]]
#[1] 17 15 15 14 9 14 13 11
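To sanity-check what func does on its own, here is a small base R sketch. The vectors toy_drv and toy_cty are made-up values for illustration, not taken from mpg:

```r
# func as defined in the question
func <- function(drv, cty) {
  values <- which(drv == "f" & cty < 20)
  cty[values] <- 0
  cty
}

# hypothetical toy input: only the first element is "f" with cty < 20
toy_drv <- c("f", "f", "4", "r")
toy_cty <- c(18L, 21L, 15L, 19L)

res <- func(toy_drv, toy_cty)
res
# [1]  0 21 15 19
```

Only front-wheel-drive entries below 20 are zeroed, which is why jeep (no "f" rows) comes back unchanged in the output above.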
If we want to keep the original data as is, without filtering, then use map_if:
df %>%
  mutate(out = map_if(data, .f = ~ func(.x$drv, .x$cty),
                      .p = manufacturer %in% manufacturers_vector,
                      .else = ~ NA_real_))
-output
# A tibble: 15 x 3
# Groups: manufacturer [15]
# manufacturer data out
# <chr> <list> <list>
# 1 audi <tibble [18 × 10]> <dbl [18]>
# 2 chevrolet <tibble [19 × 10]> <dbl [19]>
# 3 dodge <tibble [37 × 10]> <dbl [1]>
# 4 ford <tibble [25 × 10]> <dbl [1]>
# 5 honda <tibble [9 × 10]> <dbl [1]>
# 6 hyundai <tibble [14 × 10]> <dbl [1]>
# 7 jeep <tibble [8 × 10]> <dbl [8]>
# 8 land rover <tibble [4 × 10]> <dbl [1]>
# 9 lincoln <tibble [3 × 10]> <dbl [1]>
#10 mercury <tibble [4 × 10]> <dbl [1]>
#11 nissan <tibble [13 × 10]> <dbl [1]>
#12 pontiac <tibble [5 × 10]> <dbl [1]>
#13 subaru <tibble [14 × 10]> <dbl [1]>
#14 toyota <tibble [34 × 10]> <dbl [1]>
#15 volkswagen <tibble [27 × 10]> <dbl [1]>
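If the goal is to clean cty inside the nested tibbles rather than store the result in a separate column, one possible variation is to let map_if rewrite the matching elements of 'data' and leave the rest untouched. This is a sketch, not the only way to do it; the name df_clean and the ungroup() step are my own choices, and the ungroup() simply makes the predicate a plain logical vector with one entry per row:

```r
library(dplyr)
library(purrr)
library(tidyr)    # nest()
library(ggplot2)  # provides the mpg dataset

# func and manufacturers_vector as defined in the question
func <- function(drv, cty) {
  values <- which(drv == "f" & cty < 20)
  cty[values] <- 0
  cty
}
manufacturers_vector <- c("audi", "chevrolet", "jeep")

df <- mpg %>% group_by(manufacturer) %>% nest()

# df_clean is a hypothetical name: same shape as df, but cty is cleaned
# inside the nested tibbles of the selected manufacturers only
df_clean <- df %>%
  ungroup() %>%
  mutate(data = map_if(data,
                       .p = manufacturer %in% manufacturers_vector,
                       .f = ~ mutate(.x, cty = func(drv, cty))))
```

Without .else, map_if returns the non-matching elements as they are, so the other manufacturers keep their original tibbles. Note that cty becomes a double in the cleaned groups because 0 is a double; assigning 0L inside func should keep it an integer.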