Alex Shpenev

Reputation: 35

Mapping pipes to multiple columns in tidyverse

I'm working with a table in which I need to count the rows that satisfy some criterion, and I ended up with multiple repetitions of the same pipe that differ only in the variable name.

Say I want to know, for each variable in mtcars, how many cars have a higher value than the Valiant (row 6). An example of the code with two variables is below:

library(tidyverse)

reference <- mtcars %>% 
  slice(6)

mpg <- mtcars  %>% 
  filter(mpg > reference$mpg) %>%
  count() %>% 
  pull()

cyl <- mtcars  %>% 
  filter(cyl > reference$cyl) %>%
  count() %>% 
  pull()

tibble(mpg, cyl)

Except, suppose I need to do this for around 100 variables, so there must be a better way than repeating the same block by hand.

How could the code above be rewritten more concisely (maybe using map() or anything else that works nicely with pipes), so that the result is a tibble with the counts for all the variables in mtcars?

I feel the solution should be very easy but I'm stuck. Thank you!

Upvotes: 2

Views: 135

Answers (3)

r.user.05apr

Reputation: 5456

Or:

library(tidyverse)

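# compare every column against its own 6th element (the Valiant)
map_dfc(mtcars, ~sum(.x[6] < .x))

# or, using the reference row from the question
reference <- slice(mtcars, 6)
map2_dfc(mtcars, reference, ~sum(.y < .x))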
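
A related sketch, in case a plain named vector of counts is enough: map_int() returns one integer per column, and tibble::as_tibble_row() can turn that into the one-row tibble the question asks for. The names counts and wide are illustrative only, and this assumes the reference car stays in row 6.

```r
library(purrr)
library(tibble)

# One integer per column: how many values exceed the value in row 6 (the Valiant)
counts <- map_int(mtcars, ~ sum(.x > .x[6]))

# Optional: reshape the named vector into a one-row tibble
wide <- as_tibble_row(counts)

wide
```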

Upvotes: 2

Darren Tsai

Reputation: 35554

You could use summarise + across to count observations greater than a certain value in each column.

library(dplyr)

mtcars %>%
  summarise(across(everything(), ~ sum(. > .[6])))

#   mpg cyl disp hp drat wt qsec vs am gear carb
# 1  18  14   15 22   30 11    1  0 13   17   25

Base R solutions:

# (1)
colSums(mtcars > mtcars[rep(6, nrow(mtcars)), ])

# (2)
colSums(sweep(as.matrix(mtcars), 2, mtcars[6, ], ">"))

# mpg  cyl disp   hp drat   wt qsec   vs   am gear carb
#  18   14   15   22   30   11    1    0   13   17   25
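
If relying on the row position feels fragile, the reference row could also be looked up by name. This is only a sketch: it assumes the row names of mtcars are intact, and ref and counts are illustrative names rather than part of the answer above. cur_column() gives the name of the column that across() is currently processing.

```r
library(dplyr)

# Reference row looked up by row name instead of position
# (assumption: row names have not been dropped or changed)
ref <- mtcars["Valiant", ]

# For each column, count the rows whose value exceeds the reference value
counts <- mtcars %>%
  summarise(across(everything(), ~ sum(.x > ref[[cur_column()]])))

counts
```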

Upvotes: 2

Elias

Reputation: 736

You can also do it in a loop, for example like this:

library(tidyverse)

reference <- mtcars %>% 
  slice(6)

# Empty list to save outcome
list_outcome <- list()

# Get the column names to loop over
loop_var <- colnames(reference)
for(i in loop_var){
  nr <- mtcars  %>% 
    filter(mtcars[, i] > reference[, i]) %>%
    count() %>% 
    pull()
  # Save every iteration in the loop as the ith element of the list
  list_outcome[[i]] <- data.frame(Variable = i, Value = nr)
}

# combine all the data frames in the list to one final data frame
df_result <- do.call(rbind, list_outcome)
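
The loop produces a long table with one row per variable. As a rough base R sketch of the same idea without growing a list, vapply() can compute all counts at once, and the result can then be shaped either long (like df_result) or wide (like tibble(mpg, cyl) in the question). The names counts, df_long and df_wide are illustrative.

```r
# Reference row (row 6 of mtcars, the Valiant), as in the loop above
reference <- mtcars[6, ]

# One count per column: how many rows exceed the reference value
counts <- vapply(
  names(mtcars),
  function(col) sum(mtcars[[col]] > reference[[col]]),
  integer(1)
)

# Long format, matching the Variable / Value layout of the loop
df_long <- data.frame(Variable = names(counts), Value = unname(counts))

# Wide one-row format, one column per variable
df_wide <- as.data.frame(as.list(counts))
```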

Upvotes: 1
