mountain_view

Reputation: 11

R: repeated calculation of means for a number of grouping variables

I have a data frame (df1) with several dummy variables (group_1 … group_i) and two main variables (main_variable_1 and main_variable_2).

From df1 I want to create a new data frame (df2) that contains the names of the dummy variables in the first column, plus one column for each of the two main variables holding its mean over the rows where group_i = 1.

I have tried iterating with for loops and map functions, but I am struggling to iterate over column names.

Does someone have a solution for this?

Upvotes: 1

Views: 21

Answers (1)

danlooo

Reputation: 10627

library(tidyverse)

df1 <- tribble(
  ~group_1, ~group_2, ~group_3, ~main_1, ~main_2,
  1, 0, 0, 5, 6,
  1, 0, 0, 50, 60,
  0, 1, 0, 7, 7
)

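df1 %>%
  # Stack the dummy columns: one row per (original row, group column) pair
  pivot_longer(starts_with("group_")) %>%
  # Keep only the pairs where the row belongs to that group
  filter(value == 1) %>%
  select(-value) %>%
  # Average both main variables within each group
  group_by(name) %>%
  summarise(
    main_1 = mean(main_1),
    main_2 = mean(main_2)
  )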
#> # A tibble: 2 × 3
#>   name    main_1 main_2
#>   <chr>    <dbl>  <dbl>
#> 1 group_1   27.5     33
#> 2 group_2    7        7

Created on 2022-04-25 by the reprex package (v2.0.1)
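
Since the question mentions trouble iterating over column names, here is a minimal sketch of a loop-style alternative. It is not part of the original answer: it assumes the dummy columns all start with "group_" and the main variables with "main_" (as in the toy data above), and the names group_cols and df2 are only illustrative. The key idea is to pass each column name as a string and look it up with the `.data[[col]]` pronoun.

```r
library(dplyr)

# Same toy data as in the answer above
df1 <- tibble::tribble(
  ~group_1, ~group_2, ~group_3, ~main_1, ~main_2,
  1, 0, 0, 5, 6,
  1, 0, 0, 50, 60,
  0, 1, 0, 7, 7
)

# Collect the dummy column names as plain strings (assumed "group_" prefix)
group_cols <- grep("^group_", names(df1), value = TRUE)

# For each column name, keep the rows where that dummy equals 1 and
# average every column whose name starts with "main_" (assumed prefix)
df2 <- bind_rows(lapply(group_cols, function(col) {
  df1 %>%
    filter(.data[[col]] == 1) %>%
    summarise(name = col, across(starts_with("main_"), mean))
}))

df2
```

Unlike the pivot_longer() approach, this version should keep a row for a group with no members (group_3 here), with NaN as its means, because summarise() on an ungrouped, empty data frame still returns one row. Whether that is desirable depends on what df2 is used for.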

Upvotes: 1
