Aggregating variables for cases

Question

Hello fellow Overflowers,

The goal is to run several data-manipulation steps on a fairly big dataset. In a first step, groups of variables that represent different instances of the same piece of information should be aggregated (averaged) for each case. Each group always consists of 5 variables.

Right now, the dataset looks like this:

      a1 a2 a3 a4 a5 b1 b2 b3 b4 b5 ... xyz5 A B C 
case1 3  4  7  9  6  21 13              4    1 7 8 
case2 9  12 8        17 25 31           7    2 7 6
case3 5  3  11 10    32 19 13           5    1 6 8
...

It should look something like this:

      mean-a  mean-b ...mean-xyz A B C 
case1 5,8     17        6,4      1 7 8 
case2 9,6     24,3      8,3      2 7 6
case3 7,25    21,3      7        1 6 8
...

I'm not sure whether writing a function or using the across() function from the dplyr package is the right way to do it, since there are about 2000 variables that need to be aggregated.

Any help will be greatly appreciated.

Thanks a lot in advance!

Anoushiravan R · Accepted Answer

You can also use the following solution:

library(dplyr)
library(stringr)
library(purrr)

# First we extract the unique letter values of column names
letters <- unique(str_remove(names(df), "\\d"))
# [1] "a" "b"
  

# Then we iterate over each unique value and select the columns whose names contain that letter

letters %>%
  map(~ df %>% 
        select(contains(.x)) %>% 
        rowwise() %>%
        mutate("mean_{.x}" := mean(c_across(contains(.x)), na.rm = TRUE))) %>%
  bind_cols() %>%
  relocate(contains("mean"), .after = last_col())


# A tibble: 3 x 12
# Rowwise: 
     a1    a2    a3    a4    a5    b1    b2    b3    b4    b5 mean_a mean_b
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>
1     3     4     7     9     6    21    13     7     8     4    5.8   10.6
2     9    12     8    17    25    31     4     2     2     7   14.2    9.2
3     5     3    11    10    32    19    13     2     2     5   12.2    8.2

Data

df <- tribble(
  ~a1, ~a2, ~a3, ~a4, ~a5, ~b1, ~b2, ~b3, ~b4, ~b5,
 3, 4, 7, 9, 6, 21, 13, 7, 8, 4, 
 9, 12, 8, 17, 25, 31, 4, 2, 2, 7,
 5, 3, 11, 10, 32, 19, 13, 2, 2, 5
)
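
Since the question mentions about 2000 variables, a vectorized variant may scale better than rowwise(). The following is only a sketch in base R and not part of the answer above. It assumes that every column to aggregate is named as a lowercase prefix followed by a single digit from 1 to 5, and that all other columns (such as A, B, C in the question) should be carried over unchanged. The column A and the names is_group_col, prefix, means and result are made up for illustration.

```r
# Example data from the answer, plus a hypothetical column "A"
# (an assumption, added to show that non-group columns are kept).
df <- data.frame(
  a1 = c(3, 9, 5), a2 = c(4, 12, 3), a3 = c(7, 8, 11),
  a4 = c(9, 17, 10), a5 = c(6, 25, 32),
  b1 = c(21, 31, 19), b2 = c(13, 4, 13), b3 = c(7, 2, 2),
  b4 = c(8, 2, 2), b5 = c(4, 7, 5),
  A = c(1, 2, 1)
)

# Columns to aggregate: lowercase prefix followed by one digit 1-5
# (assumed naming scheme; uppercase columns like "A" do not match).
is_group_col <- grepl("^[a-z]+[1-5]$", names(df))

# Prefix of each such column, e.g. "a1" -> "a".
prefix <- sub("[1-5]$", "", names(df)[is_group_col])

# Split the columns by prefix and take the row means within each group.
# na.rm = TRUE ignores missing values, as in the answer above.
means <- sapply(
  split.default(df[is_group_col], prefix),
  rowMeans, na.rm = TRUE
)
colnames(means) <- paste0("mean_", colnames(means))

# Combine the group means with the untouched columns.
result <- cbind(as.data.frame(means), df[!is_group_col])
print(result)
```

On the example data this gives mean_a = 5.8, 14.2, 12.2 and mean_b = 10.6, 9.2, 8.2, the same means as in the output above. The pattern is matched case-sensitively on purpose, so that columns like A, B and C are not mistaken for groups.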
