Summarize variables beside

Question

I am looking for a solution to my problem. So far I can only solve it by rearranging the columns manually.

Example code:

    library(dplyr)

    set.seed(1)
    Data <- data.frame(
      W = sample(1:10),
      X = sample(1:10),
      Y = sample(c("yes", "no"), 10, replace = TRUE),
      Z = sample(c("cat", "dog"), 10, replace = TRUE)
    )

    # Mean and median of every numeric column, per group of Z.
    # Note: funs() is deprecated as of dplyr 0.8.0.
    summarized <- Data %>%
      group_by(Z) %>%
      summarise_if(is.numeric, funs(mean, median), na.rm = TRUE)

    print(Data)

I want the output to look like the example below: every function applied to the first column, then every function applied to the second column, and so on. My code does it the other way round.

Of course I could rearrange the columns by hand, but that is not what data science is about. I have hundreds of columns and want to apply multiple functions.

This is what I want:

    summarized <- summarized[, c(1, 2, 4, 3, 5)]  # best solution yet

Is there an argument I am missing? I bet there is an easy solution, or another function that does the job. Thanks in advance, guys!

akrun · Accepted Answer

One option would be to post-process with the appropriate select helpers:

library(dplyr)
summarized %>% 
    select(Z, starts_with('W'), everything())
# A tibble: 2 x 5
#  Z     W_mean W_median X_mean X_median
#              
#1 cat     5.25      5.5   3.75      3.5
#2 dog     5.67      5.5   6.67      7

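If there are hundreds of columns, one approach is to strip the function suffix from each column name (everything from the underscore onward) and order the columns by the variable prefix that remains: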

library(stringr)
summarized %>%
    select(Z, order(str_remove(names(.), "_.*")))
# A tibble: 2 x 5
#  Z     W_mean W_median X_mean X_median
#              
#1 cat     5.25      5.5   3.75      3.5
#2 dog     5.67      5.5   6.67      7
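
As a further sketch, not part of the original answer: if recomputing the summary is an option, the reordering step may be avoidable altogether. This assumes dplyr 1.0.0 or later, where across() applies every function to one column before moving on to the next, so the columns should already come out grouped by variable. The object name summarized_across is made up for illustration.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, which introduced across()

set.seed(1)
Data <- data.frame(
  W = sample(1:10),
  X = sample(1:10),
  Y = sample(c("yes", "no"), 10, replace = TRUE),
  Z = sample(c("cat", "dog"), 10, replace = TRUE)
)

# across() iterates column by column, applying each function in the list,
# and names the results "{column}_{function}" by default.
summarized_across <- Data %>%
  group_by(Z) %>%
  summarise(
    across(
      where(is.numeric),
      list(
        mean   = ~ mean(.x, na.rm = TRUE),
        median = ~ median(.x, na.rm = TRUE)
      )
    )
  )

names(summarized_across)
# "Z" "W_mean" "W_median" "X_mean" "X_median"
```

This also sidesteps funs(), which is deprecated as of dplyr 0.8.0. If the summary has already been computed the old way, the select() approaches above still apply.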
