Reputation: 970
I have a data frame with 3 numeric variables. I need to calculate summary statistics for them, such as mean, median, standard deviation, and kurtosis, and arrange the results in a new data frame: the first column should contain the variable names, the second the means, the third the medians, and so on. How can this be achieved? I am familiar with the dplyr package, so suggestions using it are welcome.
Upvotes: 1
Views: 1200
Reputation: 389175
You can use summarise with across:
library(dplyr)
library(tidyr)
mtcars %>%
  select(1:3) %>%
  summarise(across(where(is.numeric), list(mean = mean, std = sd, med = median)))
# mpg_mean mpg_std mpg_med cyl_mean cyl_std cyl_med disp_mean disp_std disp_med
#1 20.09062 6.026948 19.2 6.1875 1.785922 6 230.7219 123.9387 196.3
In versions of dplyr older than 1.0.0, where across is not available, you can use summarise_if:
mtcars %>%
  select(1:3) %>%
  summarise_if(is.numeric, list(mean = mean, std = sd, med = median))
You can add pivot_longer to the pipeline above to get the data in the required format:
mtcars %>%
  select(1:3) %>%
  summarise(across(where(is.numeric), list(mean = mean, std = sd, med = median))) %>%
  pivot_longer(cols = everything(),
               names_to = c('col', '.value'),
               names_sep = '_')
# A tibble: 3 x 4
#  col    mean    std   med
#  <chr> <dbl>  <dbl> <dbl>
#1 mpg    20.1   6.03  19.2
#2 cyl    6.19   1.79   6
#3 disp  231.  124.   196.
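One caveat: names_sep = '_' only works when the column names themselves contain no underscore, which happens to be true for mtcars. A minimal sketch of one way around this, assuming a made-up data frame df with a column called engine_size (both names are hypothetical), is to use names_pattern with a greedy first group so the split happens at the last underscore:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: the names and values are assumptions for illustration only
df <- data.frame(engine_size = c(1.2, 1.6, 2.0),
                 weight = c(900, 1100, 1300))

res <- df %>%
  summarise(across(where(is.numeric), list(mean = mean, std = sd, med = median))) %>%
  # "(.*)_(.*)": the first group is greedy, so "engine_size_mean" is split
  # into "engine_size" and "mean" rather than at the first underscore
  pivot_longer(cols = everything(),
               names_to = c('col', '.value'),
               names_pattern = '(.*)_(.*)')

res
```

This relies on the statistic names (mean, std, med) not containing an underscore themselves.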
Or you can pivot first and then do the calculation:
mtcars %>%
  select(1:3) %>%
  pivot_longer(cols = everything()) %>%
  group_by(name) %>%
  summarise(mean = mean(value), std = sd(value), med = median(value))
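The question also asks for kurtosis, which base R does not provide. As a rough sketch, the pivot-first approach can take a hand-written helper; kurt below is a hypothetical function, not part of dplyr, and it computes the moment-based (non-excess) kurtosis m4 / m2^2, which is about 3 for normally distributed data. Packages such as moments offer their own kurtosis function if you prefer not to define one.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: moment-based (non-excess) kurtosis, m4 / m2^2
kurt <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

res <- mtcars %>%
  select(1:3) %>%
  pivot_longer(cols = everything()) %>%
  group_by(name) %>%
  summarise(mean = mean(value),
            std = sd(value),
            med = median(value),
            kurtosis = kurt(value))

res
```

The same helper can also be added to the list passed to across, for example list(mean = mean, std = sd, med = median, kurtosis = kurt).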
Upvotes: 3