Reputation: 5
I am trying to get the mean, sd, min, max, and range of mpg, price, weight, and repair record, grouped by the two levels (domestic and foreign) of a factor variable called foreign. I've come across many examples that show how to get one statistic, such as the mean, for multiple variables, or multiple statistics for one variable grouped by a factor. However, I haven't found anything that helps me build the table I've described above.
I've tried many things, and it appears that ddply might be what I should be using. I think it should be something like ddply(df, [column I want to use as factor level], mean = mean(value), ... but I am unsure of the syntax. Thanks for any help!
Upvotes: 0
Views: 1107
Reputation: 6264
I would favour a tidyverse approach, such as:
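library(tibble)
library(dplyr)

# Note: funs() was deprecated in dplyr 0.8.0 in favour of a named list of functions
mtcars %>%
  rownames_to_column() %>%   # move the car names into a "rowname" column
  as_tibble() %>%
  group_by(rowname) %>%
  summarise_all(
    funs(mean = mean, median = median, min = min, max = max, sd = sd)
  )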
# # A tibble: 32 x 56
#    rowname            mpg_mean cyl_mean disp_mean hp_mean drat_mean wt_mean qsec_mean
#    <chr>                 <dbl>    <dbl>     <dbl>   <dbl>     <dbl>   <dbl>     <dbl>
#  1 AMC Javelin            15.2        8     304.0     150      3.15   3.435     17.30
#  2 Cadillac Fleetwood     10.4        8     472.0     205      2.93   5.250     17.98
#  3 Camaro Z28             13.3        8     350.0     245      3.73   3.840     15.41
#  4 Chrysler Imperial      14.7        8     440.0     230      3.23   5.345     17.42
#  5 Datsun 710             22.8        4     108.0      93      3.85   2.320     18.61
#  6 Dodge Challenger       15.5        8     318.0     150      2.76   3.520     16.87
#  7 Duster 360             14.3        8     360.0     245      3.21   3.570     15.84
#  8 Ferrari Dino           19.7        6     145.0     175      3.62   2.770     15.50
#  9 Fiat 128               32.4        4      78.7      66      4.08   2.200     19.47
# 10 Fiat X1-9              27.3        4      79.0      66      4.08   1.935     18.90
...or using summarise_if with the is.numeric predicate:
library(dplyr)

starwars %>%
  group_by(homeworld) %>%
  summarise_if(
    is.numeric,
    funs(mean = mean, median = median, min = min, max = max, sd = sd)
  )
# # A tibble: 49 x 16
#    homeworld      height_mean mass_mean birth_year_mean height_median mass_median birth_year_median height_min
#    <chr>                <dbl>     <dbl>           <dbl>         <dbl>       <dbl>             <dbl>      <dbl>
#  1 Alderaan          176.3333        NA              NA           188          NA                NA        150
#  2 Aleen Minor        79.0000      15.0              NA            79        15.0                NA         79
#  3 Bespin            175.0000      79.0              37           175        79.0                37        175
#  4 Bestine IV        180.0000     110.0              NA           180       110.0                NA        180
#  5 Cato Neimoidia    191.0000      90.0              NA           191        90.0                NA        191
#  6 Cerea             198.0000      82.0              92           198        82.0                92        198
#  7 Champala          196.0000        NA              NA           196          NA                NA        196
#  8 Chandrila         150.0000        NA              48           150          NA                48        150
#  9 Concord Dawn      183.0000      79.0              66           183        79.0                66        183
# 10 Corellia          175.0000      78.5              25           175        78.5                25        170
...you can always pass extra arguments to the functions if necessary, such as na.rm, by writing the call out inside funs(), like this: mean(., na.rm = TRUE)
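For what it's worth, here is a minimal sketch of how the table asked for in the question (mean, sd, min, max, and range per group) might be built in one call, with na.rm = TRUE applied to every statistic. It assumes dplyr 1.0.0 or later, where across() supersedes the summarise_all/summarise_if and funs() family. The asker's data is not shown, so mtcars stands in for it: am plays the role of foreign, and mpg, hp, and wt stand in for the columns of interest. Those substitutions are assumptions, not the original data.

```r
library(dplyr)

# mtcars is a stand-in for the asker's data (assumption);
# `am` (0/1) plays the role of the two-level `foreign` variable.
stats <- mtcars %>%
  group_by(am) %>%
  summarise(
    across(
      c(mpg, hp, wt),  # stand-ins for mpg, price, weight, repair record
      list(
        mean  = ~ mean(.x, na.rm = TRUE),
        sd    = ~ sd(.x, na.rm = TRUE),
        min   = ~ min(.x, na.rm = TRUE),
        max   = ~ max(.x, na.rm = TRUE),
        # range reported as a single number: max minus min
        range = ~ max(.x, na.rm = TRUE) - min(.x, na.rm = TRUE)
      )
    )
  )

stats
```

The result has one row per group and one column per variable/statistic pair, named with the default variable_statistic pattern (mpg_mean, mpg_sd, and so on). Swapping in the real data frame, grouping column, and variable names should be the only change needed.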
Upvotes: 1