Reputation: 51
I have a dataset that gives me the highway fuel economy, among other variables, for 4- and 6-cylinder cars. I tried using the group_by function, but it is not working (code below).
I have installed the dplyr package and called it, but it throws this error:
Error in group_by(., Cylinders) : could not find function "group_by"
Cars_filtered %>%
  group_by(Cylinders) %>%
  summarise(Min = min(Economy_highway, na.rm = TRUE),
            Q1 = quantile(Economy_highway, probs = .25, na.rm = TRUE),
            Median = median(Economy_highway, na.rm = TRUE),
            Q3 = quantile(Economy_highway, probs = .75, na.rm = TRUE),
            Max = max(Economy_highway, na.rm = TRUE),
            Mean = mean(Economy_highway, na.rm = TRUE),
            SD = sd(Economy_highway, na.rm = TRUE),
            n = n(),
            Missing = sum(is.na(price)))
I want to see the descriptive summary stats for highway fuel economy, separately for the 4- and 6-cylinder cars.
Is there some other way to go about it?
Upvotes: 0
Views: 277
Reputation: 2239
For this, it would be sufficient to use tapply.
Using the mtcars data set, let's say you are interested in the summary stats of MPG grouped by gear; you can use:
tapply(mtcars$mpg,mtcars$gear, summary)
If you only want to retrieve the summary stats for gears "3" and "4", you can subset the result by group name:
tapply(mtcars$mpg,mtcars$gear, summary)[c("3", "4")]
In case you want to add e.g. the standard deviation or sample size to the default summary output, you could define your own summary function:
smmry <- function(x) c(summary(x), sd = sd(x), n = length(x))
tapply(mtcars$mpg,mtcars$gear, smmry)
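To get closer to the table the question asks for, the same idea can be extended so that each group becomes one row. The following is only a sketch under assumptions: it uses mtcars restricted to 4- and 6-cylinder cars as a stand-in for Cars_filtered, with mpg in place of Economy_highway and cyl in place of Cylinders. The names cars, smmry_full and stats are made up for illustration.

```r
# Stand-in for the asker's data (assumption): mtcars, 4- and 6-cylinder cars only
cars <- subset(mtcars, cyl %in% c(4, 6))

# Hypothetical summary function mirroring the statistics listed in the question
smmry_full <- function(x) {
  c(Min     = min(x, na.rm = TRUE),
    Q1      = unname(quantile(x, probs = 0.25, na.rm = TRUE)),
    Median  = median(x, na.rm = TRUE),
    Q3      = unname(quantile(x, probs = 0.75, na.rm = TRUE)),
    Max     = max(x, na.rm = TRUE),
    Mean    = mean(x, na.rm = TRUE),
    SD      = sd(x, na.rm = TRUE),
    n       = length(x),        # group size, including missing values
    Missing = sum(is.na(x)))    # number of missing values in the group
}

# tapply returns one named numeric vector per group;
# rbind stacks them into a matrix with one row per cylinder count
stats <- do.call(rbind, tapply(cars$mpg, cars$cyl, smmry_full))
print(round(stats, 2))
```

This uses base R only, so it does not depend on dplyr being attached in the session.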
Upvotes: 1