axel_p

Reputation: 51

How do I get summary descriptive statistics for a quantitative variable grouped by a qualitative variable?

I have a dataset that gives the highway fuel economy, among other variables, for 4- and 6-cylinder cars. I tried using the group_by function, but it is not working (code below).

I have installed and loaded the dplyr package, but the code throws an error:

Error in group_by(., Cylinders) : could not find function "group_by"

Cars_filtered %>%
  group_by(Cylinders) %>%
  summarise(Min = min(Economy_highway, na.rm = TRUE),
            Q1 = quantile(Economy_highway, probs = .25, na.rm = TRUE),
            Median = median(Economy_highway, na.rm = TRUE),
            Q3 = quantile(Economy_highway, probs = .75, na.rm = TRUE),
            Max = max(Economy_highway, na.rm = TRUE),
            Mean = mean(Economy_highway, na.rm = TRUE),
            SD = sd(Economy_highway, na.rm = TRUE),
            n = n(),
            Missing = sum(is.na(price)))

I want to see the summary descriptive statistics for highway fuel economy, separately for the 4-cylinder and 6-cylinder cars.

Is there some other way to go about it?

Upvotes: 0

Views: 277

Answers (1)

Daniel

Reputation: 2239

For this, it would be sufficient to use tapply from base R.

Using the mtcars data set as an example: if you are interested in the summary statistics of mpg grouped by gear, you can use:

tapply(mtcars$mpg, mtcars$gear, summary)

If you only want to retrieve the summary statistics for gears "3" and "4", you can subset the result by name:

tapply(mtcars$mpg, mtcars$gear, summary)[c("3", "4")]

If you want to add, for example, the standard deviation or the sample size to the default summary output, you could define your own summary function:

smmry <- function(x) c(summary(x), sd = sd(x), n = length(x))

tapply(mtcars$mpg, mtcars$gear, smmry)
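
If you would rather have one row per group with the same columns as in the question (Min, Q1, Median, Q3, Max, Mean, SD, n, Missing), something along these lines might work. This is only a sketch on mtcars: smmry_na and stats_by_gear are names made up for illustration, and I am assuming that in your case you would pass Cars_filtered$Economy_highway and Cars_filtered$Cylinders instead of the mtcars columns.

```r
# Hypothetical helper (not part of base R): summary statistics that tolerate NAs
smmry_na <- function(x) {
  c(
    Min     = min(x, na.rm = TRUE),
    Q1      = unname(quantile(x, probs = 0.25, na.rm = TRUE)),
    Median  = median(x, na.rm = TRUE),
    Q3      = unname(quantile(x, probs = 0.75, na.rm = TRUE)),
    Max     = max(x, na.rm = TRUE),
    Mean    = mean(x, na.rm = TRUE),
    SD      = sd(x, na.rm = TRUE),
    n       = length(x),          # group size, including missing values
    Missing = sum(is.na(x))       # number of missing values in the group
  )
}

# tapply returns one named vector per group; rbind stacks them into a matrix
stats_by_gear <- do.call(rbind, tapply(mtcars$mpg, mtcars$gear, smmry_na))
print(stats_by_gear)
```

The row names of stats_by_gear are the group labels, so a single group can be pulled out with, for example, stats_by_gear["4", ].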

Upvotes: 1
