Danielle

Reputation: 795

Calculating quantiles for subsets of data

I have a data frame like so:

set.seed(2)
df <- data.frame(region = c(rep(1, 4), rep(2, 4)), scale = sample(-1:4, 8, replace = TRUE))

I want to group the data by region, calculate the quantiles of scale within each region, and put the 25% and 75% quantiles in separate columns. Done by hand, that would be:

quantile(df[1:4,2])[2] #region 1 25% quantile
quantile(df[1:4,2])[4] #region 1 75% quantile
quantile(df[5:8,2])[2] #region 2 25% quantile
quantile(df[5:8,2])[4] #region 2 75% quantile

The expected output would be:

output<- data.frame( region= c(1,2), Q1= c(0,2.75), Q3= c(2.25, 4))

I have tried:

out <- bos %>%
  group_by(region) %>%
  summarise(mean = mean(res_vec), sd = sd(res_vec),
            median = median(res_vec), mode = mode(res_vec),
            quantile1 = quantile(scale, probs = 0.25),
            quantile2 = quantile(scale, probs = 0.75))

and:

quantiles<-aggregate(x = bos, by = list(bos$scale), fun = quantiles)

Upvotes: 1

Views: 326

Answers (1)

Daniel Anderson

Reputation: 2424

Is this what you're looking for?

library(dplyr)

df %>% 
  group_by(region) %>% 
  summarize(q1 = quantile(scale, 0.25),
            q3 = quantile(scale, 0.75))

# A tibble: 2 x 3
  region    q1    q3
   <dbl> <dbl> <dbl>
1      1  0.00  2.25
2      2  2.75  4.00
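
If you would rather stay in base R, something along these lines should also work. It is only a sketch: the scale values are hard-coded here (an assumption, chosen to reproduce the quantiles shown above, since what sample() returns for a given seed depends on the R version), and the names q and out are just illustrative. Note that the argument to aggregate() is spelled FUN, the function is quantile, and the grouping variable should be region rather than scale.

```r
# Illustrative data: scale values are hard-coded (assumption) so the
# result does not depend on the R version's sample() implementation.
df <- data.frame(region = c(rep(1, 4), rep(2, 4)),
                 scale  = c(0, 3, 2, 0, 4, 4, -1, 4))

# One row per region; both quantiles come back as a two-column matrix
# stored inside the 'scale' column of the result.
q <- aggregate(scale ~ region, data = df,
               FUN = function(x) quantile(x, probs = c(0.25, 0.75)))

# Flatten that matrix column into ordinary Q1 / Q3 columns.
out <- data.frame(region = q$region,
                  Q1 = unname(q$scale[, 1]),
                  Q3 = unname(q$scale[, 2]))
print(out)
```

The extra data.frame() step is there because aggregate() keeps the two quantiles together as a matrix column, which is easy to trip over when you later try to access Q1 and Q3 by name.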

Upvotes: 2
