Alex

Reputation: 163

How to automatically assign boxplot chart limits from a given dataset

I am using this data set to construct the following inflation chart. In this chart I set the Y-axis limits manually:

coord_cartesian(ylim=c(-20,80))

Now I'd like to set these limits in code rather than by hand, so that for any new dataset the chart's limits neatly capture the boxes and their whiskers while ignoring the outliers, which are shown as grey dots on the chart. Thanks!

P.S. The data set imf.inflation:

      country_iso_code year  value
X2003              AFG 2003 35.663
X2004              AFG 2004 16.358
X2005              AFG 2005 10.569
X2006              AFG 2006  6.785
X2007              AFG 2007  8.681
X2008              AFG 2008 26.419

My code:

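library(dplyr)
library(ggplot2)

r <- .005
# keep values between the r and 1 - r quantiles within each year
imf.inflation_trcd <- imf.inflation %>% 
   group_by(year) %>% 
   filter(value >= quantile(value, r), value <= quantile(value, 1 - r))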

p1 <- ggplot() + 
   geom_boxplot(data = imf.inflation, aes(x = year, y = value, group = year), 
       outlier.shape = 1, outlier.color = "grey") +
   geom_smooth(data = imf.inflation_trcd, aes(x=year, y=value), 
       method = "loess", se=TRUE, color = "orange") +
   coord_cartesian(ylim=c(-20,80)) +
   labs(x = "", y = "Average Yearly Inflation (% year-on-year)", 
       title = "Distribution of Inflation*",
       subtitle = "Over 1980-2020 and across IMF member countries", 
       caption = paste0("* Smoothing line is an estimation using LOESS method and based upon truncated dataset (",
                   (1-r)*100, "% percentile). Source: IMF.")) +
   theme(axis.title.y = element_text(size = 9),
      legend.title = element_blank(), 
      legend.background=element_blank(), legend.position = "right", 
      plot.caption = element_text(hjust = 0, size = 8))

Upvotes: 1

Views: 115

Answers (1)

Ben Bolker

Reputation: 226557

Strategy: (1) Run boxplot() (from base R) with an appropriate formula and save the result. (2) Extract the first and last (fifth) rows of the $stats element of the result, which hold the bottom and top of the whiskers for each group, then take the minimum of the first row and the maximum of the fifth (see the details of the return value in ?boxplot). (3) Pass these values to your coord_cartesian() call.

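library(ggplot2)
dd <- read.csv("Inflation.csv")
# plot = FALSE returns the box statistics without drawing a base R plot
bb <- boxplot(value ~ year, data = dd, plot = FALSE)
# rows 1 and 5 of $stats are the lower and upper whisker ends for each year
br <- c(min(bb$stats[1, ]), max(bb$stats[5, ]))
ggplot(dd, aes(year, value, group = year)) +
    geom_boxplot() +
    coord_cartesian(ylim = br)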
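
If the limits are needed for more than one dataset, the same idea can be wrapped in a small function. The following is only a sketch: whisker_limits() and its pad argument are names made up for illustration, and the toy data frame stands in for imf.inflation. The padding is there because ggplot2 computes its hinges slightly differently from base boxplot(), so the whisker ends drawn by geom_boxplot() may not coincide exactly with the base R values, especially for small groups.

```r
# Hypothetical helper: y-axis limits spanning all whiskers, widened by `pad`.
whisker_limits <- function(value, group, pad = 0.05) {
  # plot = FALSE returns the box statistics without drawing anything
  st <- boxplot(split(value, group), plot = FALSE)$stats
  # row 1 holds the lower whisker ends, row 5 the upper whisker ends
  lims <- c(min(st[1, ], na.rm = TRUE), max(st[5, ], na.rm = TRUE))
  # widen both ends by a fraction of the whisker range
  lims + c(-1, 1) * pad * diff(lims)
}

# Made-up data: six years of 30 values each, with one extreme value per year
set.seed(1)
toy <- data.frame(year = rep(2003:2008, each = 30),
                  value = rnorm(180, mean = 8, sd = 5))
toy$value[seq(1, 180, by = 30)] <- 200

br <- whisker_limits(toy$value, toy$year)
print(br)
```

The resulting br can be passed to coord_cartesian(ylim = br) exactly as above; the extreme values of 200 lie outside the whiskers and so do not stretch the axis.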

Upvotes: 1
