Tiernan

Reputation: 858

ggplot2: mixed formats in axis labels

I'm building a faceted line plot and would like to modify the axis labels. Specifically, I would like to eliminate visual redundancy by labeling one end of each axis in full and using a shorthand format for all the remaining labels.

For the x-axis, I would like the first label to show the full four-digit year (%Y) and every other label to show only the two-digit year (%y) preceded by an apostrophe. For the y-axis, I would like the topmost label to be followed by a '%' symbol and all other labels to show just the number.
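To make the two format codes concrete, here is a small base R illustration (the date is just an example value):

```r
d <- as.Date("2006-01-01")

full_year  <- format(d, "%Y")   # four-digit year
short_year <- format(d, "'%y")  # apostrophe plus two-digit year
```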

Here's the inspiration behind this post (note the careful alignment of the y-axis labels):

[image: line plot from fivethirtyeight.com]

My plot is faceted, although I don't think that should matter for this question:

library(tidyverse)  # tribble(), mutate(), gather(), ggplot()
library(lubridate)  # floor_date()

df <- 
        tribble(
        ~YEAR, ~FPL_100PCT, ~RENT,     ~GEOG, ~ALL,
        2015,       .270,  .223, 'Seattle',   .152,
        2014,       .212,  .225, 'Seattle',   .148,
        2013,       .181,  .217, 'Seattle',   .145,
        2012,       .166,  .199, 'Seattle',   .126,
        2011,       .236,  .238, 'Seattle',   .145,
        2010,       .241,  .249, 'Seattle',   .156,
        2009,       .246,  .247, 'Seattle',   .141,
        2008,       .187,  .216, 'Seattle',   .139,
        2007,       .226,  .232, 'Seattle',   .142,
        2006,       .233,  .249, 'Seattle',   .155,
        2015,       .200,  .210,      'KC',   .122,
        2014,       .186,  .207,      'KC',   .118,
        2013,       .201,  .215,      'KC',   .124,
        2012,       .189,  .209,      'KC',   .116,
        2011,       .208,  .230,      'KC',   .119,
        2010,       .207,  .233,      'KC',   .126,
        2009,       .206,  .244,      'KC',   .121,
        2008,       .198,  .226,      'KC',   .116,
        2007,       .210,  .229,      'KC',   .120,
        2006,       .226,  .232,      'KC',   .126
        ) %>%
        mutate(YEAR = as.Date(as.character(YEAR), format = "%Y"),
               YEAR = floor_date(YEAR, unit = 'year')) %>%
        gather(TBL, ESTIMATE, FPL_100PCT, RENT, ALL)

gg <- ggplot(data = df, aes(x = YEAR, y = ESTIMATE, color = TBL))
gg <- gg + geom_line()
gg <- gg + scale_y_continuous(breaks = seq(0,.5,.1),labels = scales::percent(seq(0,.5,.1)),limits = c(0,.5))
gg <- gg + scale_x_date(date_breaks =  '1 year', date_labels = '%y')
gg <- gg + facet_grid(. ~GEOG)
gg <- gg + theme_minimal()
gg <- gg + theme(axis.title = element_blank(),
                 legend.title = element_blank(),
                 panel.grid.minor.x = element_blank())
gg <- gg + labs(title = 'Mobility')

[image: my plot so far]

Thanks!

Upvotes: 1

Views: 611

Answers (2)

Tiernan

Reputation: 858

Building on @Jake Kaupp's answer, the mixed labels can be expressed as functions like so:

library(magrittr)
library(stringr)

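# Assumes `breaks` is sorted in ascending order, so the maximum comes last.
label_pct <- function(breaks){
        b <- breaks
        b_max <- max(b) %>% scales::percent()
        # Parenthesise the product: %>% binds tighter than *, so without the
        # parentheses only the 100 is piped. round() avoids truncating values
        # such as 0.29 * 100 down to 28.
        b_others <- (b[(!(b %in% max(b)))] * 100) %>% round()
        b_final <- c(b_others,b_max)
        return(b_final)
}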

label_yr <- function(breaks){
        b <- breaks
        b_min <- min(b)
        b_others <- b[(!(b %in% b_min))] %>% str_sub(3,4) %>% paste0("'",.)
        b_final <- c(b_min,b_others)
        return(b_final)
}

Now they can be dropped into the code that Jake provided, which keeps YEAR as a plain number rather than a date, and reused in other plots:

gg <- ggplot(data = df, aes(x = YEAR, y = ESTIMATE, color = TBL))
gg <- gg + geom_line()
gg <- gg + scale_y_continuous(breaks = seq(0,.5,.1),labels = label_pct(seq(0,.5,.1)),limits = c(0,.5))
gg <- gg + scale_x_continuous(breaks = seq(2007,2015), labels = label_yr(seq(2007,2015)))
gg <- gg + facet_grid(. ~GEOG)
gg <- gg + theme_minimal()
gg <- gg + theme(axis.title = element_blank(),
                 legend.title = element_blank(),
                 panel.grid.minor.x = element_blank())
gg <- gg + labs(title = 'Mobility')

The only outstanding issue is that the digits of the shorter y-axis labels don't line up with the digits of the top label (the one carrying the '%' sign), but I'm going to call this "good enough" for now.
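Since the `labels` argument of ggplot2's continuous scales also accepts a function of the breaks, the same idea can be written so that the scale computes the labels itself. What follows is a minimal sketch rather than the code from either answer: `label_pct_top` and `label_yr_first` are hypothetical names, and it assumes numeric breaks (not dates). It does not require the breaks to be sorted, and it leaves `NA` breaks as `NA`.

```r
# Hypothetical helpers: each takes the vector of breaks and returns
# a character vector of labels of the same length and order.

label_pct_top <- function(breaks) {
        top <- max(breaks, na.rm = TRUE)
        pct <- round(breaks * 100)
        # only the largest break gets the percent sign
        ifelse(breaks == top, paste0(pct, "%"), as.character(pct))
}

label_yr_first <- function(breaks) {
        first <- min(breaks, na.rm = TRUE)
        short <- sprintf("'%02d", as.integer(breaks) %% 100L)
        # only the smallest break keeps the full four-digit year
        ifelse(breaks == first, as.character(breaks), short)
}

# Possible usage (passing the function itself, not its result):
# gg + scale_y_continuous(breaks = seq(0, .5, .1), labels = label_pct_top,
#                         limits = c(0, .5)) +
#      scale_x_continuous(breaks = seq(2007, 2015), labels = label_yr_first)
```

Passing the function rather than a precomputed vector means the break sequence only has to be written once per scale.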

Upvotes: 1

Jake Kaupp

Reputation: 8072

You can simply work out the needed breaks and labels ahead of time. I removed the mutate() step that converts the YEAR column to dates, as I find it easier to work with continuous rather than date scales.

gg <- ggplot(data = df, aes(x = YEAR, y = ESTIMATE, color = TBL))
gg <- gg + geom_line()
gg <- gg + scale_y_continuous(breaks = seq(0,.5,.1),labels = c(0,10,20,30,40,"50%"),limits = c(0,.5))
gg <- gg + scale_x_continuous(breaks = seq(2007,2015), labels = c(2007, "08","09","10","11","12","13","14","15"))
gg <- gg + facet_grid(. ~GEOG)
gg <- gg + theme_minimal()
gg <- gg + theme(axis.title = element_blank(),
                 legend.title = element_blank(),
                 panel.grid.minor.x = element_blank())
gg <- gg + labs(title = 'Mobility')


You could easily extend this into a function that would construct the labels from the data and construct the plot.
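One way that extension might look for the x-axis, as a rough sketch: `year_axis` is a hypothetical helper, and it assumes the year column is numeric with four-digit years.

```r
# Hypothetical helper: derive breaks and labels from a numeric year column.
year_axis <- function(years) {
        brk <- sort(unique(years))
        lab <- substr(as.character(brk), 3, 4)  # two-digit year
        lab[1] <- as.character(brk[1])          # first label keeps the full year
        list(breaks = brk, labels = lab)
}

ax <- year_axis(c(2015, 2014, 2006, 2007, 2006))

# Possible usage with the plot above:
# ax <- year_axis(df$YEAR)
# gg + scale_x_continuous(breaks = ax$breaks, labels = ax$labels)
```

Because the breaks come from the data, the fully written year is always the earliest one present.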

Upvotes: 1
