Elio Campitelli

Reputation: 1476

Dividing long time series in multiple panels with ggplot2

I have a rather long timeseries that I want to plot in ggplot, but it's sufficiently long that even using the full width of the page it's barely readable.

one time series in one panel

What I want to do instead is to divide the plot into 2 (or more, in the general case) panels stacked one on top of the other.

I could do it manually, but that is cumbersome and makes it hard to get the axes to share the same scale. Ideally, I would like to have something like this:

ggplot(data, aes(time, y)) + 
  geom_line() +
  facet_time(time, n = 2)

And then get something like this:

one time series in multiple panels

(This plot was made using facet_wrap(~(year(as.Date(time)) > 2000), ncol = 1, scales = "free_x"), which messes up the x-axis scale, works only for 2 panels, and doesn't work well with geom_smooth().)

Ideally, it would also handle summary statistics correctly, for example by using the correct data for geom_smooth(). Plain faceting wouldn't do that, because at the beginning of each facet the smoother would not use the data from the end of the previous one.

Is there a way to do this?

Thank you!

Upvotes: 4

Views: 1721

Answers (2)

eipi10

Reputation: 93891

Below I create two separate plots, one for the period 1982-1999 and one for 1999-2016, and then lay them out using grid.arrange from the gridExtra package. The horizontal axes are scaled equivalently in both plots.

I also generate the regression lines outside of ggplot using the loess function so that they can be added using geom_line (you can of course use any regression function here, such as lm, gam, splines, etc.). With this approach the regression is run on the entire time series, which ensures continuity of the regression line across the two panels even though we break the time series into two halves for plotting.

library(ggplot2)    # For the plots themselves
library(dplyr)      # For the chaining (%>%) operator
library(purrr)      # For the map function
library(gridExtra)  # For the grid.arrange function

Function to extract a legend from a ggplot. We'll use this to get one legend across two separate plots.

# http://stackoverflow.com/questions/12539348/ggplot-separate-legend-and-plot
g_legend<-function(a.gplot){
  tmp <- ggplot_gtable(ggplot_build(a.gplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  legend
}

# Fake data
set.seed(255)
dat = data.frame(time=rep(seq(1982,2016,length.out=500),2),
                 value= c(arima.sim(list(ar=c(0.4, 0.05, 0.5)), n=500), 
                          arima.sim(list(ar=c(0.3, -0.3, 0.6)), n=500)),
                 group=rep(c("A","B"), each=500))

Generate smoother lines using loess: We want a separate regression line for each level of group, so we use group_by with the chaining operator from dplyr:

dat = dat %>% group_by(group) %>%
        mutate(smooth = predict(loess(value ~ time, span=0.1)))

Create a list of two plots, one for each time period: We use map to create separate plots for each time period and return a list with the two plot objects as elements (you can also use base lapply for this instead of map):

pl = map(list(c(1982,1999), c(1999,2016)), 
         ~ ggplot(dat %>% filter(time >= .x[1], time <= .x[2]), 
                  aes(colour=group)) +
             geom_line(aes(time, value), alpha=0.5) +
             geom_line(aes(time, smooth), size=1) + 
             scale_x_continuous(breaks=1982:2016, expand=c(0.01,0)) +
             scale_y_continuous(limits=range(dat$value)) +
             theme_bw() +
             labs(x="", y="", colour="") +
             theme(strip.background=element_blank(),
                   strip.text=element_blank(),
                   axis.title=element_blank()))


# Extract legend as a separate graphics object
leg = g_legend(pl[[1]])

Finally, we lay out both plots (after removing legends) plus the extracted legend:

# guides(colour="none") replaces the deprecated guides(colour=FALSE)
grid.arrange(arrangeGrob(grobs=map(pl, function(p) p + guides(colour="none")), ncol=1),
             leg, ncol=2, widths=c(10,1), left="Value", bottom="Year")
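The same idea can be extended from two panels to n panels. What follows is only a minimal sketch under some assumptions: `time_windows` is a hypothetical helper (not part of ggplot2 or gridExtra), the data frame `d` is made up for illustration, and the windows are simply equal-width slices of the time range. It uses base lapply instead of map.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical helper: split the range of x into n equal-width windows,
# returned as a list of c(start, end) pairs
time_windows <- function(x, n) {
  breaks <- seq(min(x), max(x), length.out = n + 1)
  Map(c, head(breaks, -1), tail(breaks, -1))
}

# Made-up data for illustration only
set.seed(1)
d <- data.frame(time = seq(1982, 2016, length.out = 300))
d$value <- as.numeric(arima.sim(list(ar = 0.7), n = 300))

# Smooth once on the full series so the curve is continuous across panels
d$smooth <- predict(loess(value ~ time, data = d, span = 0.1))

windows <- time_windows(d$time, 3)

# One plot per window, all sharing the same y limits
panels <- lapply(windows, function(w) {
  ggplot(d[d$time >= w[1] & d$time <= w[2], ], aes(time)) +
    geom_line(aes(y = value), alpha = 0.5) +
    geom_line(aes(y = smooth)) +
    scale_x_continuous(limits = w, expand = c(0.01, 0)) +
    scale_y_continuous(limits = range(c(d$value, d$smooth))) +
    theme_bw()
})

grid.arrange(grobs = panels, ncol = 1)
```

Because every window has the same width and every panel gets the same y limits, the panels should be directly comparable. Changing the 3 in the call to `time_windows` changes the number of panels.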


Upvotes: 3

user3603486

Reputation:

You can do this by storing the plot object and then printing it twice, each time adding coord_cartesian() with a different x range:

orig_plot <- ggplot(data, aes(time, y)) + 
  geom_line() 

early <-  orig_plot + coord_cartesian(xlim = c(1982, 2000))
late  <-  orig_plot + coord_cartesian(xlim = c(2000, 2016))

Because coord_cartesian() only zooms the view and does not drop any data, both plots still use all the data.
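One way to check this is to compare the smoothed curve that each zoomed plot computes. The sketch below uses made-up data and an arbitrary loess span; only the comparison matters. If the zoom really leaves the data alone, the fitted values behind both panels should be the same.

```r
library(ggplot2)

# Made-up data for illustration only
set.seed(1)
d <- data.frame(time = seq(1982, 2016, length.out = 300))
d$y <- as.numeric(arima.sim(list(ar = 0.7), n = 300))

base <- ggplot(d, aes(time, y)) +
  geom_line() +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.2)

early <- base + coord_cartesian(xlim = c(1982, 2000))
late  <- base + coord_cartesian(xlim = c(2000, 2016))

# Layer 2 is geom_smooth(); its curve is fitted before the zoom is applied
smooth_early <- layer_data(early, 2)
smooth_late  <- layer_data(late, 2)

# Compare the fitted curves behind the two panels
all.equal(smooth_early$y, smooth_late$y)

# The line layer also keeps every row of the original data
nrow(layer_data(early, 1)) == nrow(d)
```

Setting limits on the x scale instead (for example with scale_x_continuous(limits = ...)) behaves differently: values outside the limits are discarded before the statistics are computed, so the smoother in each panel would only see its own chunk.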

To plot them on the same page, use grid (I got this from the ggplot2 book, which is probably around as a pdf somewhere):

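library(grid)
# "bottom" justification puts the viewport's bottom edge at mid-page (top half);
# "top" justification puts its top edge there (bottom half)
vp1 <- viewport(width = 1, height = .5, just = c("center", "bottom"))
vp2 <- viewport(width = 1, height = .5, just = c("center", "top"))
grid.newpage()  # print() does not start a new page when vp is supplied
print(early, vp = vp1)
print(late, vp = vp2)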
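For more than two panels, the viewports can be taken from a grid layout instead of being placed by hand. This is a rough sketch with made-up data. The number of panels `n` and the equal-width `breaks` are illustrative choices, not something ggplot2 provides.

```r
library(ggplot2)
library(grid)

# Made-up data for illustration only
set.seed(1)
d <- data.frame(time = seq(1982, 2016, length.out = 300))
d$y <- as.numeric(arima.sim(list(ar = 0.7), n = 300))

n <- 3  # number of stacked panels (illustrative choice)
breaks <- seq(min(d$time), max(d$time), length.out = n + 1)

base <- ggplot(d, aes(time, y)) + geom_line()

# One layout row per panel; each plot is the same object zoomed to its window
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow = n, ncol = 1)))
for (i in seq_len(n)) {
  p <- base + coord_cartesian(xlim = breaks[c(i, i + 1)])
  print(p, vp = viewport(layout.pos.row = i, layout.pos.col = 1))
}
popViewport()
```

Since every panel is the same plot object with a different zoom, the y axis is shared automatically and any statistics added to `base` are computed on the full series.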

Upvotes: 3
