rajomato

ggplot hide blank space with no x-values

Is there an efficient way to hide blank spaces in time series plots using ggplot2? I've got the following graph, which has no data from Dec. 3 to Dec. 5. Is there a way to hide this portion of the graph?

plot with no data from Dec. 3 to Dec. 5

I'm currently using the following code to produce this graph:

ggplot(data = do.call(rbind.data.frame, combinedOutput[,2])) +
  geom_line(aes(x = Date, y = Return)) +
  geom_line(aes(x = Date, y = PredReturn), colour = "red") +
  facet_wrap(~Ticker, ncol = 2, scales = "free") +
  theme_light() + 
  theme(panel.spacing.y = unit(0.3, "cm"), 
        strip.background = element_rect(fill = "white"), 
        strip.text = element_text(colour = "black")) + 
  labs(x = NULL, y = "Daily Return in \\%")

This is what the raw data looks like. There are no NAs between 2016-12-02 16:00:00 and 2016-12-05 09:30:00.

image of the raw data

Many thanks in advance!

Answers (1)

Z.Lin

I see this as first & foremost a data wrangling problem, with the ggplot part coming afterwards.

Since there's no sample data in the question, let's simulate some:

library(dplyr)
library(ggplot2)

set.seed(12345)
data <- data.frame(
  Date = seq.POSIXt(from = ISOdate(2018, 1, 1),
                    to = ISOdate(2018, 5, 1),
                    by = "hour")
) %>%
  mutate(Return = rnorm(n = n()),
         PredReturn = rnorm(n = n()))
data$Date[c(220:350,
            593:820,
            2100:2500)] <- NA
data <- na.omit(data)

# this creates a dataset with 3 distinct gaps in its time periods
ggplot(data,
       aes(x = Date, group = 1)) +
  geom_line(aes(y = Return)) +
  geom_line(aes(y = PredReturn), color = "red") +
  theme_light()

plot with time gaps

We can identify time gaps by comparing the time difference between consecutive time stamps. Here, the logic I used defines a gap as any time difference larger than the median of all time differences. You may want to change that to some other value (e.g. 2 days? 1 week?) depending on your context:

data2 <- data %>%
  arrange(Date) %>%
  mutate(date.diff = c(NA, diff(Date))) %>%
  mutate(is.gap = !is.na(date.diff) & date.diff > median(date.diff, na.rm = TRUE)) %>%
  mutate(period.id = cumsum(is.gap))

> head(data2)
                 Date     Return PredReturn date.diff is.gap period.id
1 2018-01-01 12:00:00  0.5855288 -0.7943254        NA  FALSE         0
2 2018-01-01 13:00:00  0.7094660  1.8875074         1  FALSE         0
3 2018-01-01 14:00:00 -0.1093033  0.5881879         1  FALSE         0
4 2018-01-01 15:00:00 -0.4534972  1.1556793         1  FALSE         0
5 2018-01-01 16:00:00  0.6058875 -0.8743878         1  FALSE         0
6 2018-01-01 17:00:00 -1.8179560  0.2586568         1  FALSE         0
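
One caveat with the snippet above: diff() on date-times picks its units automatically, and c(NA, ...) drops them, so date.diff may end up in seconds, minutes, hours or days depending on the data. If you'd rather use a fixed cutoff, a minimal sketch could look like the following. The toy data and the 24-hour threshold (gap.threshold.hours) are my own assumptions for illustration, not values from the question:

```r
library(dplyr)

# Toy series (illustrative only): two 48-hour runs of hourly stamps,
# with the second run starting 5 days after the first
stamps <- seq(from = ISOdate(2018, 1, 1), by = "hour", length.out = 48)
toy <- data.frame(Date = c(stamps, stamps + as.difftime(5, units = "days")))

# Assumed cutoff: anything longer than 24 hours counts as a gap
gap.threshold.hours <- 24

toy2 <- toy %>%
  arrange(Date) %>%
  # difftime() with explicit units keeps the comparison unambiguous
  mutate(date.diff = as.numeric(difftime(Date, lag(Date), units = "hours")),
         is.gap = !is.na(date.diff) & date.diff > gap.threshold.hours,
         period.id = cumsum(is.gap))

table(toy2$period.id)
```

With explicit units, the same threshold means the same thing regardless of how finely the data is sampled.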

Now each period.id value corresponds to a subset of data with no major time differences between its rows. We can further wrangle this data by converting it to long format:

data2 <- data2 %>%
  select(-date.diff, -is.gap) %>% # drop unneeded columns
  tidyr::gather(color, y, -Date, -period.id) %>%
  mutate(color = factor(color,
                        levels = c("Return", "PredReturn")))

> head(data2)
                 Date period.id  color          y
1 2018-01-01 12:00:00         0 Return  0.5855288
2 2018-01-01 13:00:00         0 Return  0.7094660
3 2018-01-01 14:00:00         0 Return -0.1093033
4 2018-01-01 15:00:00         0 Return -0.4534972
5 2018-01-01 16:00:00         0 Return  0.6058875
6 2018-01-01 17:00:00         0 Return -1.8179560
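
As a side note, tidyr's documentation marks gather() as superseded in favour of pivot_longer(). If you prefer the newer function, the reshaping step might look roughly like this. The small wide data frame is a made-up stand-in for data2 after dropping the helper columns:

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for the wide data (illustrative values only)
wide <- data.frame(
  Date = seq(from = ISOdate(2018, 1, 1), by = "hour", length.out = 3),
  Return = c(0.1, 0.2, 0.3),
  PredReturn = c(0.3, 0.2, 0.1),
  period.id = 0
)

# Stack the two value columns into a name column (color) and a value column (y)
long <- wide %>%
  pivot_longer(cols = c(Return, PredReturn),
               names_to = "color", values_to = "y") %>%
  mutate(color = factor(color, levels = c("Return", "PredReturn")))

head(long)
```

The row order differs from gather() (rows are interleaved per time stamp rather than stacked per variable), which makes no difference to the plot.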

Pass this data to ggplot(), facet by time periods with free scales, & you'd have eliminated the blank spaces from the earlier plot above:

p <- ggplot(data2,
       aes(x = Date, y = y, color = color)) +
  geom_line() +
  facet_grid(~ period.id, scales = "free_x", space = "free_x") +
  scale_color_manual(values = c("Return" = "black",
                                "PredReturn" = "red")) +
  theme_light()

p

faceted plot

Further tweaks to the plot's aesthetics can hide the blank spaces completely, though I'd caution against going to extremes without making the time gaps very clear to your intended audience, as this can be subject to misinterpretation:

p +
  scale_x_datetime(expand = c(0, 0),             # remove space within each panel
                   breaks = "5 days") +          # specify desired time breaks
  theme(panel.spacing = unit(0, "pt"),           # remove space between panels
        axis.text.x = element_text(angle = 90))  # rotate x-axis text

faceted plot without gaps
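
One possible way to keep the gaps visible to readers is to label each panel with the date range it covers, instead of the bare period.id. This is only a sketch under my own assumptions: the toy data, the period.label column and the "%d %b" label format are all invented for illustration:

```r
library(dplyr)
library(ggplot2)

# Toy long-format data with a period.id column (illustrative only)
set.seed(1)
toy <- data.frame(
  Date = c(seq(from = ISOdate(2018, 1, 1), by = "hour", length.out = 48),
           seq(from = ISOdate(2018, 1, 10), by = "hour", length.out = 48)),
  period.id = rep(0:1, each = 48)
) %>%
  mutate(y = rnorm(n()))

# Build one label per period from its first and last time stamp
toy <- toy %>%
  arrange(Date) %>%
  group_by(period.id) %>%
  mutate(period.label = paste(format(min(Date), "%d %b"),
                              format(max(Date), "%d %b"),
                              sep = " to ")) %>%
  ungroup() %>%
  # fix the level order so panels stay chronological, not alphabetical
  mutate(period.label = factor(period.label, levels = unique(period.label)))

p.labelled <- ggplot(toy, aes(x = Date, y = y)) +
  geom_line() +
  facet_grid(~ period.label, scales = "free_x", space = "free_x") +
  theme_light()

p.labelled
```

If two periods happen to start and end on the same days, their labels would collide, so a more detailed format string may be needed for short periods.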
