Garcher

Reputation: 79

How to get a nicely formatted x axis in ggplot with Year-Month-Day of the week data in R

I'm working on a Shiny app, where one of the options is aggregating data by Year-Month-Day of the week.

library(ggplot2)
library(dplyr)

pnd <- data.frame(c(rep('MEXICALI',900),rep('SALTILLO',900) ),sample(200:1600, 1800, T),sample(200:1600, 1800, T),rep(seq.POSIXt(from = as.POSIXct(Sys.Date()-90), length.out = 900, by = "1 hour"),2))

colnames(pnd) <- c('zona_carga', 'PrecioMDA', 'PrecioMTR', 'ID')

pnd <- pnd %>% select(ID, zona_carga,PrecioMDA, PrecioMTR)  %>%
  mutate(ID = format(ID, '%Y-%m %a')  ) %>% group_by( ID, zona_carga) %>% summarise(PrecioMDA = mean(PrecioMDA), PrecioMTR = mean(PrecioMTR)) 

colors <- c('MEXICALI - PrecioMDA' = 'steelblue', 'SALTILLO - PrecioMTR' = 'magenta')

ggplot(pnd, aes(x = ID) ) + 
  geom_line(data = filter(pnd, zona_carga == 'MEXICALI'), aes(y = as.numeric(PrecioMDA),group='PrecioMDA', color = paste('MEXICALI','-','PrecioMDA')) ) + 
  geom_line(data = filter(pnd, zona_carga == 'SALTILLO'), aes(y = as.numeric(PrecioMTR),group='PrecioMTR', color = paste('SALTILLO','-','PrecioMTR') ))  +
  labs(y='$MXN/MWh',x='Fecha',color = 'legend') + scale_color_manual(values = colors) + 
  scale_x_discrete( )

The problem is that when the date interval increases, the labels start to overlap. Is there any way to specify dynamic breaks on my x axis, something similar to scale_x_date(breaks = '1 day')?

Upvotes: 2

Views: 1035

Answers (2)

eipi10

Reputation: 93791

You could wrap the x-axis labels, so their width is smaller. Note in the code below, I've streamlined the use of geom_line and the colour mapping and I've also set the order of the data to follow the order of the dates. I'm not sure if that's what you wanted, but the ordering in your example didn't seem correct.

set.seed(958)
pnd <- data.frame(c(rep('MEXICALI',900),rep('SALTILLO',900) ),sample(200:1600, 1800, T),sample(200:1600, 1800, T),rep(seq.POSIXt(from = as.POSIXct(Sys.Date()-90), length.out = 900, by = "1 hour"),2))

colnames(pnd) <- c('zona_carga', 'PrecioMDA', 'PrecioMTR', 'ID')

pnd <- pnd %>% 
  select(ID, zona_carga,PrecioMDA, PrecioMTR)  %>%
  arrange(ID) %>% 
  mutate(ID = format(ID, '%Y-%m %a'),
         ID = factor(ID, levels=unique(ID))) %>% 
  group_by( ID, zona_carga) %>% 
  summarise(PrecioMDA = mean(PrecioMDA), PrecioMTR = mean(PrecioMTR)) 

colors <- c('MEXICALI - PrecioMDA' = 'steelblue', 'SALTILLO - PrecioMDA' = 'magenta')

ggplot(pnd, aes(x = ID, y=PrecioMDA, 
                group=paste0(zona_carga, " - PrecioMDA"),
                colour=paste0(zona_carga, " - PrecioMDA"))) + 
  geom_line() +
  labs(y='$MXN/MWh',x='Fecha',color = 'legend') + 
  scale_color_manual(values = colors) + 
  scale_x_discrete(labels=function(x) gsub(" ", "\n", x)) +  # wrap: put the weekday on its own line
  theme_classic() +
  theme(legend.position="bottom")


If you want to alternate x-axis labels (as suggested by @RonakShah) another option is to keep all the tick marks but remove the text from every other one:

ggplot(pnd, aes(x = ID, y=PrecioMDA, 
                group=paste0(zona_carga, " - PrecioMDA"),
                colour=paste0(zona_carga, " - PrecioMDA"))) + 
  geom_line() +
  labs(y='$MXN/MWh',x='Fecha',color = 'legend') + 
  scale_color_manual(values = colors) +  
  scale_x_discrete(labels=function(x) c(rbind(x[seq(1,length(x), 2)], rep(" ", ceiling(length(x)/2))))[1:length(x)]) +
  theme_classic() +
  theme(legend.position="bottom")


But does the "2020-04" etc. really need to be repeated? Another option might be to show the "2020-04" only on the first day for that month, and then just list the days of the week after that (and similarly for each new month).
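
A minimal sketch of that last idea, assuming the labels keep the '%Y-%m %a' format used above (for example "2020-04 Mon") and that the factor levels are already in date order. The helper name month_once_labels is hypothetical, not part of ggplot2:

```r
# Hypothetical helper: show the "YYYY-MM" prefix only on the first label
# of each month and just the weekday on the remaining labels.
# Assumes labels look like "2020-04 Mon" and arrive in date order.
month_once_labels <- function(x) {
  x <- as.character(x)
  month <- sub(" .*$", "", x)      # part before the first space, e.g. "2020-04"
  day   <- sub("^[^ ]+ ", "", x)   # part after the first space, e.g. "Mon"
  first <- !duplicated(month)      # TRUE for the first label of each month
  ifelse(first, paste(month, day, sep = "\n"), day)
}

month_once_labels(c("2020-04 Mon", "2020-04 Tue", "2020-05 Mon"))
```

It should then be usable in the plots above as scale_x_discrete(labels = month_once_labels).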

Upvotes: 4

Ronak Shah

Reputation: 388982

Since the labels are repeated, how about showing only alternate labels?

library(ggplot2)
library(dplyr)  # for filter()

 ggplot(pnd, aes(x = ID) ) + 
   geom_line(data = filter(pnd, zona_carga == 'MEXICALI'), 
        aes(y = as.numeric(PrecioMDA),group='PrecioMDA', color = paste('MEXICALI','-','PrecioMDA')) ) + 
   geom_line(data = filter(pnd, zona_carga == 'SALTILLO'), 
        aes(y = as.numeric(PrecioMDA),group='PrecioMDA', color = paste('SALTILLO','-','PrecioMDA') ))  +
   labs(y='$MXN/MWh',x='Fecha',color = 'legend') + scale_color_manual(values = colors) + 
   scale_x_discrete(breaks = function(x) {x[c(TRUE, FALSE)] <- '';x})
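
If every other label is still too dense once the date range grows, the same idea can be generalised to keep every n-th break. This is only a sketch: every_nth is a hypothetical helper name, and how n is chosen (for instance from the number of factor levels in the Shiny app) is an assumption.

```r
# Hypothetical helper: returns a function that keeps every n-th value,
# starting with the first one. Intended for the `breaks` argument of
# scale_x_discrete(), which calls the function with the axis limits.
every_nth <- function(n) {
  stopifnot(n >= 1)
  function(x) x[(seq_along(x) - 1) %% n == 0]
}

every_nth(3)(c("a", "b", "c", "d", "e", "f", "g"))
```

For example, scale_x_discrete(breaks = every_nth(3)) would draw every third tick, and something like n <- max(1, ceiling(nlevels(pnd$ID) / 10)) would aim for roughly ten labels regardless of the date range (assuming ID is a factor).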


If the dates still don't fit in the plot, consider the approach described in "Rotating and spacing axis labels in ggplot2".

Upvotes: 3
