Reputation: 23
I have a data frame containing time-series data for several countries and several variables. Say there are two countries (UK, US) and two variables (GMS, PP; the sample below uses GMS and ALC). For each country and each variable, I want to plot the two time series (value_fcst_1 and value_fcst_2) against each other.
That is, I want 2 plots with 2 subplots each: the UK plot has one subplot for GMS and one for PP, and likewise for the US.
I also want to add a legend to the plots.
        month marketplace value_fcst_1 value_fcst_2 variable
1  2019-05-26          US      4202393      4198816      GMS
2  2019-06-02          US     30504725     31525980      GMS
3  2019-06-09          US     30454694     30602385      GMS
4  2019-06-16          US     30249561     30363117      ALC
5  2019-06-23          US     30884821     31682497      ALC
6  2019-06-30          US     31424970     31198360      ALC
7  2019-05-26          UK      4202393      4198816      GMS
8  2019-06-02          UK     30504725     31525980      GMS
9  2019-06-09          UK     30454694     30602385      GMS
10 2019-06-16          UK     30249561     30363117      ALC
11 2019-06-23          UK     30884821     31682497      ALC
12 2019-06-30          UK     31424970     31198360      ALC
I managed to plot all of the variables, but I am not sure how to split the graphs by US and UK, or how to give each variable its own y-axis, since the scales do not match (see photo).
library(ggplot2)

# One line per forecast run, one facet per variable
series_plot <- ggplot(data = final_df) +
  geom_line(aes(x = month, y = value_fcst_1), colour = 'dodgerblue2', na.rm = TRUE, show.legend = TRUE) +
  geom_line(aes(x = month, y = value_fcst_2), colour = 'coral2', na.rm = TRUE, show.legend = TRUE) +
  facet_wrap(vars(variable)) +
  labs(x = 'Months') +
  labs(title = 'Comparisons of two different forecast runs', subtitle = '2019-05-31 vs 2019-06-30 forecast runs')
  # labs(name = 'Forecast Runs', fill = 'buu') +
  # legend("test1","test2")
print(series_plot)
Upvotes: 2
Views: 360
Reputation: 160437
You can free one or both scales in the facet_* functions with the scales argument.
(Update: I think your recent comment suggests reshaping the data slightly ... scroll to the bottom for another way to look at it.)
Using your sample data, keep "x" the same but free "y":
ggplot(data = final_df) +
  geom_line(aes(x = month, y = value_fcst_1), colour = 'dodgerblue2', na.rm = TRUE, show.legend = TRUE) +
  geom_line(aes(x = month, y = value_fcst_2), colour = 'coral2', na.rm = TRUE, show.legend = TRUE) +
  facet_wrap(vars(variable), scales="free_y") +
  labs(x = 'Months') +
  labs(title = 'Comparisons of two different forecast runs', subtitle = '2019-05-31 vs 2019-06-30 forecast runs')
Free both "x" and "y":
ggplot(data = final_df) +
  geom_line(aes(x = month, y = value_fcst_1), colour = 'dodgerblue2', na.rm = TRUE, show.legend = TRUE) +
  geom_line(aes(x = month, y = value_fcst_2), colour = 'coral2', na.rm = TRUE, show.legend = TRUE) +
  facet_wrap(vars(variable), scales="free") +
  labs(x = 'Months') +
  labs(title = 'Comparisons of two different forecast runs', subtitle = '2019-05-31 vs 2019-06-30 forecast runs')
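The question also asks how to separate the US and UK. One possible approach (a sketch, not the only way) is to facet on both columns with facet_grid, putting variable in rows and marketplace in columns. With scales = "free_y" in facet_grid, each row of panels gets its own y-axis. The data frame below is rebuilt from the sample in the question; the name p_grid and the y-axis label are assumptions for illustration.

```r
library(ggplot2)

# Sample data rebuilt from the question (US and UK carry the same values there)
one_market <- data.frame(
  month = as.Date(c("2019-05-26", "2019-06-02", "2019-06-09",
                    "2019-06-16", "2019-06-23", "2019-06-30")),
  value_fcst_1 = c(4202393, 30504725, 30454694, 30249561, 30884821, 31424970),
  value_fcst_2 = c(4198816, 31525980, 30602385, 30363117, 31682497, 31198360),
  variable = rep(c("GMS", "ALC"), each = 3)
)
final_df <- rbind(
  cbind(marketplace = "US", one_market),
  cbind(marketplace = "UK", one_market)
)

# Rows = variable, columns = marketplace; y-axis is freed per row
p_grid <- ggplot(final_df) +
  geom_line(aes(x = month, y = value_fcst_1), colour = "dodgerblue2") +
  geom_line(aes(x = month, y = value_fcst_2), colour = "coral2") +
  facet_grid(rows = vars(variable), cols = vars(marketplace), scales = "free_y") +
  labs(x = "Months", y = "Forecast value")  # y label is an assumption

print(p_grid)
```

If a fully independent y-axis per panel is preferred, facet_wrap(vars(marketplace, variable), scales = "free_y") is an alternative, at the cost of the tidy row/column layout.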
Update: the best way to "add a legend" based on when the forecast was run is to let ggplot2 do it for you. To do that, the forecast run needs to be stored in a variable (a column of values), not spread across variables. Right now, you have value_fcst_1 as one variable and value_fcst_2 as another. Let's reshape the data. I'm using dplyr and tidyr here, though there are base R and data.table methods as well.
library(dplyr) # and tidyr is used
final_df %>%
  tidyr::gather(k, v, -month, -marketplace, -variable) %>%
  slice(1:3, n() - 0:2) # just to show some sampling
#        month marketplace variable            k        v
# 1 2019-05-26          US      GMS value_fcst_1  4202393
# 2 2019-06-02          US      GMS value_fcst_1 30504725
# 3 2019-06-09          US      GMS value_fcst_1 30454694
# 4 2019-06-30          UK      ALC value_fcst_2 31198360
# 5 2019-06-23          UK      ALC value_fcst_2 31682497
# 6 2019-06-16          UK      ALC value_fcst_2 30363117
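For readers on a newer tidyr (1.0.0 or later), the same wide-to-long reshape can be sketched with pivot_longer, which supersedes gather. This is an alternative to the call above, not what the answer itself uses; the data frame is rebuilt from the question's sample, and long_df is a name chosen here for illustration.

```r
library(tidyr)

# Sample data rebuilt from the question (US and UK carry the same values there)
one_market <- data.frame(
  month = as.Date(c("2019-05-26", "2019-06-02", "2019-06-09",
                    "2019-06-16", "2019-06-23", "2019-06-30")),
  value_fcst_1 = c(4202393, 30504725, 30454694, 30249561, 30884821, 31424970),
  value_fcst_2 = c(4198816, 31525980, 30602385, 30363117, 31682497, 31198360),
  variable = rep(c("GMS", "ALC"), each = 3)
)
final_df <- rbind(
  cbind(marketplace = "US", one_market),
  cbind(marketplace = "UK", one_market)
)

# Stack the two forecast columns: names go to k, values go to v
long_df <- pivot_longer(
  final_df,
  cols = c(value_fcst_1, value_fcst_2),
  names_to = "k",
  values_to = "v"
)

head(long_df)
```

The resulting columns match the gather version, but the row order differs: pivot_longer keeps the two forecast values of each original row next to each other, whereas gather stacks all of value_fcst_1 before value_fcst_2. That has no effect on the plot.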
This puts the forecast run in a variable (named k here). From here, it's easy enough to do:
final_df %>%
  tidyr::gather(k, v, -month, -marketplace, -variable) %>%
  ggplot() +
  geom_line(aes(x = month, y = v, color = k), na.rm = TRUE, show.legend = TRUE) +
  facet_wrap(vars(variable), scales="free") +
  labs(x = 'Months') +
  labs(title = 'Comparisons of two different forecast runs', subtitle = '2019-05-31 vs 2019-06-30 forecast runs')
The k is certainly ugly, but I kept it intentionally, as there are two easy fixes:
1. tidyr::gather("Forecast Run", v, ...), though this requires `Forecast Run` (backticks!) as a variable name (due to the space); or
2. scale_color_discrete(name = "Forecast Run"), which has the benefit of using something "easier" like k (ok, perhaps single-letter variable names are too terse) everywhere but still allowing a good legend name.
Each has its advantages.
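As a rough sketch of the second fix, the block below keeps k as the column name and only renames the legend through scale_color_discrete. It also facets by marketplace and variable together, which is an addition here to cover the US/UK split from the question. The data frame is rebuilt from the question's sample, and the names long_df and p_legend are assumptions for illustration.

```r
library(ggplot2)
library(tidyr)

# Sample data rebuilt from the question (US and UK carry the same values there)
one_market <- data.frame(
  month = as.Date(c("2019-05-26", "2019-06-02", "2019-06-09",
                    "2019-06-16", "2019-06-23", "2019-06-30")),
  value_fcst_1 = c(4202393, 30504725, 30454694, 30249561, 30884821, 31424970),
  value_fcst_2 = c(4198816, 31525980, 30602385, 30363117, 31682497, 31198360),
  variable = rep(c("GMS", "ALC"), each = 3)
)
final_df <- rbind(
  cbind(marketplace = "US", one_market),
  cbind(marketplace = "UK", one_market)
)

# Long format: forecast run in k, value in v (as in the answer)
long_df <- gather(final_df, k, v, -month, -marketplace, -variable)

# Colour is mapped to k, so ggplot2 builds the legend; the scale only renames it
p_legend <- ggplot(long_df) +
  geom_line(aes(x = month, y = v, color = k), na.rm = TRUE) +
  facet_wrap(vars(marketplace, variable), scales = "free") +
  scale_color_discrete(name = "Forecast Run") +
  labs(x = "Months",
       title = "Comparisons of two different forecast runs",
       subtitle = "2019-05-31 vs 2019-06-30 forecast runs")

print(p_legend)
```

The legend entries still read value_fcst_1 and value_fcst_2. They could be relabelled through the labels argument of the same scale, but which column corresponds to which forecast date is not stated in the question, so that mapping is left out here.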
Upvotes: 1