Reputation: 35
I have three tables:
Upper Bound
Q C
1 30
2 50
3 40
Lower Bound
Q C
1 10
2 15
3 20
Bad Data:
Q C Name
1 50 Sample 1
2 40 Sample 1
3 30 Sample 1
1 0 Sample 2
2 60 Sample 2
3 5 Sample 2
I want a plot that draws the lower and upper bounds in gray, fills the area between them, and draws the bad samples on top in different colors with a legend:
plot <- ggplot(Bad_Data, aes(x = Bad_Data$Q, y = Bad_Data$C, group = 1))
plot + geom_line(aes(color = N)) + geom_ribbon(aes(ymin = Lower_Bound$C, ymax = Upper_Bound$C))
I tried that but it gave me this error:
Error: Aesthetics must be either length 1 or the same as the data (624): ymin, ymax, x, y, group
Can anyone help me?
Upvotes: 1
Views: 30341
Reputation: 1850
I use geom_ribbon to shade the confidence intervals around forecasts. You can combine several data frames in one plot to make it all work: each layer needs its own data frame, the x values (dates here) and the y values (for the ribbon, the two lines to shade between, given as ymin and ymax).
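The layering described above can be sketched on the question's data. This is a minimal sketch, not the function from this answer: the data frame names `bounds` and `bad_data` and the column names `lower` and `upper` are made up for illustration, and the gray fill and alpha value are arbitrary choices.

```r
library(ggplot2)

# Data from the question; the two bound tables are combined into one
# data frame (the names bounds, lower and upper are illustrative)
bounds <- data.frame(
  Q     = c(1, 2, 3),
  lower = c(10, 15, 20),
  upper = c(30, 50, 40)
)
bad_data <- data.frame(
  Q    = rep(1:3, times = 2),
  C    = c(50, 40, 30, 0, 60, 5),
  Name = rep(c("Sample 1", "Sample 2"), each = 3)
)

p <- ggplot() +
  # ribbon layer: its own data frame, x, and the two lines to shade between
  geom_ribbon(data = bounds,
              aes(x = Q, ymin = lower, ymax = upper),
              fill = "grey70", alpha = 0.5) +
  # sample lines drawn on top; mapping color to Name creates the legend
  geom_line(data = bad_data,
            aes(x = Q, y = C, color = Name, group = Name))

print(p)
```

Because every layer carries its own data and mapping, the ribbon's three rows never have to match the six rows of the samples, which is the length mismatch the error in the question complains about.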
The following function is used for plotting a series, the model backtest, predictions and confidence intervals.
plot_predictions <- function(start_date) {
  # Plot the actuals, the model backtest, the predictions and their
  # confidence interval with ggplot2, then convert the plot to Plotly.
  # requires: ggplot2, scales (comma, pretty_breaks), plotly, withr (with_options)
  # input: start_date, the starting date of the series
  # also uses objects from the calling environment: r_back, r_train, r_pred,
  #   holiday, r_exog_lines, series, pred_pal and theme_bryan()
  # output: interactive plot of the actuals and the model backtest
  # set the end date parameter first
  pred_end <- as.Date(tail(r_pred$week_ending, 1))
  p <- ggplot() +
    # add the backtest series from the backtest data frame
    geom_line(data = r_back, mapping = aes(x = week_ending, y = Backtest_Model,
              color = 'Backtest Model'), linetype = 'solid') +
    # add the actual series from the training data frame
    # use aes_string to pass the series variable
    geom_line(data = r_train, mapping = aes_string(x = 'week_ending', y = series,
              color = 'series'), linetype = 'solid') +
    # r_pred holds the predictions and confidence limits
    geom_line(data = r_pred, mapping = aes(x = week_ending, y = Predictions,
              color = 'Predictions'), linetype = 'solid') +
    # upper forecast limit
    geom_line(data = r_pred, mapping = aes(x = week_ending, y = upper_conf_limit,
              color = 'upper_limit'), linetype = 'solid') +
    # lower forecast limit
    geom_line(data = r_pred, mapping = aes(x = week_ending, y = lower_conf_limit,
              color = 'lower_limit'), linetype = 'solid') +
    # format the plot
    scale_linetype_manual() +
    # pred_pal is a five color palette
    scale_color_manual(values = pred_pal, name = "Series") +
    # fill between lines with geom_ribbon using a blue toned down by alpha
    geom_ribbon(data = r_pred, aes(x = week_ending,
                                   ymin = lower_conf_limit,
                                   ymax = upper_conf_limit),
                fill = "blue", alpha = 0.2) +
    # format the y axis with commas
    scale_y_continuous(label = comma) +
    # extend the date series from the beginning to end of predictions
    scale_x_date(breaks = pretty_breaks(20),
                 limits = c(as.Date(start_date), pred_end)) +
    # add the custom theme
    theme_bryan() +
    # customize the plot
    # rotate the text of the x axis
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    labs(fill = 'Series') +
    guides(color = guide_legend(reverse = FALSE)) +
    # add the holiday lines as vertical red dotted lines
    # holidays in the training set
    geom_vline(xintercept = as.numeric(holiday$week_ending),
               linetype = 'dotted', colour = 'red', alpha = 0.5) +
    # holidays in the forecasting set
    geom_vline(xintercept = as.numeric(r_exog_lines$week_ending),
               linetype = 'dotted', colour = 'red', alpha = 0.5)
  # output the plot in Plotly format
  subplot(with_options(list(digits = 0),
                       ggplotly(p))) %>%
    layout(legend = list(orientation = 'v', y = .1, x = 0))
}
Using a series called riders, set with series <- 'riders', yields the following.
Upvotes: 6
Reputation: 28441
Here is a start; you can tweak the colors and other parameters to get the look exactly as you like. I have added a few aesthetics with a description of what each does:
#Prepare
Bad_Data$Lower_Bound <- Lower_Bound$C
Bad_Data$Upper_Bound <- Upper_Bound$C
#Plot
library(ggplot2)
p <- ggplot(Bad_Data, aes(x = Q, y = C, color=Name, group=Name))
p <- p + geom_line()
p + geom_ribbon(aes(ymin=Lower_Bound, ymax=Upper_Bound),
                alpha=0.1,       #transparency
                linetype=1,      #solid, dashed or other line types
                colour="grey70", #border line color
                size=1,          #border line size
                fill="green")    #fill color
Here is the data:
Upper_Bound <- read.table(text="Q C
1 30
2 50
3 40", header=T)
Lower_Bound <- read.table(text="Q C
1 10
2 15
3 20", header=T)
Bad_Data <- read.table(text="Q C Name
1 50 Sample1
2 40 Sample1
3 30 Sample1
1 0 Sample2
2 60 Sample2
3 5 Sample2", header=T)
Edit
Safer preparation, which looks up each bound by its Q value instead of relying on row order and vector recycling:
Bad_Data$Lower_Bound <- Lower_Bound$C[match(Bad_Data$Q, Lower_Bound$Q)]
Bad_Data$Upper_Bound <- Upper_Bound$C[match(Bad_Data$Q, Upper_Bound$Q)]
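The same lookup can also be written as a join. The following is only a sketch of that alternative, assuming the three data frames from this answer; the intermediate names `bounds` and `plot_data` are made up for illustration.

```r
# Data as defined in this answer
Upper_Bound <- data.frame(Q = 1:3, C = c(30, 50, 40))
Lower_Bound <- data.frame(Q = 1:3, C = c(10, 15, 20))
Bad_Data <- data.frame(
  Q    = rep(1:3, times = 2),
  C    = c(50, 40, 30, 0, 60, 5),
  Name = rep(c("Sample1", "Sample2"), each = 3)
)

# Rename the bound columns so they do not clash with Bad_Data$C
names(Lower_Bound)[names(Lower_Bound) == "C"] <- "Lower_Bound"
names(Upper_Bound)[names(Upper_Bound) == "C"] <- "Upper_Bound"

# Join the two bounds on Q, then attach them to every sample row;
# all.x = TRUE keeps sample rows whose Q has no bound (filled with NA)
bounds    <- merge(Lower_Bound, Upper_Bound, by = "Q")
plot_data <- merge(Bad_Data, bounds, by = "Q", all.x = TRUE)

# merge() sorts by the key, so restore a per-sample ordering
plot_data <- plot_data[order(plot_data$Name, plot_data$Q), ]
```

The resulting `plot_data` can be passed to ggplot() in place of Bad_Data with the same aesthetics as in the plotting code of this answer.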
Upvotes: 4