Reputation: 480
I want to plot multiple series with ggplot2 so that each series has a distinct line type and a specific part of each series is colored differently.
I prepared the data and plotted it as follows:
# Load melted data frame
df = read.table(text="time,group,variable,value
1,train,preds,-1.01327781066807
2,train,preds,-1.06923407042272
3,train,preds,-1.0738165129006
4,train,preds,-1.0570173408663
5,train,preds,-0.849528539296128
6,train,preds,-1.00956150966228
7,train,preds,-1.05129344633106
8,train,preds,-1.01137384052835
9,train,preds,-0.986500386274102
10,train,preds,-0.782545791298946
11,train,preds,-0.492011449844967
12,train,preds,0.0752350668715425
13,train,preds,0.718851922060212
14,train,preds,0.907488713099219
15,train,preds,0.809418859320128
16,train,preds,0.799428598786513
17,train,preds,0.89455950317809
18,train,preds,0.891727059592248
19,train,preds,0.839506291414727
20,train,preds,0.891986330803872
21,train,preds,0.868653513783531
22,train,preds,0.867573512960701
23,train,preds,0.790999769131768
24,test,preds,0.836612851268108
25,test,preds,0.835266880809444
26,test,preds,0.825396293221058
27,test,preds,0.82669719817616
1,train,actual,-1.06741896705375
2,train,actual,-1.07208489151112
3,train,actual,-1.04309035399799
4,train,actual,-1.11384867929676
5,train,actual,-1.10435803969419
6,train,actual,-1.06534456421351
7,train,actual,-1.04953633499216
8,train,actual,-1.05459775190554
9,train,actual,-0.981186588772681
10,train,actual,-0.96224883216766
11,train,actual,-0.892023497056106
12,train,actual,0.830642326040778
13,train,actual,0.834595424714826
14,train,actual,0.881344777367528
15,train,actual,0.915772459185225
16,train,actual,0.929638947563377
17,train,actual,0.994907176661985
18,train,actual,0.99423350946309
19,train,actual,0.989942263051002
20,train,actual,0.967976146034507
21,train,actual,0.787447328638445
22,train,actual,0.586847009899609
23,train,actual,0.84574152360878
24,test,actual,1.01305250589053
25,test,actual,1.06157202086132
26,test,actual,1.01496086957322
27,test,actual,0.999883908716498", sep=",", stringsAsFactors=F, header=T)
# Plot: color by group (train/test), line type by variable (preds/actual)
library(ggplot2)
ggplot(data=df, aes(x=time, y=value)) + geom_line(aes(color=group, linetype=variable))
The result is:
There is a break between the train and test parts of each series. How can I connect them? I tried geom_path but couldn't get it to work.
Upvotes: 0
Views: 64
Reputation: 4487
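You need to add an extra data point to connect the lines: ggplot2 draws each color/linetype combination as a separate path, so nothing joins the last train point to the first test point. For example: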
library(ggplot2)
library(dplyr)
# Take the first test record for each variable and relabel its group
# as "train"
additional_line <- df %>%
  filter(group == "test") %>%
  group_by(variable) %>%
  filter(time == min(time)) %>%
  ungroup() %>%
  mutate(group = "train")
# Then bind these rows to the original dataset
df_revised <- bind_rows(df, additional_line)
# Now plot the revised data: the extra train point extends each train
# line to the start of the matching test line
ggplot(data=df_revised, aes(x=time, y=value)) +
  geom_line(aes(color=group, linetype=variable))
# If you don't need linetype, you can plot the original data without
# the extra records by grouping on variable
ggplot(data=df, aes(x=time, y=value)) +
  geom_line(aes(group = variable, color=group))
# Adding linetype to this grouping causes an error, because the color
# changes along a single dashed line
ggplot(data=df, aes(x=time, y=value)) +
  geom_line(aes(group = variable, color=group, linetype = variable))
#> Error: geom_path: If you are using dotted or dashed lines, colour,
#> size and linetype must be constant over the line
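To see the bridging idea in isolation, here is a minimal sketch on a made-up toy data frame (the names toy, first_test and toy_bridged and all the values are assumptions for illustration, not part of the question's data). It uses base R instead of dplyr to duplicate the first test point as a train point:

```r
library(ggplot2)

# Toy data (made up for illustration): one series split into train and test
toy <- data.frame(
  time  = 1:6,
  group = c("train", "train", "train", "train", "test", "test"),
  value = c(0.1, 0.4, 0.3, 0.8, 0.9, 0.7),
  stringsAsFactors = FALSE
)

# Copy the earliest test point and relabel it as train, so the train
# segment ends exactly where the test segment begins
first_test <- toy[toy$group == "test", ]
first_test <- first_test[which.min(first_test$time), ]
first_test$group <- "train"

toy_bridged <- rbind(toy, first_test)

# Each color is still drawn as its own path; the duplicated point
# at time 5 is what closes the gap between them
p <- ggplot(toy_bridged, aes(x = time, y = value, color = group)) +
  geom_line()
print(p)
```

With several series, as in the question, the same copy has to be made once per variable, which is what the group_by(variable) step above does.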
Upvotes: 1