Reputation: 409
I am plotting a time series of wind speed and would like to color the line according to season. Some values are missing across the data set, and one gap spans a couple of months. When I color the line by season, ggplot draws a connecting line from the end of one season (e.g. winter) to the next time that season (e.g. winter) appears. How can I stop it from doing that?
Here is an excerpt from my data:
                  date wspd_havg10m_kn avg_wd season
1  2013-12-06 00:25:00       9.8358531     50 Winter
2  2013-12-06 01:25:00      10.5064795     56 Winter
3  2013-12-06 02:25:00      11.8477322     55 Winter
4  2013-12-06 03:25:00              NA     53     NA
5  2013-12-06 04:25:00      13.1889849     47 Winter
6  2013-12-06 05:25:00      13.1889849     60 Winter
7  2013-12-06 06:25:00              NA     51     NA
8  2013-12-06 07:25:00       9.6123110     50 Winter
9  2013-12-06 08:25:00       7.6004320     53 Winter
10 2013-12-06 09:25:00       9.6123110     52 Winter
11 2013-12-06 10:25:00       8.2710583     66 Winter
# add a column that specifies the season
mydata$season <- time2season(mydata$date, out.fmt = "seasons", type = "default")
# capitalize the season categories
mydata$season <- capitalize(mydata$season)
g <- ggplot(mydata, aes(date, wspd_havg10m_kn, color = season)) +
  geom_line(size = 0.1) +
  # the spline basis (bs) belongs inside s(), not in geom_smooth()
  geom_smooth(colour = "black", size = 1, method = "gam", formula = y ~ s(x, bs = "cs")) +
  scale_y_continuous(limits = c(0, 45), breaks = seq(0, 45, 5)) +
  scale_color_discrete(name = "Season", breaks = c("Spring", "Summer", "Autumm", "Winter")) +
  xlab("\nSampling Period (mm/yy)\n") +
  ylab("Hourly Wind Speed Sample (kt)\n")
# adjust the way labels and ticks are set on the x axis:
g + scale_x_datetime(breaks = date_breaks("2 months"), labels = date_format("%m/%y"), limits = c(start_date, end_date))
I tried setting the season to NA wherever the wind speed is missing, but that made no difference: I still get lines connecting the end of one occurrence of a season to the start of its next occurrence.
Any ideas? Cheers, Sandra
Upvotes: 4
Views: 1840
Reputation: 22827
I don't see it as an exact duplicate because of the colors and the NAs. I think you are looking for something like this:
# Read the data
library(lubridate)
library(ggplot2)
df <- read.csv("data.csv",
               strip.white = TRUE,
               colClasses = c("character", "numeric", "numeric", "factor"))
df$date <- ymd_hms(df$date, tz = "UTC")
# Define the group variable: the counter increases at every NA,
# so each unbroken run of observations gets its own group id
df$grp <- cumsum(is.na(df$wind))
# Drop the NA rows and draw one line per group, colored by season
ggplot(data = df[complete.cases(df), ], aes(date, wind, color = season)) +
  geom_line(aes(group = grp)) +
  scale_color_manual(values = c("Fall" = "brown", "Winter" = "darkblue"))
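To see what the group variable looks like, here is a small base R sketch. The wind values are made up for illustration; only the `cumsum(is.na(...))` idiom comes from the code above.

```r
# Toy wind series with two missing values (illustrative numbers only)
wind <- c(9.8, 10.5, 11.8, NA, 13.2, 13.2, NA, 9.6, 7.6)

# is.na() marks the gaps; cumsum() bumps the counter at every gap,
# so each unbroken run of observations shares one id
grp <- cumsum(is.na(wind))
print(grp)
# 0 0 0 1 1 1 2 2 2

# After dropping the NA rows, each id corresponds to one connected segment
keep <- !is.na(wind)
runs <- split(wind[keep], grp[keep])
print(lengths(runs))
```

Because geom_line only connects points that share a group, mapping `group = grp` keeps each run as a separate segment, while `color = season` still controls the color.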
Here is the data I used (saved as data.csv):
date,wind,temp,season
2013-12-20 18:25:00, 9.8358531, 50, Fall
2013-12-20 19:25:00, 10.5064795, 56, Fall
2013-12-20 20:25:00, 11.8477322, 55, Fall
2013-12-20 21:25:00, NA, 53, NA
2013-12-20 22:25:00, 13.1889849, 47, Fall
2013-12-20 23:25:00, 13.1889849, 60, Fall
2013-12-21 01:25:00, NA, 51, NA
2013-12-21 02:25:00, 9.6123110, 50, Winter
2013-12-21 03:25:00, 7.6004320, 53, Winter
2013-12-21 04:25:00, 9.6123110, 52, Winter
2013-12-21 05:25:00, 8.2710583, 66, Winter
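One caveat: `cumsum(is.na(df$wind))` only starts a new group where a row with NA exists. If the months-long gap mentioned in the question consists of rows that are absent altogether, there is nothing for `is.na()` to catch. A possible variation, sketched here on made-up data, also starts a new group whenever the time step between consecutive rows is too large. The 1.5 hour threshold and the name `max_step_hours` are assumptions chosen for hourly data.

```r
# Hourly timestamps with a block of rows missing entirely (illustrative values)
df <- data.frame(
  date = as.POSIXct(c("2013-12-06 00:25:00", "2013-12-06 01:25:00",
                      "2013-12-06 02:25:00", "2014-02-10 00:25:00",
                      "2014-02-10 01:25:00"), tz = "UTC"),
  wind = c(9.8, 10.5, 11.8, 9.6, 7.6)
)

# Assumed threshold: a step longer than 1.5 sampling intervals counts as a gap
max_step_hours <- 1.5

# Hours since the previous row (0 for the first row)
step <- c(0, as.numeric(diff(df$date), units = "hours"))

# Start a new group at every NA and at every oversized time step
df$grp <- cumsum(is.na(df$wind) | step > max_step_hours)
print(df$grp)

# Build the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df[!is.na(df$wind), ], aes(date, wind)) +
    geom_line(aes(group = grp))
}
```

The data must be sorted by date for `diff()` to give meaningful steps.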
Upvotes: 3