Hydro

Reputation: 1117

Spaghetti plot using ggplot in R?

I would like to produce a spaghetti plot with the days of the year on the x-axis and the data on the y-axis, with one line per Year. I would then like a separate year that has data for only 3 months (PCPNewData) plotted on the same figure, in a different color and with a bold line. Here is my sample code, which produces a graph (attached) where the data for each Year on a particular Day are stacked. I don't want a bar graph; I would like a line graph. Thanks.

library(tidyverse)
library(tidyr)

myDates=as.data.frame(seq(as.Date("2000-01-01"), to=as.Date("2010-12-31"),by="days"))
colnames(myDates) = "Date"
Dates = myDates %>% separate(Date, sep = "-", into = c("Year", "Month", "Day"))

LatestDate=as.data.frame(seq(as.Date("2011-01-01"), to=as.Date("2011-03-31"),by="days"))
colnames(LatestDate) = "Date"
NewDate = LatestDate %>% separate(Date, sep = "-", into = c("Year", "Month", "Day"))

PCPDataHis = data.frame(total_precip = runif(4018, 0,70), Dates)
PCPNewData = data.frame(total_precip = runif(90, 0,70), NewDate)

PCPDataHisPlot =PCPDataHis %>% group_by(Year) %>% gather(key = "Variable", value = "Value", -Year, -Day,-Month)

ggplot(PCPDataHisPlot, aes(Day, Value, colour = Year))+
  geom_line()+
  geom_line(data = PCPNewData, aes(Day, total_precip))


I would like a figure in which each line represents the data for a particular year.

UPDATE: I drew my desired figure by hand (see attached). I would like all the days of the year on the x-axis, with the data on the y-axis.

Upvotes: 2

Views: 4657

Answers (1)

dc37

Reputation: 16178

You have a few errors in your code.

First, your days are stored as character strings. You need to convert them to numeric values so that the lines are drawn continuously.
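As a minimal sketch of this point (the `toy` data frame is hypothetical and only for illustration): `tidyr::separate()` returns character columns by default, and its `convert = TRUE` argument asks it to run type conversion on the new columns, which is one alternative to calling `as.numeric()` inside `aes()`.

```r
library(tidyr)

# Hypothetical toy data, for illustration only
toy <- data.frame(Date = c("2000-01-01", "2000-01-02", "2000-01-10"),
                  stringsAsFactors = FALSE)

# Default behaviour: the new columns are character strings
chr <- separate(toy, Date, into = c("Year", "Month", "Day"), sep = "-")
class(chr$Day)

# convert = TRUE runs type conversion on the new columns,
# so Day can be used directly as a numeric axis
num <- separate(toy, Date, into = c("Year", "Month", "Day"),
                sep = "-", convert = TRUE)
class(num$Day)
```

Note that with `convert = TRUE` the Year column also becomes numeric, so it may need wrapping in `factor()` if a discrete colour per year is wanted.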

Then, you have several values for each day (the same day number occurs in each of the 12 months of a year), so you need to summarise the data first:

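# Average the values that share the same year and day of the month
Pel2 <- Pelly2Data %>% group_by(year, day) %>% summarise(Value = mean(Value, na.rm = TRUE))
Pel3 <- Pelly2_2011_3months %>% group_by(year, day) %>% summarise(total_precip = mean(total_precip, na.rm = TRUE))

# Convert day to numeric so that each year is drawn as a continuous line;
# the summarised 2011 data (Pel3) is drawn on top with a thicker line
ggplot(Pel2, aes(as.numeric(day), Value, color = year))+
  geom_line()+
  geom_line(data = Pel3, aes(as.numeric(day), y = total_precip), size = 2)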


It looks better, but it is hard to apply a specific color pattern.

In my opinion, it will be less confusing to compare the mean of each dataset, for example:

library(tidyverse)
Pel2 <- Pelly2Data %>%
  group_by(day) %>%
  summarise(Mean = mean(Value, na.rm = TRUE),
            SEM = sd(Value, na.rm = TRUE) / sqrt(n())) %>%
  mutate(Name = "Pel_ALL")
Pel3 <- Pelly2_2011_3months %>%
  group_by(day) %>%
  summarise(Mean = mean(total_precip, na.rm = TRUE),
            SEM = sd(total_precip, na.rm = TRUE) / sqrt(n())) %>%
  mutate(Name = "Pel3")

Pel <- bind_rows(Pel2,Pel3)

ggplot(Pel, aes(x = as.numeric(day), y = Mean, color = Name))+
    geom_ribbon(aes(ymin = Mean-SEM, ymax = Mean+SEM), alpha = 0.2)+
    geom_line(size = 2)
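One detail of the SEM calculation is worth checking against your data: `n()` counts every row in the group, including rows where the value is NA, whereas `sd(..., na.rm = TRUE)` drops them. A small base R sketch, using a made-up vector `x`, shows the difference. If the data contain no missing values, both versions agree.

```r
# Made-up values with one missing observation (illustration only)
x <- c(2, 4, NA, 6)

# Dividing by the full length, which is what n() would report in summarise()
sem_all_rows <- sd(x, na.rm = TRUE) / sqrt(length(x))

# Dividing by the number of non-missing values actually used by sd()
sem_valid <- sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))

sem_all_rows
sem_valid
```

Inside `summarise()`, the equivalent denominator would be `sqrt(sum(!is.na(Value)))`.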



EDIT: New graph based on update

To get the graph you posted as a drawing, you need the day of the year, not the day of the month. We can get this information by building a date sequence and extracting the day of the year with the yday function from the `lubridate` package.

library(tidyverse)
library(lubridate)
Pelly2$Date = seq(ymd("1990-01-01"),ymd("2010-12-31"), by = "day")
Pelly2$Year_day <- yday(Pelly2$Date)

Pelly2_2011_3months$Date <- seq(ymd("2011-01-01"), ymd("2011-03-31"), by = "day")
Pelly2_2011_3months$Year_day <- yday(Pelly2_2011_3months$Date)

Pelly2$Dataset = "ALL"
Pelly2_2011_3months$Dataset = "2011_Dataset"

Pel <- bind_rows(Pelly2, Pelly2_2011_3months)

Then, you can combine both datasets and represent them with different colors, sizes and transparency (alpha), as shown here:

ggplot(Pel, aes(x = Year_day, y = total_precip, color = year, size = Dataset, alpha = Dataset))+
  geom_line()+
  scale_size_manual(values = c(2,0.5))+
  scale_alpha_manual(values = c(1,0.5))
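For reference, here is a self-contained sketch of the same idea using the object names from the question (PCPDataHis, PCPNewData) rather than the Pelly2 names above. It is an illustration under assumptions, not the exact code behind the figure: the column names Year and Year_day are chosen here, the data are random, and the `linewidth` argument assumes ggplot2 3.4.0 or later (use `size` on older versions).

```r
library(dplyr)
library(ggplot2)
library(lubridate)

set.seed(1)  # reproducible random example data

# Historical record, 2000 to 2010, one random value per day
PCPDataHis <- tibble(
  Date = seq(as.Date("2000-01-01"), as.Date("2010-12-31"), by = "day")
) %>%
  mutate(total_precip = runif(n(), 0, 70),
         Year = factor(year(Date)),   # discrete, so each year gets its own line
         Year_day = yday(Date))       # day of the year, 1 to 365 or 366

# New record covering only the first three months of 2011
PCPNewData <- tibble(
  Date = seq(as.Date("2011-01-01"), as.Date("2011-03-31"), by = "day")
) %>%
  mutate(total_precip = runif(n(), 0, 70),
         Year = factor(year(Date)),
         Year_day = yday(Date))

# One thin line per historical year, plus a thick black line for 2011
p <- ggplot(PCPDataHis,
            aes(Year_day, total_precip, group = Year, colour = Year)) +
  geom_line(alpha = 0.5) +
  geom_line(data = PCPNewData, colour = "black", linewidth = 1.5) +
  labs(x = "Day of year", y = "Total precipitation")

p
```

Keeping the two data frames separate and giving the 2011 layer a fixed colour avoids the manual size and alpha scales, at the cost of 2011 not appearing in the legend.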


Does that answer your question?

Upvotes: 3
