MKas

Reputation: 25

Season plot in R

I have a dataset with dates and averages that looks like this:

Dates AVG
2019-04-01 29.2
2019-08-01 29.5
2020-12-01 15.6
2019-02-01 28.7
2020-01-01 16.3
2019-07-01 29.6

The Dates column was stored as character, so I converted it with:

library(lubridate)
my_data$Dates <- ymd(my_data$Dates)

Now the Dates column is in date format and AVG is numeric. I want a scatter plot that connects the AVG points of each year separately: one line for 2019 and a different line for 2020, both in the same plot.

I saw online that I can do something like this with ggseasonplot from the forecast library, but that function expects xts data, and in my case only the first column is in date format. I tried to convert the entire dataset to an xts object using:

library(xts)
xts_data <- as.xts(my_data)

but I get the following error, because my data are not in POSIXlt format:

Error in as.POSIXlt.character(x, tz, ...) : 
  character string is not in a standard unambiguous format

I don't know how to get my data into the correct format for ggseasonplot. Any help with that would be appreciated. Suggestions for another way to make this plot, without ggseasonplot at all, would be very much appreciated as well.

Thanks in advance!

Upvotes: 0

Views: 661

Answers (1)

marcguery

Reputation: 586

You might be interested in ggplot2, which can produce nice plots quite quickly once you get used to its syntax and logic.

To get the output of ggseasonplot with ggplot2, you can use functions from lubridate: month to get your x coordinates, and year to group your observations per year.

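library(ggplot2)
library(lubridate)  # provides month() and year()

# Dates is assumed to have been converted with ymd(), as in the question
ggplot(data = my_data,
       mapping = aes(x = month(Dates, label = TRUE),
                     y = AVG)) +
  geom_point() +
  # one line per year, each with its own color
  geom_line(aes(group = factor(year(Dates)),
                color = factor(year(Dates)))) +
  scale_color_discrete(name = "Year") +
  xlab(label = "Month")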

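The same grouping idea can also be written by adding the year and month to the data frame first, which keeps the plotting call shorter. This is only a sketch: the column names Year and Month are my own additions, not part of the original data, and here the points are colored by year as well as the lines.

```r
library(ggplot2)
library(lubridate)

my_data <- data.frame(
  Dates = ymd(c("2019-04-01", "2019-08-01", "2020-12-01",
                "2019-02-01", "2020-01-01", "2019-07-01")),
  AVG = c(29.2, 29.5, 15.6, 28.7, 16.3, 29.6)
)

# Hypothetical helper columns: compute the grouping and x-axis variables once
my_data$Year  <- factor(year(my_data$Dates))
my_data$Month <- month(my_data$Dates, label = TRUE)  # ordered factor, Jan to Dec

# group and color are set globally, so geom_point() and geom_line() share them
p <- ggplot(my_data, aes(x = Month, y = AVG, group = Year, color = Year)) +
  geom_point() +
  geom_line()
p
```

Because Month is an ordered factor, the line for each year is drawn in calendar order even though the rows of my_data are not sorted by date.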

If you want to show the months without data:

ggplot(data = my_data, 
             mapping = aes(x = month(Dates, label = TRUE), 
                           y = AVG))+
  geom_blank(data=data.frame(),
             aes(x = month(seq(ymd('2020-01-01'),
                           ymd('2020-12-01'), 
                           by = '1 month'),
                       label = TRUE),
                 y = NULL))+
  geom_point()+
  geom_line(aes(group = factor(year(Dates)),
                color = factor(year(Dates))))+
  scale_color_discrete(name = "Year")+
  xlab(label = "Month")

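A possibly shorter way to keep the empty months, as far as I can tell, relies on the fact that month(..., label = TRUE) returns a factor holding all twelve month levels. Telling the x scale not to drop unused levels should then be enough. Treat this as a sketch under that assumption rather than a tested replacement for the geom_blank approach.

```r
library(ggplot2)
library(lubridate)

my_data <- data.frame(
  Dates = ymd(c("2019-04-01", "2019-08-01", "2020-12-01",
                "2019-02-01", "2020-01-01", "2019-07-01")),
  AVG = c(29.2, 29.5, 15.6, 28.7, 16.3, 29.6)
)

p <- ggplot(data = my_data,
            mapping = aes(x = month(Dates, label = TRUE),
                          y = AVG)) +
  geom_point() +
  geom_line(aes(group = factor(year(Dates)),
                color = factor(year(Dates)))) +
  # drop = FALSE keeps factor levels (months) that have no observations
  scale_x_discrete(drop = FALSE) +
  scale_color_discrete(name = "Year") +
  xlab(label = "Month")
p
```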

Raw data

my_data <- read.table(header = TRUE,
             text="Dates    AVG
             2019-04-01     29.2
             2019-08-01     29.5
             2020-12-01     15.6
             2019-02-01     28.7
             2020-01-01     16.3
             2019-07-01     29.6")
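If you still want to try ggseasonplot: as far as I know it works on a regular ts object rather than on a data frame, so the missing months have to be filled in first. The sketch below only builds such a monthly series, with NA for months without data. The names all_months, full and avg_ts are made up for illustration, and I have not checked how ggseasonplot draws the gaps.

```r
library(lubridate)

my_data <- data.frame(
  Dates = ymd(c("2019-04-01", "2019-08-01", "2020-12-01",
                "2019-02-01", "2020-01-01", "2019-07-01")),
  AVG = c(29.2, 29.5, 15.6, 28.7, 16.3, 29.6)
)

# Regular monthly grid covering both years
all_months <- data.frame(
  Dates = seq(ymd("2019-01-01"), ymd("2020-12-01"), by = "1 month")
)

# Left join: months without an observation get AVG = NA
full <- merge(all_months, my_data, by = "Dates", all.x = TRUE)
full <- full[order(full$Dates), ]

# Monthly time series starting in January 2019
avg_ts <- ts(full$AVG, start = c(2019, 1), frequency = 12)
avg_ts
```

The resulting avg_ts could then be passed to forecast::ggseasonplot(avg_ts), but with this many gaps the ggplot2 version above is probably the more practical choice.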

Upvotes: 1
