Reputation: 25
I have a dataset with dates and averages that looks like this:
| Dates | AVG |
|---|---|
| 2019-04-01 | 29.2 |
| 2019-08-01 | 29.5 |
| 2020-12-01 | 15.6 |
| 2019-02-01 | 28.7 |
| 2020-01-01 | 16.3 |
| 2019-07-01 | 29.6 |
The Dates column was stored as character, so I converted it with:
library(lubridate)
my_data$Dates <- ymd(my_data$Dates)
Now the Dates column is in date format and AVG is numeric. I want to make a scatter plot that connects the AVG points of each year separately, so that there is one line for 2019 and a different line for 2020 in the same plot.
I saw online that I can do something like that with ggseasonplot from the forecast library, but this function expects xts data, and in my case only the first column is in date format. I tried to convert the entire dataset to an xts object using:
library(xts)
xts_data <- as.xts(my_data)
but I get the following error, apparently because my data are not in POSIXlt format:
Error in as.POSIXlt.character(x, tz, ...) :
character string is not in a standard unambiguous format
I don't know how to get my data into the correct format for ggseasonplot. Any help with that would be appreciated. Suggestions for another way to make this plot (without ggseasonplot at all) would also be very welcome.
Thanks in advance!
Upvotes: 0
Views: 661
Reputation: 586
You might be interested in ggplot2, which can produce nice plots quickly once you get used to its syntax and logic.
To reproduce the output of ggseasonplot with ggplot2, you can use functions from lubridate: month, to get your x coordinates, and year, to group your observations per year.
library(ggplot2)
library(lubridate)  # provides month() and year()
ggplot(data = my_data,
       mapping = aes(x = month(Dates, label = TRUE),
                     y = AVG)) +
  geom_point() +
  geom_line(aes(group = factor(year(Dates)),
                color = factor(year(Dates)))) +
  scale_color_discrete(name = "Year") +
  xlab(label = "Month")
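If it helps to see what the two lubridate helpers return, here is a minimal sketch that computes them as columns before plotting. The column names Month and Year are my own choice for illustration and are not in the original data; the sketch assumes Dates has already been converted with ymd().

```r
library(lubridate)

# Same values as the question's data, with Dates already parsed as Date
my_data <- data.frame(
  Dates = ymd(c("2019-04-01", "2019-08-01", "2020-12-01",
                "2019-02-01", "2020-01-01", "2019-07-01")),
  AVG   = c(29.2, 29.5, 15.6, 28.7, 16.3, 29.6)
)

# month(label = TRUE) gives an ordered factor with one level per calendar
# month (label text depends on the locale); as.integer() recovers the number
my_data$Month <- month(my_data$Dates, label = TRUE)

# year() gives a number; wrapping it in factor() makes it a discrete group
my_data$Year <- factor(year(my_data$Dates))

str(my_data)
```

With these columns in place, aes(x = Month, y = AVG) and aes(group = Year, color = Year) should be equivalent to the inline calls used in the plot.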
If you want to show the months without data:
ggplot(data = my_data,
       mapping = aes(x = month(Dates, label = TRUE),
                     y = AVG)) +
  geom_blank(data = data.frame(),
             aes(x = month(seq(ymd('2020-01-01'),
                               ymd('2020-12-01'),
                               by = '1 month'),
                           label = TRUE),
                 y = NULL)) +
  geom_point() +
  geom_line(aes(group = factor(year(Dates)),
                color = factor(year(Dates)))) +
  scale_color_discrete(name = "Year") +
  xlab(label = "Month")
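A possibly shorter alternative for showing the empty months: since month(..., label = TRUE) returns a factor with a level for every month, telling the discrete x scale not to drop unused levels should have the same effect. This is a sketch under that assumption, not a tested replacement for the geom_blank version; the object name p is only for illustration.

```r
library(ggplot2)
library(lubridate)

# Same values as the question's data, with Dates already parsed as Date
my_data <- data.frame(
  Dates = ymd(c("2019-04-01", "2019-08-01", "2020-12-01",
                "2019-02-01", "2020-01-01", "2019-07-01")),
  AVG   = c(29.2, 29.5, 15.6, 28.7, 16.3, 29.6)
)

p <- ggplot(my_data, aes(x = month(Dates, label = TRUE), y = AVG)) +
  geom_point() +
  geom_line(aes(group = factor(year(Dates)),
                color = factor(year(Dates)))) +
  # drop = FALSE keeps factor levels (months) that have no observations
  scale_x_discrete(drop = FALSE) +
  scale_color_discrete(name = "Year") +
  xlab("Month")

print(p)
```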
Raw data
my_data <- read.table(header = TRUE,
text="Dates AVG
2019-04-01 29.2
2019-08-01 29.5
2020-12-01 15.6
2019-02-01 28.7
2020-01-01 16.3
2019-07-01 29.6")
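Regarding the as.xts error in the question: my guess (an assumption, since the behaviour depends on the xts version) is that the conversion tried to parse something other than the Dates column as the time index. A minimal sketch that sidesteps this is to pass the values and the dates separately to the xts() constructor. Note that this only builds the xts object; I have not checked it with ggseasonplot, which as far as I know is documented for regular ts objects, so the ggplot2 approach is probably simpler for this data.

```r
library(xts)  # also attaches zoo, which provides index()

# Same values as the question's data, with Dates already parsed as Date
my_data <- data.frame(
  Dates = as.Date(c("2019-04-01", "2019-08-01", "2020-12-01",
                    "2019-02-01", "2020-01-01", "2019-07-01")),
  AVG   = c(29.2, 29.5, 15.6, 28.7, 16.3, 29.6)
)

# Give xts the numeric values and the time index explicitly,
# instead of handing it the whole data frame
xts_data <- xts(my_data$AVG, order.by = my_data$Dates)
colnames(xts_data) <- "AVG"

# xts sorts the rows by the time index
print(xts_data)
```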
Upvotes: 1