Reputation: 159
I have several datasets of hourly readings, each spanning more than one year (Mar 2018 to June 2019). I want to isolate exactly one year of data (e.g. 1-Mar-2018 00:00:00 to 28-Feb-2019 23:00:00) and plot it in polar coordinates. The problem is that I want the polar graph to have January (regardless of year) at the top, not March 2018. One caveat: I would rather not calculate by hand how many radians to offset the start of the graph, because I have to do this for several datasets that do not all start on the same date (I have been searching for weeks for how to do this). If offsetting by radians is the only way, then so be it, but perhaps someone has a better idea.
Here is an example of my dataset:
library(lubridate)
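# Request the difference in hours explicitly instead of relying on the automatic unit (days) and multiplying by 24
NoOfHours <- as.numeric(difftime(ymd_hms("2019-6-1 17:00:00"), ymd_hms("2018-3-01 8:00:00"), units = "hours"))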
data <- as.data.frame(ymd_hms("2018-3-01 8:00:00") + hours(0:NoOfHours))
colnames(data) <- 'date'
set.seed(10)
data$level <- runif(nrow(data), min = 0, max = 150)
The levels range from 0-150. Some of the other datasets go over 200.
Additionally, I want to apply a gradient background colour, that goes from green to red as it ascends from 0 to 200. It will serve to denote high values. Here is an example of what I mean, although it is not a polar plot and it is at an angle (I could not find a good representative image): https://www.google.com/url?sa=i&source=images&cd=&ved=2ahUKEwi9oObnr9bkAhU5HTQIHdS6BdUQjRx6BAgBEAQ&url=https%3A%2F%2Fhelp.principaltoolbox.com%2FEN%2Fscatter_plot.html&psig=AOvVaw2E6Uanev3RNOW2rIbsTISa&ust=1568758598336572
Finally, if possible, I want to have a hole in the centre of the plot, similar to a donut plot so that the lowest values are more readable over a year. Currently I can get started with this problem, but the details are plaguing me. Any help would be appreciated.
I can isolate a year's worth of data and plot it in polar coordinates. I am using this code:
Hours <- format(as.POSIXct(strptime(data$date,"%Y-%m-%d %H:%M:%S",tz="")) ,format = "%H:%M:%S")
data$hours <- Hours
Date <- format(as.POSIXct(strptime(data$date,"%Y-%m-%d %H:%M:%S",tz="")) ,format = "%Y-%m-%d")
data$date_date <- Date #output
library(openair)
yeardata <- selectByDate(data, start = "2018-3-1", end = "2019-2-28", year = 2018:2019)
library(ggplot2)
plot <- ggplot(yeardata, aes(x = date, y = level)) +
  geom_line() +
  scale_colour_hue(l = 50) + # Use a slightly darker palette than normal
  geom_smooth(method = lm,   # Add linear regression lines
              se = FALSE)
plot
plot + coord_polar() + theme_minimal()
This produces the following graph: [image: One year plot]
This is close to what I want but, as mentioned above, I need January at the top of the graph, and possibly a line to mark where one year ends and the next begins.
Thank you
Upvotes: 1
Views: 491
Reputation: 159
After integrating what Jon Spring suggested (thanks again!) and doing some more searching, I managed to get almost all the way to what I wanted. Here is the updated code:
library(ggplot2)
plot <- ggplot(yeardata, aes(x = date, y = level, color = level)) +
  geom_hline(yintercept = seq(0, 300, by = 50), colour = "black", size = 0.75, alpha = 0.3) + # draw my own gridlines so that, on a white background, they won't cross the text
  scale_color_gradient(limits = c(0, 200), low = "green", high = "red", oob = scales::squish) + # oob = scales::squish is needed so that values over 200 are drawn red
  geom_jitter(alpha = 0.2, size = 2) + # semi-transparent jittered points
  theme(axis.title = element_text(size = 16, face = "bold"), axis.text.x = element_text(size = 16), axis.text.y = element_text(size = 12)) +
  labs(x = NULL, y = bquote('Levels '~(m^2)), color = "Level") + # bquote allows superscripts
  scale_y_continuous(breaks = seq(0, 300, 50),
                     limits = c(-100, 310)) # the negative lower limit leaves a hole in the centre
plot
plot + coord_polar(start = ((2*60/365)*pi)) + # start takes the offset in radians. March 1 is day 60 of a non-leap year, so the offset is roughly 60/365 of a full turn.
  theme(legend.title = element_text(color = "black", size = 14, face = "bold"), panel.background = element_rect(fill = "white"), panel.grid = element_blank())
Here is the resulting graph:
Upvotes: 0
Reputation: 66775
Here's an approach which circumvents the coord_polar rotation issue by re-expressing each date as a day in calendar year 2019, so that the data will always start with Jan 1, and that can be the top of the chart. (Otherwise you'd have to adjust each set of data to express how many days into the year the first data is, then multiply that by 2*pi/365 to set your start angle.)
library(dplyr); library(lubridate)
data_1yr <- data %>%
  mutate(date19 = ymd(paste(2019, month(date), day(date)))) %>%
  mutate(day_num = 1 + (date - min(date))/ddays(1)) %>%
  filter(day_num <= 365)
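For comparison, the start-angle alternative mentioned in the parentheses above can be computed from the data rather than hard-coded. This is only a minimal sketch: `start_angle()` is a hypothetical helper name, not part of ggplot2 or lubridate, and it assumes a 365-day year and a date-time vector like the `date` column in the question.

```r
library(lubridate)

# Hypothetical helper: angle in radians by which to rotate coord_polar()
# so that January 1 sits at 12 o'clock, given the timestamps of one series.
# Assumes a 365-day year. yday() returns 1 for January 1, so subtracting 1
# gives the number of whole days elapsed since the start of the year.
start_angle <- function(dates) {
  first <- min(dates)
  2 * pi * (yday(first) - 1) / 365
}

# Example with the question's first timestamp (March 1 is day 60 of 2018)
start_angle(ymd_hms("2018-03-01 08:00:00"))
```

It could then be passed as `coord_polar(start = start_angle(yeardata$date))`. The alignment is only approximate, and it only holds when the plotted x range covers one full year beginning at that first timestamp.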
The background shading will plot very slowly if you want to show thousands of separate shaded regions. To get around this, you might want to take a daily average and use that to drive the shading:
data_1yr_daily <- data_1yr %>%
  group_by(date19) %>%
  summarize(level = mean(level))
Then we can plot these two, with the daily averages driving two geom_col layers, one in the positive and one in the negative direction. (I had some trouble with geom_tile and geom_rect in this context, but those might be better fits for this.) The fill gradient is as you described, and I used ylim to specify a wider range than the data, and to make the pie into a donut.
ggplot(data = data_1yr, aes(x = date19, y = level)) +
  geom_col(data = data_1yr_daily, aes(fill = level, y = Inf), width = 1) +
  geom_col(data = data_1yr_daily, aes(fill = level, y = -50), width = 1) +
  geom_line() +
  scale_fill_gradient(low = "green", high = "red") +
  geom_smooth(method = lm, # Add linear regression lines
              se = FALSE) +
  coord_polar() +
  ylim(c(-150, 200)) +
  theme_minimal()
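The question also asked for a line to mark the year separation. Once the dates are re-expressed on a 2019 calendar, the seam between the two original years falls on the day the series starts (March 1 here). One possible way to mark it is sketched below; `toy` and `seam` are illustrative names, and the toy data frame only stands in for data_1yr so that the block runs on its own.

```r
library(ggplot2)
library(lubridate)

# Toy stand-in for data_1yr: one year of hourly readings starting March 1, 2018
set.seed(10)
toy <- data.frame(date = ymd_hms("2018-03-01 08:00:00") + hours(0:(24 * 364)))
toy$level <- runif(nrow(toy), min = 0, max = 150)

# Re-express every timestamp as a day in calendar year 2019, as in the answer
toy$date19 <- ymd(paste(2019, month(toy$date), day(toy$date)))

# The seam is the first day of the original series, on the 2019 calendar
seam <- ymd(paste(2019, month(min(toy$date)), day(min(toy$date))))

p <- ggplot(toy, aes(x = date19, y = level)) +
  geom_line() +
  # In polar coordinates a vertical line becomes a radial line;
  # as.numeric() is used defensively for the Date intercept
  geom_vline(xintercept = as.numeric(seam), linetype = "dashed") +
  coord_polar() +
  ylim(c(-150, 200)) +
  theme_minimal()
p
```

Because `seam` is derived from the first timestamp, the same code should place the marker correctly for datasets that start on other dates, as long as the data contain no February 29.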
Upvotes: 2