Reputation: 27
I need to plot a time series graph but the data that I'm using is proving to be quite challenging.
Ideally, I'd like a graph that looks something like this:
But mine looks like this:
I have tried a series of different things but none of them have worked.
The dataset can be found here and I'll attach a picture of what the dataset itself looks like:
Some code I have tried includes:
ggplot( aes(x=date, y=northEast)) +
  geom_area(fill="#69b3a2", alpha=0.5) +
  geom_line(color="#69b3a2") +
  ylab("test") +
  theme_ipsum()

ggplot(covidData2) +
  geom_line(
    mapping = aes(x = weekBeginning, y=northEast, group=northEast)
  )
Any help would be greatly appreciated!
Upvotes: 1
Views: 226
Reputation: 174586
You need to tidy your data before plotting it. If you look at your data frame, all of the "numeric" columns have been read in as character vectors, because the column names are nested and therefore occupy the first couple of rows. You need to consolidate these rows and turn them into column names, then convert the numeric columns to numbers. Finally, you need to parse the dates; otherwise ggplot will simply treat the periods as character strings:
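library(readxl)      # read_xlsx()
library(lubridate)   # dmy()
library(ggplot2)
library(hrbrthemes)  # theme_ipsum()

# Read the sheet; the nested header ends up in the first two data rows
wb <- read_xlsx(path.expand("~/covid.xlsx"), sheet = "Table 9")
df <- as.data.frame(wb)

# Fill the top header row rightwards so each sub-column carries its group name
df[1, 1] <- ""
for (i in 2:length(df)) {
  if (is.na(df[1, i])) df[1, i] <- df[1, i - 1]
}

# Combine the two header rows into column names, then drop them from the data
nms <- trimws(paste(df[1, ], df[2, ]))
df <- df[-c(1:2), ]
names(df) <- nms

# Drop all-NA columns and convert everything except the first column to numeric
df <- df[sapply(df, function(x) !all(is.na(x)))]
df[-1] <- lapply(df[-1], as.numeric)

# Drop the last three rows, then parse the first 10 characters of Period as a date
df <- head(df, -3)
df$Period <- dmy(substr(df$Period, 1, 10))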
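If you want to see what the header consolidation does without the spreadsheet, here is a minimal base R sketch on a toy data frame. The column layout, the names (`raw`, `toy`, `top`, `sub`) and all values are made up for illustration; they are assumptions, not the contents of the real sheet.

```r
# Toy stand-in for the imported sheet: the real header is split over rows 1-2.
# All names and values here are made up for illustration.
raw <- data.frame(
  a = c(NA, "Period", "wk1", "wk2"),
  b = c("North East", "Cases", "120", "95"),
  c = c(NA, "Rate", "4.5", "3.6"),
  stringsAsFactors = FALSE
)

# Pull the two header rows out as plain character vectors
top <- unlist(raw[1, ], use.names = FALSE)
sub <- unlist(raw[2, ], use.names = FALSE)

# Fill the group name rightwards across its sub-columns
top[1] <- ""
for (i in seq_along(top)[-1]) {
  if (is.na(top[i])) top[i] <- top[i - 1]
}

# Build the combined names, drop the header rows, convert values to numeric
toy <- raw[-(1:2), ]
names(toy) <- trimws(paste(top, sub))
toy[-1] <- lapply(toy[-1], as.numeric)

names(toy)          # combined column names
sapply(toy, class)  # first column stays character, the rest are numeric
```

Running `sapply(df, class)` on the real data frame in the same way is a quick check that the conversion worked before plotting.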
Now we can plot:
ggplot(df, aes(x = Period, y = `North East Rate`)) +
  geom_area(fill = "#69b3a2", alpha = 0.5) +
  geom_line(color = "#69b3a2") +
  ylab("Rate per 100,000") +
  xlab("") +
  theme_ipsum()
Created on 2022-03-08 by the reprex package (v2.0.1)
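As a footnote on why the date parsing step matters, here is a small sketch using made-up period labels. It assumes the periods look like "dd/mm/yyyy to dd/mm/yyyy", which is a guess based on taking the first 10 characters; the real sheet may differ. It uses base `as.Date()` as a stand-in for `dmy()` so it runs without extra packages.

```r
# Made-up period labels in the assumed "dd/mm/yyyy to dd/mm/yyyy" layout
periods <- c("22/02/2021 to 28/02/2021",
             "01/03/2021 to 07/03/2021",
             "08/03/2021 to 14/03/2021")

# As text, the labels sort alphabetically, so "01/03/..." lands before "22/02/..."
text_order <- sort(periods)
text_order

# Keep the first 10 characters (the start date) and parse them as real dates
starts <- as.Date(substr(periods, 1, 10), format = "%d/%m/%Y")
date_order <- sort(starts)
date_order
```

ggplot treats a character x variable as discrete and orders it alphabetically, which is why the weeks end up out of sequence unless the column is a proper `Date`.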
Upvotes: 3