Reputation: 513
I am trying to make a plot where each level of a factor gets its own series. Although I am a long-time R user, I am not up to date with some of the latest improvements. For example, I have not yet learned ggplot, which figures in some related questions, and I cannot yet translate what I want to do into ggplot. Here is a simple example:
#library(tidyverse) # uncomment if not loaded
in_data <- read_csv("http://www.nfgarland.ca/National_Custom_Data.csv")
in_data <- in_data %>%
  # inside mutate() the columns can be referenced directly
  mutate(Tot = `NUM INFLUENZA DEATHS` + `NUM PNEUMONIA DEATHS`) %>%
  arrange(SEASON) %>%
  mutate(SEASON = factor(SEASON, ordered = TRUE))
# first season sets up the plot; the others are added with lines()
filter(in_data, SEASON == "2015-16")$Tot %>%
  plot(1:length(.), .,
       type = "l",
       col = "red",
       xlab = "Flu Season Week",
       ylab = "Deaths",
       ylim = c(2000, 7500))
filter(in_data, SEASON == "2016-17")$Tot %>% lines(1:length(.), ., col = "orange")
filter(in_data, SEASON == "2017-18")$Tot %>% lines(1:length(.), ., col = "blue")
filter(in_data, SEASON == "2018-19")$Tot %>% lines(1:length(.), ., col = "green")
filter(in_data, SEASON == "2019-20")$Tot %>% lines(1:length(.), ., col = "black")
As you can see, I have learned a number of tidyverse concepts, and this code works fine. But I assume there ought to be a way to do this automatically in the tidyverse without writing each lines() call separately, and I cannot find it. I do know how to handle palettes, so the color changes are no problem. Note also that while there are 52 weeks of data for previous seasons, this file contains only 24 weeks for the current flu season.
Upvotes: 2
Views: 262
Reputation: 46908
You need to use a for loop, and, unlike with ggplot2, you have to specify the legend yourself. Below is a suggestion in base R (the good old days):
library(readr)
library(dplyr)
# one color per season, named by factor level so it can be looked up by name
COLS = c("red","goldenrod","blue","orange","green")
names(COLS) = levels(in_data$SEASON)
# empty plot with axis ranges taken from the data
plot(NULL, xlim = range(in_data$WEEK), ylim = range(in_data$Tot),
     xlab = "time", ylab = "Tot")
# add one line per season
for (nu in levels(in_data$SEASON)) {
  lines(1:sum(in_data$SEASON == nu),
        in_data$Tot[in_data$SEASON == nu],
        col = COLS[nu])
}
legend("topright", legend = names(COLS), fill = COLS)
If you need the x-axis to show the actual weeks (as you mentioned in the comment, a season runs from week 40+ into the next year), it might take a bit more code (and maybe pain).
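One possible way to handle the week numbering, as a rough sketch only: convert the calendar week into a week-of-season index before plotting. The helper `season_week()` is hypothetical (it is not part of any package), and it assumes every season starts at calendar week 40 and every year has 52 weeks, so a 53-week year would need extra handling. The `toy` data frame uses made-up values and just stands in for `in_data`.

```r
# Hypothetical helper: calendar week -> week of flu season.
# Assumes the season starts at week 40 and years have 52 weeks.
season_week <- function(week, start = 40, weeks_per_year = 52) {
  ifelse(week >= start,
         week - start + 1,                   # weeks 40..52 -> 1..13
         week + weeks_per_year - start + 1)  # weeks 1..39  -> 14..52
}

# Toy data standing in for in_data (values are made up)
toy <- data.frame(
  SEASON = rep(c("2018-19", "2019-20"), times = c(6, 4)),
  WEEK   = c(50, 51, 52, 1, 2, 3, 50, 51, 52, 1),
  Tot    = c(3000, 3200, 3500, 4000, 4200, 4100, 2900, 3100, 3300, 3600)
)
toy$SeasonWeek <- season_week(toy$WEEK)

# Same loop as above, but with the season week on the x-axis
seasons <- unique(toy$SEASON)
cols <- setNames(c("blue", "black"), seasons)
plot(NULL, xlim = range(toy$SeasonWeek), ylim = range(toy$Tot),
     xlab = "Flu Season Week", ylab = "Deaths")
for (s in seasons) {
  rows <- toy$SEASON == s
  lines(toy$SeasonWeek[rows], toy$Tot[rows], col = cols[s])
}
legend("topleft", legend = names(cols), col = cols, lty = 1)
```

With this index, a partial season simply stops at its last available week instead of needing a separate counter per season.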
Upvotes: 0
Reputation: 24790
How about this?
library(ggplot2)
ggplot(in_data, aes(x = WEEK, y = Tot, color = SEASON)) +
  geom_line() +
  labs(x = "Flu Season Week", y = "Deaths") +
  ylim(2000, 7500) +
  scale_color_manual(values = c("red","goldenrod","blue","orange","green"))
Edit: To address OP's comment about wanting the 2019-20 line to break where there is no data yet, we can use a quick pivot to fill in the missing weeks with NA.
in_data %>%
  dplyr::select(SEASON, Tot, WEEK) %>%
  # widening creates one column per season, with NA where a week is missing
  tidyr::pivot_wider(names_from = SEASON, values_from = Tot) %>%
  # lengthening again keeps those NA rows
  tidyr::pivot_longer(cols = -WEEK, names_to = "SEASON", values_to = "Tot") %>%
  ggplot(aes(x = WEEK, y = Tot, color = SEASON)) +
  geom_line() +
  labs(x = "Flu Season Week", y = "Deaths") +
  ylim(2000, 7500) +
  scale_color_manual(values = c("red","goldenrod","blue","orange","green"))
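To see why the pivot round trip helps, here is a small self-contained sketch on made-up data (the `toy` data frame and its values are illustrative, not taken from the CSV). Season "B" has no row for week 2; after widening and lengthening again, that week shows up as an explicit NA row, which is where geom_line() breaks the line.

```r
library(tidyr)

# Toy data (made-up values): season "B" has no row for week 2
toy <- data.frame(
  SEASON = c("A", "A", "A", "B", "B"),
  WEEK   = c(1, 2, 3, 1, 3),
  Tot    = c(10, 20, 30, 15, 25)
)

# Wide: one column per season, NA where a season lacks that week
wide <- pivot_wider(toy, names_from = SEASON, values_from = Tot)

# Long again: the NA cell becomes an explicit row
long <- pivot_longer(wide, cols = -WEEK, names_to = "SEASON", values_to = "Tot")

print(long)
```

Without the round trip, the two "B" points would be joined by a straight segment across the missing week.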
Upvotes: 3