In the time series data created below (data), individuals (each denoted by a unique ID) were sampled from 2 populations (NC and SC). All individuals have the same number of observations. I want to average the data at each "time point" across all individuals that belong to the same "State" (the average line), and then plot the average lines for the two states against each other on the same plot. The example data:
library(tidyverse)
set.seed(123)
ID <- rep(1:10, each = 500)
Time <- rep(1:500, times = 10)
Location <- rep(c("NC", "SC"), each = 2500)
Var <- rnorm(5000)
data <- data.frame(
  ID = factor(ID),
  Time = Time,
  State = Location,
  Variable = Var
)
I would recommend getting familiar with the various dplyr functions, specifically group_by and summarise. You may want to read through Introduction to dplyr or go through this series of blog posts.
In short, we group the data by the Time and State variables and then summarise each group with an average (i.e., mean(Variable)). To plot the data, we put Time on the x-axis, the newly created avg_var on the y-axis, and use State to represent color. These are assigned as the chart's aesthetics (i.e., aes(...)). Finally, we add the line geom with geom_line() to render the lines on the visualization.
data %>%
  group_by(Time, State) %>%
  summarise(avg_var = mean(Variable)) %>%
  ggplot(aes(x = Time, y = avg_var, color = State)) +
  geom_line()
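If you want to inspect the averaged data before plotting, the same pipeline can be split into two steps. This is only a sketch: it rebuilds the example data from the question so it runs on its own, the names state_means and n_ids are ones I made up for illustration, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(ggplot2)

# Recreate the example data from the question (same seed and layout).
set.seed(123)
data <- data.frame(
  ID = factor(rep(1:10, each = 500)),
  Time = rep(1:500, times = 10),
  State = rep(c("NC", "SC"), each = 2500),
  Variable = rnorm(5000)
)

# One row per Time/State pair.
# n_ids (hypothetical name) counts how many individuals went into each mean.
state_means <- data %>%
  group_by(Time, State) %>%
  summarise(
    avg_var = mean(Variable),
    n_ids = n(),
    .groups = "drop"  # return an ungrouped data frame
  )

# Look at the first few rows of the summary.
head(state_means)

# Plot one average line per state.
p <- ggplot(state_means, aes(x = Time, y = avg_var, color = State)) +
  geom_line() +
  labs(y = "Mean of Variable")

print(p)
```

Keeping the summary in its own object makes it easy to check that every time point averages the expected number of individuals (n_ids) before trusting the lines on the plot.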