Ryan

Reputation: 1068

Averaging time series groups and plotting them against one another

In the time series data created below, individuals (each denoted by a unique ID) were sampled from 2 populations (NC and SC). All individuals have the same number of observations. I want to average the data at each time point across all individuals that belong to the same State (the average line), and then plot the average lines for the two states against each other. I want it to look something like this:

enter image description here

library(tidyverse)
set.seed(123)
ID <- rep(1:10, each = 500)
Time <- rep(1:500, times = 10)
Location <- rep(c("NC", "SC"), each = 2500)
Var <- rnorm(5000)
data <- data.frame(
  ID = factor(ID),
  Time = Time,
  State = Location,
  Variable = Var
)

Upvotes: 1

Views: 38

Answers (1)

JasonAizkalns

Reputation: 20463

I would recommend getting familiar with the various dplyr functions, specifically group_by and summarise. You may want to read through Introduction to dplyr or go through this series of blog posts.

In short, we are grouping the data by the Time and State variables and then summarizing each group with an average (i.e., mean(Variable)). Because every individual has one observation per time point, each Time/State group holds one value per individual in that state, so the mean is the across-individual average at that time point. To plot the data, we put Time on our x-axis, the newly created avg_var on our y-axis, and use State to represent color. These are assigned as our chart's aesthetics (i.e., aes(...)). Finally, we add the line geom with geom_line() to render the lines on our visualization.

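# Average Variable across individuals for each Time/State combination,
# then draw one line per State
data %>% 
    group_by(Time, State) %>%
    summarise(avg_var = mean(Variable)) %>% 
    ggplot(aes(x = Time, y = avg_var, color = State)) +
    geom_line()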

enter image description here
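If you want to check the averaging step before plotting, a minimal sketch is to store the summary and inspect it. The names avg_data and n_ids are made up for this illustration, and the .groups argument assumes dplyr 1.0.0 or later. The example data from the question is rebuilt here so the snippet runs on its own.

```r
library(dplyr)

# Rebuild the question's example data so this snippet is self-contained
set.seed(123)
data <- data.frame(
  ID = factor(rep(1:10, each = 500)),
  Time = rep(1:500, times = 10),
  State = rep(c("NC", "SC"), each = 2500),
  Variable = rnorm(5000)
)

# avg_data and n_ids are hypothetical names used only for this check
avg_data <- data %>%
  group_by(Time, State) %>%
  summarise(
    avg_var = mean(Variable),  # average across individuals
    n_ids = n_distinct(ID),    # how many individuals went into each average
    .groups = "drop"           # assumes dplyr >= 1.0.0
  )

nrow(avg_data)          # 1000 rows: 500 time points x 2 states
unique(avg_data$n_ids)  # 5 individuals per state at every time point
```

If n_ids were not constant, some individuals would be missing observations at certain time points, and the averages would be based on unequal group sizes.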

Upvotes: 2
