Jeff Henderson

Reputation: 689

How to calculate correlation by group

I am trying to write a for loop that calculates a correlation for each level of a factor variable. My data set has 16 rows for each of 32 teams, and I want to correlate Year with Points_Game for each team individually. I can do this one team at a time, but I want to get better at looping.

correlate <- data %>%
  select(Team, Year, Points_Game) %>% 
  filter(Team == "ARI") %>% 
  select(Year, Points_Game)

cor(correlate)

I made an object "teams" by:

teams <- levels(data$Team)

A little help using [i] to iterate over all 32 teams and get each team's correlation of year and points would be greatly appreciated!

Upvotes: 6

Views: 12397

Answers (2)

DanY

Reputation: 6073

The data.table way:

library(data.table)

# dummy data (based on @Aleksandr's): 32 teams, one row per team per year
dat <- data.table(
  Team = rep(paste0("T", 1:32), each = 10),
  Year = rep(2000:2009, times = 32),
  Points_Game = rnorm(320, 100, 10)
)

# find correlation of Year and Points_Game for each Team
result <- dat[ , .(r = cor(Year, Points_Game)), by = Team]

Upvotes: 3

Aleksandr

Reputation: 1914

library(dplyr)

# dummy data: 32 teams, one row per team per year
data <- data.frame(
  Team = rep(paste0("T", 1:32), each = 10),
  Year = rep(2000:2009, times = 32),
  Points_Game = rnorm(320, 100, 10)
)

# find correlation of Year and Points_Game for each team
# r - correlation coefficient
correlate <- data %>%
  group_by(Team) %>%
  summarise(r = cor(Year, Points_Game))
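
Since the question asks specifically about iterating with [i], here is a minimal base R sketch of the explicit loop, for comparison with the grouped version. It assumes Team is a factor (since R 4.0, data.frame() no longer converts strings to factors, so levels() returns NULL on a character column). The dummy data, the seed, and the names r, one_team and r_sapply are illustrative only.

```r
# Dummy data: 32 hypothetical teams, 10 seasons each
set.seed(1)
data <- data.frame(
  Team = factor(rep(paste0("T", 1:32), each = 10)),
  Year = rep(2000:2009, times = 32),
  Points_Game = rnorm(320, 100, 10)
)

teams <- levels(data$Team)

# Preallocate a named numeric vector with one slot per team
r <- setNames(numeric(length(teams)), teams)

for (i in seq_along(teams)) {
  # Keep only the rows for the i-th team, then correlate the two columns
  one_team <- data[data$Team == teams[i], ]
  r[i] <- cor(one_team$Year, one_team$Points_Game)
}

# The same calculation without an explicit loop
r_sapply <- sapply(split(data, data$Team),
                   function(d) cor(d$Year, d$Points_Game))
```

Both r and r_sapply are named vectors with one correlation per team. The loop works, but the group_by() version is shorter and returns a data frame that is easier to join back to other team-level data.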

Upvotes: 8
