Willard

Reputation: 522

How to extract specific residual data from a linear model in R

How would I extract the residuals for a specific baseball team from the following linear model? For example, how would I extract the residuals for "CLE"?

library(Lahman)
library(dplyr)
library(broom)

# create baseball team data
data(Teams)
teams <- Teams
teams <- teams %>% mutate(win_percentage = (W / (W + L)) * 100)

# summarize baseball team salary by year
salaries <- Salaries
salaries <- salaries %>% 
  group_by(teamID, yearID, lgID) %>%
  summarise(payroll_M = sum(as.numeric(salary)) / 10^6) %>% 
  ungroup()

# add winning percentage to the salary table
salaries <- teams %>% 
  select(yearID, teamID, win_percentage) %>% 
  right_join(salaries, by = c("yearID", "teamID"))

# compute linear model of winning vs team salary
model <- salaries %>% 
  group_by(yearID) %>%
  do(fit = augment(lm(win_percentage ~ payroll_M, data = .)))

# extract residuals for Cleveland ??????

Upvotes: 1

Views: 1836

Answers (1)

David Robinson

Reputation: 78610

You're close, but you need two changes to the augment line.

  1. You're saving the resulting (augmented) data frame to a list column called fit. Instead, pass it directly to do (remove the fit =), so the augmented rows come back as an ordinary data frame.

  2. The augment function needs to keep the teamID column as part of the resulting data, even though it's not in the model. Note that augment takes a second argument data for exactly this purpose (see help(augment.lm) for more).

Thus, the new line would look like:

do(augment(lm(win_percentage ~ payroll_M, data = .), data = .))

The resulting data frame will have one row per original observation and will include teamID alongside the fitted values (.fitted) and residuals (.resid), so you can filter it for CLE.
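Putting it together, here is a minimal sketch of the whole pipeline with the corrected line, followed by the filter for Cleveland. The names residuals_by_year and cle_residuals are my own and are not from the question; the sketch also assumes every joined row has a non-missing win_percentage and payroll_M, since rows dropped by lm would no longer line up with the data passed to augment.

```r
library(Lahman)
library(dplyr)
library(broom)

# winning percentage per team and year
teams <- Teams %>%
  mutate(win_percentage = (W / (W + L)) * 100)

# total payroll (in millions) per team and year
salaries <- Salaries %>%
  group_by(teamID, yearID, lgID) %>%
  summarise(payroll_M = sum(as.numeric(salary)) / 10^6) %>%
  ungroup()

# add winning percentage to the salary table
salaries <- teams %>%
  select(yearID, teamID, win_percentage) %>%
  right_join(salaries, by = c("yearID", "teamID"))

# one model per year; passing data = . to augment keeps teamID in the output
# (residuals_by_year is a name chosen for this sketch)
residuals_by_year <- salaries %>%
  group_by(yearID) %>%
  do(augment(lm(win_percentage ~ payroll_M, data = .), data = .)) %>%
  ungroup()

# keep only Cleveland's rows, with the fitted values and residuals
cle_residuals <- residuals_by_year %>%
  filter(teamID == "CLE") %>%
  select(yearID, teamID, win_percentage, payroll_M, .fitted, .resid)
```

cle_residuals then has one row per season for Cleveland, and cle_residuals$.resid is the vector of residuals from each year's fit.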

Upvotes: 4
