Reputation: 95
I have a single data frame containing x unique combinations of region and channel. I need to fit a separate regression model for each of the x combinations using some sort of loop.
region  channel  date        trials  spend
EMEA    display  2015-01-01  62      17875.27
APAC    banner   2015-01-01  65      18140.93
Something to the effect of
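model <- list()
i <- 1
for (r in unique(df$region)) {
  for (ch in unique(df$channel)) {
    df1 <- df[df$region == r & df$channel == ch, ]
    # skip region/channel pairs that do not occur in the data
    if (nrow(df1) == 0) next
    model[[i]] <- lm(trials ~ spend, data = df1)
    i <- i + 1
  }
}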
If someone also knew a way of storing a unique identifier such as region+channel to help identify the regression models that would be very helpful too.
Upvotes: 1
Views: 101
Reputation: 226911
A plyr solution:
set.seed(1)
d <- data.frame(region = letters[1:2],
                channel = LETTERS[3:6],
                trials = runif(20),
                spend = runif(20))
Make a list of results (i.e. split d by region and channel, run lm on each chunk with the specified formula, and return the results as a list):
library(plyr)
res <- dlply(d, c("region", "channel"), lm,
             formula = trials ~ spend)
Extract coefficients as a data frame:
ldply(res,coef)
##   region channel (Intercept)      spend
## 1      a       C   0.3359747  0.2444105
## 2      a       E   0.7767959 -0.3745419
## 3      b       D   0.7409942 -0.8084751
## 4      b       F   1.0797439 -1.0872158
Note that the result has your desired region/channel identifiers in it ...
Upvotes: 3
Reputation: 56259
Use split to divide the data into a list with one element per region/channel combination, then use lapply to run lm on each subset. See this example:
# dummy data
set.seed(1)
d <- data.frame(region = letters[1:2],
                channel = LETTERS[3:6],
                trials = runif(20),
                spend = runif(20))
# split by 2 column combo
dSplit <- split(d, paste(d$region, d$channel, sep = "_"))
# run lm for each subset
res <- lapply(dSplit, lm, formula = trials ~ spend)
# check names
names(res)
# [1] "a_C" "a_E" "b_D" "b_F"
# lm result for selected combo "a_C"
res$a_C
# Call:
# lm(formula = trials ~ spend, data = i)
#
# Coefficients:
# (Intercept)        spend
#      0.3360       0.2444
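If you want all the coefficients in one table with the region and channel as separate columns, something along these lines should work in base R. This is only a sketch: the names coefs and ids are mine, and it assumes the "_" separator never appears inside a region or channel value.

```r
# dummy data and models, same as in the example above
set.seed(1)
d <- data.frame(region = letters[1:2],
                channel = LETTERS[3:6],
                trials = runif(20),
                spend = runif(20))
dSplit <- split(d, paste(d$region, d$channel, sep = "_"))
res <- lapply(dSplit, lm, formula = trials ~ spend)

# stack the coefficient vectors, one row per model
coefs <- as.data.frame(do.call(rbind, lapply(res, coef)))

# recover region and channel from the list names
# (assumes "_" does not occur in the values themselves)
ids <- do.call(rbind, strsplit(names(res), "_", fixed = TRUE))
coefs$region <- ids[, 1]
coefs$channel <- ids[, 2]
coefs
```

The row names of coefs are still the combined keys (e.g. "a_C"), so res[[rownames(coefs)[1]]] gets you back to the corresponding model.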
Upvotes: 2