Reputation: 2663
I'm struggling with what seems like it should be a simple extension of a previous question I'd asked here.
I'm trying to aggregate over (a) a range of dates and (b) a factor variable. Sample data might be:
Brand Day Rev RVP
A 1 2535.00 195.00
B 1 1785.45 43.55
C 1 1730.87 32.66
A 2 920.00 230.00
B 2 248.22 48.99
C 3 16466.00 189.00
A 1 2535.00 195.00
B 3 1785.45 43.55
C 3 1730.87 32.66
A 4 920.00 230.00
B 5 248.22 48.99
C 4 16466.00 189.00
Thanks to helpful advice, I've figured out how to find the mean revenue for brands over a period of days using data.table:
new_df<-df[,(mean(Rev)), by=list(Brand,Day)]
Now, I'd like to create a new table where there is a column listing the coefficient estimate from an OLS regression of Rev by Day for each brand. I tried to do this as follows:
new_df2<-df[,(lm(Rev~Day)), by=list(Brand)]
That doesn't seem quite right. Thoughts? I'm sure it's something obvious I've missed.
Upvotes: 4
Views: 5467
Reputation: 115392
You have a couple of options.
You could save the whole model object as a list within the data.table
models <- df[, list(model = list(lm(Rev ~ Day))),by = Brand]
models
Brand model
1: A <lm>
2: B <lm>
3: C <lm>
# look at the models
models[,print(model[[1]]),by= Brand]
Call:
lm(formula = Rev ~ Day)
Coefficients:
(Intercept) Day
2804.2 -538.3
Call:
lm(formula = Rev ~ Day)
Coefficients:
(Intercept) Day
1741.5 -263.5
Call:
lm(formula = Rev ~ Day)
Coefficients:
(Intercept) Day
-3698 4653
You can save the coefficients
models[, {coefs <- coef(model[[1]])
list(coefs = coefs, name = names(coefs))}, by = Brand]
## Brand coefs name
## 1: A 2804.1667 (Intercept)
## 2: A -538.3333 Day
## 3: B 1741.5291 (Intercept)
## 4: B -263.5251 Day
## 5: C -3697.8621 (Intercept)
## 6: C 4653.1989 Day
Or you can just extract the list of models
models[,model]
Upvotes: 6
Reputation: 263342
> DF <- read.table(text="Brand Day Rev RVP
+ A 1 2535.00 195.00
+ B 1 1785.45 43.55
+ C 1 1730.87 32.66
+ A 2 920.00 230.00
+ B 2 248.22 48.99
+ C 3 16466.00 189.00
+ A 1 2535.00 195.00
+ B 3 1785.45 43.55
+ C 3 1730.87 32.66
+ A 4 920.00 230.00
+ B 5 248.22 48.99
+ C 4 16466.00 189.00", header=TRUE)
> DT <- data.table(DF)
> Mod.tbl<-DT[, list(mod=list(lm(Rev~Day))), by=list(Brand)]
> Mod.tbl[ , coef(mod[[1]])["Day"], by= Brand]
Brand V1
1: A -538.3333333
2: B -263.5251429
3: C 4653.1989474
Upvotes: 1
Reputation: 60070
I think this is what you want:
new_df2<-df[,(lm(Rev~Day)$coefficients[["Day"]]), by=list(Brand)]
lm
returns a full model object, you need to drill down into it to get a single value from each group that can be turned into a column.
Upvotes: 3