Reputation: 31
I have a data set "keywords" with several groups. I want to apply glm to each group individually to create a list of glm fits with one fit for each group.
I could do this with a for loop, but that's not in the spirit of R. Instead, I tried to do it with the by function:
CTR.glm <- by(keywords, keywordsInSample,
              function(x) ifelse(nlevels(factor(x$AveragePosition)) > 20, # only these keywords will be fit
                                 glm(Clicks ~ poly(log(AveragePosition), 2) + offset(log(Impressions)),
                                     family = poisson, data = x),
                                 NA)) # for keywords that can't be fit
The problem is that glm normally returns a glm-class object from which I can extract all sorts of goodies, whereas by returns a list:
> CTR.glm[2]
$`text of second keyword`
(Intercept) poly(log(AveragePosition), 2)1 poly(log(AveragePosition), 2)2
-3.626237 -5.108795 -1.751032
> class(CTR.glm[2])
[1] "list"
All information has been lost except for the parameters of the fit. Is there a way to force by to keep all the information about each fit?
p.s., I tried using the plyr toolbox, but it got stuck because my keywords have spaces in them.
p.p.s., this post should have the tag "by", but I can't create that tag (new to stackoverflow), could someone retag it?
Upvotes: 1
Views: 491
Reputation: 2950
Try
lapply(CTR.glm,summary)
The list probably contains the model objects, which still hold the information you need.
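Whether the list really holds full model objects depends on how the function passed to by returns its result. The sketch below, on made-up data (the names d, g, x, y are assumptions, not the asker's columns), illustrates one likely cause of the lost information: ifelse() returns something shaped like its test, so with a length-one test it keeps only the first component of a glm object, whereas a plain if/else returns the whole object.

```r
# Made-up data (all names here are hypothetical): two large groups, one small
set.seed(42)
d <- data.frame(g = rep(c("a a", "b b", "c"), times = c(30, 30, 5)),
                x = c(1:30, 1:30, 1:5))
d$y <- rpois(nrow(d), lambda = exp(0.5 + 0.05 * d$x))

fit <- glm(y ~ x, family = poisson, data = d)

# ifelse() keeps only as many elements of `yes` as the test has (one here),
# so the result is a plain list holding just the first component of the fit
res <- ifelse(TRUE, fit, NA)
class(res)
names(fit)[1]

# A plain if/else hands back the whole object (or NA for small groups)
fits <- by(d, d$g, function(df) {
  if (nrow(df) > 20) glm(y ~ x, family = poisson, data = df) else NA
})
class(fits[[1]])
summary(fits[[1]])
```

Note the double brackets: fits[[1]] extracts the fit itself, while single brackets return a container around it.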
Upvotes: 2
Reputation: 16607
I think plyr should work just fine. I don't know the structure of your keywords and keywordsInSample, but consider this toy example, which works fine:
library(plyr)
# Generate some fake data with a factor whose level names contain spaces
grp <- c(rep("a a", 3), rep("a", 3), rep("b b", 3))
x <- rep(1:3, 3)
y <- rnorm(9)
d <- data.frame(keywordsInSample = grp, x = x, y = y)
# Fit one model per group; each list element is a full glm object
lmList <- dlply(d, .(keywordsInSample), function(df) glm(y ~ x, data = df))
lmList$"a a"
As long as your index variable can be coerced to a factor, R represents it internally as integer codes and shouldn't care what the level names contain.
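The same point can be checked without plyr. This is only a minimal sketch in base R on made-up data (grp, x and y are illustrative names), using split() and lapply() as a stand-in for dlply: the level names, spaces included, become the names of the resulting list, and each element is a complete glm object.

```r
# Made-up data; `grp`, `x`, `y` are illustrative names only
set.seed(1)
d <- data.frame(grp = rep(c("a a", "a", "b b"), each = 10),
                x = rep(1:10, 3))
d$y <- 2 * d$x + rnorm(nrow(d))

# split() treats grp as a factor; the level names become list names verbatim
fits <- lapply(split(d, d$grp), function(df) glm(y ~ x, data = df))

names(fits)            # group labels, spaces preserved
coef(fits[["a a"]])    # extract one fit by its label
sapply(fits, AIC)      # pull a statistic from every fit
```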
Upvotes: 0