Reputation: 157
I merged a list of lm summaries with a data.table.
Next, I extracted a coefficient from each summary and stored it in a new column.
My question is: why do I have to use lapply for this?
For example, the following code worked:
new.DT <- old.DT[, result := lapply(X = results.of.lm, FUN = summary)] %>%
.[, beta := lapply(X = result, function(x) x$coefficients[2,1])]
However, the following code failed:
new.DT <- old.DT[, result := lapply(X = results.of.lm, FUN = summary)] %>%
.[, beta := result$coefficients[2,1]]
However, for some functions I apply to columns in a data.table (e.g. paste, substr or as.numeric), lapply was not necessary. I couldn't figure out what causes the difference. Thank you!
Upvotes: 0
Views: 559
Reputation: 24722
Presuming that old.DT looks something like this:
grp results.of.lm
1: 1 <lm[12]>
2: 2 <lm[12]>
3: 3 <lm[12]>
then, after creating the result column using old.DT[, result:=lapply(results.of.lm,summary)], we note that result is a list of lists (a list column in which each element is a summary.lm object):
grp results.of.lm result
1: 1 <lm[12]> <summary.lm[11]>
2: 2 <lm[12]> <summary.lm[11]>
3: 3 <lm[12]> <summary.lm[11]>
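Note that old.DT$result$coefficients is NULL, because $ looks for an element named coefficients in the outer list rather than inside each summary, and thus beta:=result$coefficients[2,1] will NOT return the desired result. Functions such as paste, substr or as.numeric are vectorised over the elements of an atomic column, which is why they need no lapply. Your use of lapply, by contrast, applies the extraction to each summary in turn and does return the coefficients: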
old.DT[, beta:=lapply(result, function(x) x$coefficients[2,1])][]
grp results.of.lm result beta
1: 1 <lm[12]> <summary.lm[11]> -0.2819342
2: 2 <lm[12]> <summary.lm[11]> 0.1645671
3: 3 <lm[12]> <summary.lm[11]> 0.2215897
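For anyone who wants to reproduce this, here is a minimal sketch. The data is an assumption: the original old.DT is not shown, so the fits below are built from mtcars (one lm per cyl group) purely as a stand-in. It also shows vapply as an alternative to lapply, which yields a plain numeric beta column instead of a list column.

```r
library(data.table)

# Hypothetical stand-in data: one lm fit per cylinder group of mtcars.
fits <- lapply(split(mtcars, mtcars$cyl), function(d) lm(mpg ~ wt, data = d))
old.DT <- data.table(grp = names(fits))
old.DT[, results.of.lm := unname(fits)]
old.DT[, result := lapply(results.of.lm, summary)]

# `$` searches the outer list for an element named "coefficients" and finds none.
is.null(old.DT$result$coefficients)    # TRUE

# Each individual element, however, is a summary.lm with a coefficient matrix.
old.DT$result[[1]]$coefficients[2, 1]

# vapply() runs the extraction once per element and returns a numeric vector,
# so beta becomes an ordinary numeric column rather than a list column.
old.DT[, beta := vapply(result, function(x) x$coefficients[2, 1], numeric(1))]
is.numeric(old.DT$beta)                # TRUE
```

With lapply the beta column is itself a list, which prints the same but usually has to be unlisted before it can be used in arithmetic; vapply avoids that step and also checks that every element returns exactly one number.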
Upvotes: 1