Reputation: 85
I am performing multiple pooled cross-sectional regressions with a loop function and stored the regression outputs in a list (regression). What i would like to do now is to efficiently obtain the average coefficients, the average t-stats as well as the average adj.r squared.
I put up the following code already:
library(plm)
data("Grunfeld", package="plm")
# create list with regression outputs
regression <- list()
# Regression on past six-year subsets of Grunfeld in every year from 1940 to 1950
for(t in 1940:1950){
regression[[as.character(t)]] <- lm(inv ~ value + capital,
subset(Grunfeld, year<=t & year>=t-5))
}
This way I obtain the desired regression output stored in a list (regression). What i would like to do now is to efficiently obtain the average coefficients, the average t-stats as well as the average adj.r squared.
I already tried to calculated the mean of all the adj. r squared with the following:
mean(lapply(regression, function(x) summary(x)$adj.r.squared))
However it seems that I am using the mean function wrong as i get the following error.
Warning message:
In mean.default(lapply(regression, function(x) summary(x)$adj.r.squared)) :
argument is not numeric or logical: returning NA
Further i came up with the following to "extract" the coefficients.
lapply(regression, function(x) summary(x)$coefficients)
How can I quickly obtain the average individual coefficients from this lapply output? (i.e extracting each row individually and calculate the respective mean over the years.)
$`1940`
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.65239712 14.647050149 -0.2493606 8.039783e-01
value 0.08283141 0.006873563 12.0507230 2.615793e-17
capital 0.11033307 0.091543522 1.2052526 2.330857e-01
$`1941`
Estimate Std. Error t value Pr(>|t|)
(Intercept) -13.77258038 16.677399231 -0.825823 4.123477e-01
value 0.08614094 0.007258571 11.867480 4.904857e-17
capital 0.18680229 0.094849038 1.969470 5.376624e-02
....
Upvotes: 1
Views: 513
Reputation: 887158
You could try:
library(reshape2)
dcast(melt(lapply(regression,
function(x) summary(x)$coefficients)), Var1~Var2, value.var="value", mean)
# Var1 Estimate Std. Error t value Pr(>|t|)
#1 (Intercept) -16.7072859 16.0876958 -1.029145 3.320868e-01
#2 value 0.1107460 0.0076057 14.599109 1.510115e-17
#3 capital 0.1279743 0.0685896 1.833861 9.389504e-02
Or
Reduce(`+`,lapply(regression, function(x) summary(x)$coefficients))/length(regression)
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) -16.7072859 16.0876958 -1.029145 3.320868e-01
#value 0.1107460 0.0076057 14.599109 1.510115e-17
#capital 0.1279743 0.0685896 1.833861 9.389504e-02
Upvotes: 2
Reputation: 2988
You almost had it right! Try this:
> sapply(regression, function(x) mean(summary(x)$adj.r.squared))
1940 1941 1942 1943 1944 1945 1946 1947 1948 1949
0.7230061 0.7293396 0.7399216 0.7770505 0.7998859 0.8413422 0.8571037 0.8561229 0.8348950 0.8357761
1950
0.8324654
You may as well use lapply()
in the above. The same can be done for any coefficient, t-test, once you identify how to extract them from the coefficient table.
To extract coefficients for value
you can proceed as follows:
lapply(regression, function(x) summary(x)$coefficients[ rownames(summary(x)$coefficients)=="value", ])
A more compact version would be:
sapply(regression, function(x) summary(x)$coefficients[ rownames(summary(x)$coefficients)=="value", ])
From the above you can obtain means as follows:
> (x <- t(sapply(regression, function(x) summary(x)$coefficients[ rownames(summary(x)$coefficients)=="value", ])))
Estimate Std. Error t value Pr(>|t|)
1940 0.08283141 0.006873563 12.05072 2.615793e-17
1941 0.08614094 0.007258571 11.86748 4.904857e-17
1942 0.09018745 0.007711639 11.69498 8.898811e-17
1943 0.09945565 0.007751087 12.83119 1.886416e-18
1944 0.10568804 0.007376523 14.32762 1.516617e-20
1945 0.11358598 0.006722166 16.89723 7.314450e-24
1946 0.12227763 0.006781509 18.03104 3.203995e-25
1947 0.12599497 0.007199027 17.50167 1.356383e-24
1948 0.12605599 0.008005481 15.74621 2.030259e-22
1949 0.12951740 0.008452725 15.32256 7.175275e-22
1950 0.13647072 0.009530406 14.31951 1.555615e-20
> colMeans(x)
Estimate Std. Error t value Pr(>|t|)
1.107460e-01 7.605700e-03 1.459911e+01 1.510115e-17
This said, this looks suspiciously like the Fama-MacBeth estimates: Fama MacBeth standard errors in R. These can be easily obtained using pmg()
in plm
.
Upvotes: 4