Reputation: 1
I am using R.
I have a panel dataset of ~5000 observations of 250 individuals over time.
I need to build a difference-in-differences regression, so I draw one random observation for each individual and run a regression on the resulting sample:
lm(x ~ x1 + x2 + ... , data = ddply(df, .(individual), function(x) x[sample(nrow(x), 1), ]))
I need to run this regression n times, on n different random samples, and compute the average of each coefficient estimate.
Is there a way to do this efficiently, without manually running and averaging n regressions?
Upvotes: 0
Views: 286
Reputation: 1
Solved:
I expected to find a dedicated package for this, but I built a function instead. For example, for n = 700:
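library(plyr)  # provides ddply() and .()
# Draw one random row per individual, fit the model, return the coefficients
fun <- function() {
  alfa <- ddply(df, .(individual), function(x) x[sample(nrow(x), 1), ])
  beta <- lm(x ~ x1 + x2 + ... , data = alfa)$coefficients
  return(beta)
}
df.full <- replicate(700, fun())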
This produces a matrix with one row per coefficient (the row names are the coefficient names) and 700 columns, one per replication. I can also do something like this:
# Same resampling step, but read the estimates from the summary table
fun <- function() {
  alfa <- ddply(df, .(individual), function(x) x[sample(nrow(x), 1), ])
  beta <- lm(x ~ x1 + x2 + ... , data = alfa)
  gamma <- summary(beta)[["coefficients"]][, 1]  # column 1 = estimates
  return(gamma)
}
df.full <- replicate(700, fun())
Replacing [, 1] with [, 2] returns the standard errors instead. The averages then follow directly, for example with rowMeans(df.full).
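For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea in base R. Everything in it is an assumption for illustration: the simulated data, the outcome name y, the helper names draw_one and fit_once, and the 200 replications. It does not use plyr; it picks one row per individual with split() and sample.int().

```r
set.seed(1)

# Simulated panel (hypothetical data): 250 individuals, 20 observations each
n_ind <- 250
n_obs <- 20
df <- data.frame(
  individual = rep(seq_len(n_ind), each = n_obs),
  x1 = rnorm(n_ind * n_obs),
  x2 = rnorm(n_ind * n_obs)
)
df$y <- 1 + 2 * df$x1 - 0.5 * df$x2 + rnorm(nrow(df))

# Draw one random row per individual.
# sample.int(length(i), 1) is used instead of sample(i, 1) so that an
# individual with a single row is handled correctly.
draw_one <- function(data) {
  rows <- split(seq_len(nrow(data)), data$individual)
  idx <- vapply(rows, function(i) i[sample.int(length(i), 1)], integer(1))
  data[idx, , drop = FALSE]
}

# Fit the model on one random subsample and return estimates and standard errors
fit_once <- function(data) {
  fit <- lm(y ~ x1 + x2, data = draw_one(data))
  summary(fit)$coefficients[, c("Estimate", "Std. Error")]
}

n_rep <- 200
# Each call returns a 3 x 2 matrix, so replicate() stacks them into a 3 x 2 x n_rep array
draws <- replicate(n_rep, fit_once(df))

# Average each coefficient and each standard error across replications
mean_est <- rowMeans(draws[, "Estimate", ])
mean_se <- rowMeans(draws[, "Std. Error", ])
```

Here mean_est and mean_se are named vectors with one entry per coefficient. Note that the mean of the per-fit standard errors is not itself the standard error of the averaged estimate, since the subsamples come from the same data.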
Upvotes: 0