Reputation: 1947
I want to create a function that performs a within (fixed-effects) panel regression.
My work so far:
library(plm)
fit_panel_lr <- function(y, x) {
  x <- cbind(as.data.frame(x), as.data.frame(y))
  varnames <- names(x)[3:(length(x))]
  varnames <- varnames[!(varnames == names(y))]
  form <- paste0(varnames, collapse = "+")
  x_copy <- data.frame(x)
  form <- as.formula(paste0(names(y), "~", form))
  params <- list(
    formula = form, data = x_copy, model = 'within'
  )
  pglm_env <- list2env(params, envir = new.env())
  model_plm <- do.call("plm", params, envir = pglm_env)
  summary(model_plm)
}
Let's see how it works:
data("EmplUK", package="plm")
dep_var <- EmplUK['capital']
df1 <- EmplUK[-6]
> fit_panel_lr(dep_var, df1)
Oneway (individual) effect Within Model
Call:
plm(formula = capital ~ sector + emp + wage + output, data = list(
firm = c(1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3,
4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6,
6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9,
9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11,
11, 11, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13,
13, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15,
16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 1 (...)
)), model = "within")
Unbalanced Panel: n = 123, T = 2-8, N = 866
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-6.7614519 -0.0712417 0.0052943 0.0715363 8.9980402
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
sector 3.9155e-05 1.1484e-04 0.3409 0.73324
emp 2.2427e-01 1.0923e-02 20.5328 < 2e-16 ***
wage -1.9868e-03 1.1987e-02 -0.1657 0.86840
output -6.0120e-03 2.9003e-03 -2.0729 0.03853 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Total Sum of Squares: 615.32
Residual Sum of Squares: 384.35
R-Squared: 0.37537
Adj. R-Squared: 0.26886
F-statistic: 111.023 on 4 and 739 DF, p-value: < 2.22e-16
As you can see, I get the summary at the end, but I also get a lot of unnecessary numbers. Is there any way to omit them?
By redundant information I mean the numbers after firm = c(1, 1, ...). The question can be rephrased as: is there any way to remove data = list(...) from the printed call:
Call:
plm(formula = capital ~ sector + emp + wage + output, data = list
Upvotes: 1
Views: 37
Reputation: 174338
You need to quote your data frame name and pass that quoted name in the parameters, rather than the data frame itself. You also need to make sure that the actual data frame is available within pglm_env when the quoted name comes to be evaluated.
fit_panel_lr <- function(y, x) {
  x <- cbind(as.data.frame(x), as.data.frame(y))
  # Regressors: every column from the third onwards, except the response
  varnames <- names(x)[3:(length(x))]
  varnames <- varnames[!(varnames == names(y))]
  form <- paste0(varnames, collapse = "+")
  x_copy <- data.frame(x)
  form <- as.formula(paste0(names(y), "~", form))
  # Pass the data as a quoted name, not as the data frame itself
  params <- list(
    formula = form, model = 'within', data = quote(data_source)
  )
  pglm_env <- list2env(params, envir = new.env())
  # Make the real data frame available under that name
  pglm_env$data_source <- x_copy
  model_plm <- do.call("plm", params, envir = pglm_env)
  summary(model_plm)
}
So now you can do:
data("EmplUK", package="plm")
dep_var <- EmplUK['capital']
df1 <- EmplUK[-6]
fit_panel_lr(dep_var, df1)
#> Oneway (individual) effect Within Model
#>
#> Call:
#> plm(formula = capital ~ sector + emp + wage + output, data = data_source,
#> model = "within")
#>
#> Unbalanced Panel: n = 140, T = 7-9, N = 1031
#>
#> Residuals:
#> Min. 1st Qu. Median 3rd Qu. Max.
#> -15.4974740 -0.0708288 0.0034195 0.0744795 9.1816716
#>
#> Coefficients:
#> Estimate Std. Error t-value Pr(>|t|)
#> emp 0.18715725 0.01552895 12.0522 < 2e-16 ***
#> wage 0.03372382 0.01610535 2.0940 0.03655 *
#> output -0.00044427 0.00385459 -0.1153 0.90827
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Total Sum of Squares: 1152.7
#> Residual Sum of Squares: 979.98
#> R-Squared: 0.14986
#> Adj. R-Squared: 0.013909
#> F-statistic: 52.1761 on 3 and 888 DF, p-value: < 2.22e-16
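To see why this works, here is a minimal sketch of the same mechanism using base R's lm instead of plm, so it runs without extra packages. The names toy, env, fit_inline and fit_quoted are made up for illustration, and I am assuming plm records its call the same way lm does, which the output above suggests.

```r
# Toy data; all names here are illustrative, not from the question.
toy <- data.frame(y = c(1.2, 1.9, 3.1, 4.2, 4.8), x = 1:5)

# 1) Passing the data frame itself: do.call() puts the evaluated object
#    into the call, so the stored call carries the whole data frame.
fit_inline <- do.call("lm", list(formula = y ~ x, data = toy))

# 2) Passing a quoted name: the stored call carries only the symbol,
#    and the symbol is looked up in `env` when the call is evaluated.
env <- new.env()
env$data_source <- toy
fit_quoted <- do.call(
  "lm",
  list(formula = y ~ x, data = quote(data_source)),
  envir = env
)

is.data.frame(fit_inline$call$data)  # the data is embedded in the call
is.name(fit_quoted$call$data)        # only a symbol is stored
deparse(fit_quoted$call$data)        # the name that gets printed
```

Both fits use the same data, so the coefficients should agree; only the stored call differs, and that stored call is what gets printed under Call: in the summary.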
Upvotes: 3