How to omit redundant huge information?

Question

I want to create a function that fits a within panel regression.

My work so far

library(plm)
fit_panel_lr <- function(y, x) {

  x        <- cbind(as.data.frame(x),as.data.frame(y))
  varnames <- names(x)[3:(length(x))]
  varnames <- varnames[!(varnames == names(y))]
  form     <- paste0(varnames, collapse = "+")
  x_copy   <- data.frame(x)
  form     <- as.formula(paste0(names(y), "~", form))

  params   <- list(
    formula = form, data = x_copy, model = 'within'
  )
  pglm_env <- list2env(params, envir = new.env())

  model_plm <- do.call("plm", params, envir = pglm_env)


  summary(model_plm)
}

Let's now see how it works:

data("EmplUK", package="plm")
dep_var <- EmplUK['capital']
df1     <- EmplUK[-6]
fit_panel_lr(dep_var, df1)

Oneway (individual) effect Within Model

Call:
plm(formula = capital ~ sector + emp + wage + output, data = list(
    firm = c(1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 
    4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 
    6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 
    9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 
    11, 11, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 
    13, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 
    16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 1 (...)
    )), model = "within")


Unbalanced Panel: n = 123, T = 2-8, N = 866

Residuals:
      Min.    1st Qu.     Median    3rd Qu.       Max. 
-6.7614519 -0.0712417  0.0052943  0.0715363  8.9980402 

Coefficients:
          Estimate  Std. Error t-value Pr(>|t|)    
sector  3.9155e-05  1.1484e-04  0.3409  0.73324    
emp     2.2427e-01  1.0923e-02 20.5328  < 2e-16 ***
wage   -1.9868e-03  1.1987e-02 -0.1657  0.86840    
output -6.0120e-03  2.9003e-03 -2.0729  0.03853 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    615.32
Residual Sum of Squares: 384.35
R-Squared:      0.37537
Adj. R-Squared: 0.26886
F-statistic: 111.023 on 4 and 739 DF, p-value: < 2.22e-16

As you can see, I get the summary at the end, but I also get a lot of unnecessary numbers. Is there any way to omit them?

By redundant information I mean the numbers after firm = c(1, 1, ...). The question can be rephrased as: is there a way to remove data = list(...) from

Call:
plm(formula = capital ~ sector + emp + wage + output, data = list

Allan Cameron · Accepted Answer

do.call() builds the call from the evaluated arguments, so passing the data frame itself embeds its entire contents in the call that summary() prints. Instead, pass the quoted name of your data frame in the parameters, not the data frame itself. You also need to make sure the actual data frame is available within pglm_env, so that the quoted name can be found when it is evaluated.

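fit_panel_lr <- function(y, x) {

  # Combine predictors and response into one data frame
  x        <- cbind(as.data.frame(x),as.data.frame(y))
  # Skip the first two columns (the panel index: firm and year in EmplUK)
  varnames <- names(x)[3:(length(x))]
  # Drop the response from the list of regressors
  varnames <- varnames[!(varnames == names(y))]
  form     <- paste0(varnames, collapse = "+")
  x_copy   <- data.frame(x)
  form     <- as.formula(paste0(names(y), "~", form))

  # Pass the data by name (a quoted symbol), not by value
  params   <- list(
    formula = form, model = 'within', data = quote(data_source)
  )
  
  # Make the real data frame available under that name in the evaluation environment
  pglm_env <- list2env(params, envir = new.env())
  pglm_env$data_source <- x_copy

  model_plm <- do.call("plm", params, envir = pglm_env)

  summary(model_plm)
}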

So now you can do:

data("EmplUK", package="plm")
dep_var <- EmplUK['capital']
df1     <- EmplUK[-6]
fit_panel_lr(dep_var, df1)

#> Oneway (individual) effect Within Model
#> 
#> Call:
#> plm(formula = capital ~ sector + emp + wage + output, data = data_source, 
#>     model = "within")
#> 
#> Unbalanced Panel: n = 140, T = 7-9, N = 1031
#> 
#> Residuals:
#>        Min.     1st Qu.      Median     3rd Qu.        Max. 
#> -15.4974740  -0.0708288   0.0034195   0.0744795   9.1816716 
#> 
#> Coefficients:
#>           Estimate  Std. Error t-value Pr(>|t|)    
#> emp     0.18715725  0.01552895 12.0522  < 2e-16 ***
#> wage    0.03372382  0.01610535  2.0940  0.03655 *  
#> output -0.00044427  0.00385459 -0.1153  0.90827    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Total Sum of Squares:    1152.7
#> Residual Sum of Squares: 979.98
#> R-Squared:      0.14986
#> Adj. R-Squared: 0.013909
#> F-statistic: 52.1761 on 3 and 888 DF, p-value: < 2.22e-16
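
The same mechanism can be seen in isolation with base R. The following is a minimal sketch, not part of the original answer: it uses lm() and the built-in mtcars data in place of plm() and EmplUK, and the names env, fit_inline and fit_named are chosen only for illustration.

```r
# Illustrative sketch: lm() and mtcars stand in for plm() and EmplUK.

# Passing the data frame itself: do.call() evaluates the argument,
# so the whole data frame ends up inside the stored call.
fit_inline <- do.call("lm", list(formula = mpg ~ wt, data = mtcars))
is.data.frame(fit_inline$call$data)  # the call holds the data, not a name

# Passing a quoted name instead: the call stores only the symbol,
# and the data frame is looked up in the environment given to do.call().
env <- new.env()
env$data_source <- mtcars
fit_named <- do.call(
  "lm",
  list(formula = mpg ~ wt, data = quote(data_source)),
  envir = env
)
fit_named$call  # short call that refers to data_source by name
```

Both fits should give the same coefficients; only the stored call differs, and that stored call is what print() and summary() show under "Call:".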
