Anna Bokun

Reputation: 55

Need to declare panel data before using plm?

Simple question: before estimating a FE regression with plm, do I need to "set" the df as panel data using plm.data (similar to xtset in Stata)?

pdata <- plm.data(df, index = c("state", "year"))

I thought including "index" in the regression takes care of the FE? e.g.

Model1 <- plm(DV ~ IV + IV2,
              data = df,
              index = c("state", "year"),
              model = "within")

stargazer(Model1, type = "text", align = TRUE, title = "Regression Results")

This source claims: "The current version of plm is capable of working with a regular data.frame without any further transformation, provided that the individual and time indexes are in the first two columns, as in all the example data sets but Wages. If this weren’t the case, an index optional argument would have to be passed on to the estimating and testing functions." https://cran.r-project.org/web/packages/plm/vignettes/plmPackage.html

Upvotes: 1

Views: 1112

Answers (1)

jay.sf

Reputation: 73592

You can do this before the plm command using pdata.frame() (plm.data() is outdated), or more simply (and unlike in Stata) in the plm() call itself. Example:

library(plm)
data("Grunfeld", package="plm")

class(Grunfeld)
# [1] "data.frame"

head(Grunfeld, 3)
#   firm year   inv  value capital
# 1    1 1935 317.6 3078.5     2.8
# 2    1 1936 391.8 4661.7    52.6
# 3    1 1937 410.6 5387.1   156.9

By default, plm() expects the first two columns to hold the group (individual) and time identifiers. So when you use the Grunfeld example data above for a FE regression without specifying the index, it works.

wi1 <- plm(inv ~ value + capital,
           data=Grunfeld, model="within", effect="twoways")
coef(wi1)
#     value   capital 
# 0.1177159 0.3579163 
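As a quick check of what plm() would pick up by default, one can look at the first two column names and at the panel dimensions they imply. This is only a sketch using the Grunfeld data from above; the variable names default_index and panel_dims are made up for illustration.

```r
library(plm)
data("Grunfeld", package = "plm")

# The first two columns are what plm() treats as group and time
# when no index argument is given.
default_index <- names(Grunfeld)[1:2]
print(default_index)

# pdim() summarises the panel dimensions implied by that default index
# (number of individuals, number of time periods, balanced or not).
panel_dims <- pdim(Grunfeld)
print(panel_dims)
```

If the names printed first are not your group and time variables, pass index explicitly or build a pdata.frame as described further down.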

When the columns are in a different order, though, an error occurs.

## reorder columns so the index is no longer in the first two positions
Grunfeld2 <- Grunfeld[c(3:5, 2, 1)]

head(Grunfeld2, 3)
#     inv  value capital year firm
# 1 317.6 3078.5     2.8 1935    1
# 2 391.8 4661.7    52.6 1936    1
# 3 410.6 5387.1   156.9 1937    1

plm(inv ~ value + capital,
    data=Grunfeld2, model="within", effect="twoways")
# Error in plm.fit [...]

We need to specify the index=c(<group>, <time>) either in the plm call,

wi2 <- plm(inv ~ value + capital, index=c("firm", "year"),
           data=Grunfeld2, model="within", effect="twoways")
coef(wi2)
#     value   capital 
# 0.1177159 0.3579163 

or by generating a "pdata.frame".

Grunfeld3 <- pdata.frame(Grunfeld2, index=c("firm", "year"))  
class(Grunfeld3)
# [1] "pdata.frame" "data.frame" 

The column order is not changed; instead, the index is stored in the attributes. You may want to compare attributes(Grunfeld2) and attributes(Grunfeld3).
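A minimal sketch of that comparison, assuming the same reordered Grunfeld data as above: the column names stay as they were, and the group/time information sits in the "index" attribute of the pdata.frame. The name idx is just a placeholder.

```r
library(plm)
data("Grunfeld", package = "plm")

# Same reordering as above: index columns moved to the end.
Grunfeld2 <- Grunfeld[c(3:5, 2, 1)]
Grunfeld3 <- pdata.frame(Grunfeld2, index = c("firm", "year"))

# Column order is identical before and after the conversion.
print(names(Grunfeld2))
print(names(Grunfeld3))

# The panel index is kept separately as an attribute.
idx <- attr(Grunfeld3, "index")
print(head(idx, 3))
```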

wi3 <- plm(inv ~ value + capital,
           data=Grunfeld3, model="within", effect="twoways")
coef(wi3)
#     value   capital 
# 0.1177159 0.3579163 

The coefficient estimates of wi1, wi2 and wi3 are the same. There are some side effects, though, because the row names of the "pdata.frame" take the form group-time:

head(Grunfeld3, 3)
#          inv  value capital year firm
# 1-1935 317.6 3078.5     2.8 1935    1
# 1-1936 391.8 4661.7    52.6 1936    1
# 1-1937 410.6 5387.1   156.9 1937    1

Thus, all.equal reports string mismatches,

all.equal(wi2, wi3)
# [1] "Component “residuals”: Names: 200 string mismatches"                            
# [2] "Component “model”: Attributes: < Component “row.names”: 200 string mismatches >"
# [3] "Component “call”: target, current do not match when deparsed"     

the values are the same, though:

head(wi2$residuals)
#        1          2          3          4          5          6 
# 41.10980  -69.68476 -152.11391  -19.73566  -93.36168  -28.48560 
head(wi3$residuals)
#   1-1935     1-1936     1-1937     1-1938     1-1939     1-1940 
# 41.10980  -69.68476 -152.11391  -19.73566  -93.36168  -28.48560 

head(wi2$model, 3)
#     inv  value capital
# 1 317.6 3078.5     2.8
# 2 391.8 4661.7    52.6
# 3 410.6 5387.1   156.9

head(wi3$model, 3)
#          inv  value capital
# 1-1935 317.6 3078.5     2.8
# 1-1936 391.8 4661.7    52.6
# 1-1937 410.6 5387.1   156.9

wi2$call
# plm(formula = inv ~ value + capital, data = Grunfeld2, effect = "twoways", 
#     model = "within", index = c("firm", "year"))
wi3$call
# plm(formula = inv ~ value + capital, data = Grunfeld3, effect = "twoways", 
#     model = "within")
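To confirm that only the labels and the recorded call differ, one option is to compare the numeric content directly. The sketch below refits the two models from above and drops the names with as.numeric() before comparing; same_coef and same_resid are illustrative names only.

```r
library(plm)
data("Grunfeld", package = "plm")

Grunfeld2 <- Grunfeld[c(3:5, 2, 1)]
Grunfeld3 <- pdata.frame(Grunfeld2, index = c("firm", "year"))

# Index given in the plm() call.
wi2 <- plm(inv ~ value + capital, index = c("firm", "year"),
           data = Grunfeld2, model = "within", effect = "twoways")

# Index taken from the pdata.frame.
wi3 <- plm(inv ~ value + capital,
           data = Grunfeld3, model = "within", effect = "twoways")

# Compare coefficients, then residuals with names and attributes stripped.
same_coef  <- all.equal(coef(wi2), coef(wi3))
same_resid <- all.equal(as.numeric(wi2$residuals), as.numeric(wi3$residuals))
print(same_coef)
print(same_resid)
```

Both comparisons should come back TRUE if the two ways of declaring the index really lead to the same fit.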

Upvotes: 2
