Reputation: 55
Simple question: before estimating a FE regression using plm, do I need to "set" the df as panel data using plm.data (similar to xtset in Stata)?
pdata <- plm.data(df, index = c("state", "year"))
I thought including "index" in the regression takes care of the FE? e.g.
Model1 <- plm(DV ~ IV + IV2,
data = df,
index = c("state", "year"),
model="within")
stargazer(Model1, type = 'text', align = TRUE, title = "Regression Results")
This source claims: "The current version of plm is capable of working with a regular data.frame without any further transformation, provided that the individual and time indexes are in the first two columns, as in all the example data sets but Wages. If this weren’t the case, an index optional argument would have to be passed on to the estimating and testing functions." https://cran.r-project.org/web/packages/plm/vignettes/plmPackage.html
Upvotes: 1
Views: 1112
Reputation: 73592
You can do this before the plm command using pdata.frame() (plm.data() is outdated), or more simply (and unlike in Stata) in the plm() call itself. Example:
library(plm)
data("Grunfeld", package="plm")
class(Grunfeld)
# [1] "data.frame"
head(Grunfeld, 3)
# firm year inv value capital
# 1 1 1935 317.6 3078.5 2.8
# 2 1 1936 391.8 4661.7 52.6
# 3 1 1937 410.6 5387.1 156.9
plm expects the group and time variables in the first two columns. So when you use the Grunfeld example data above for a FE regression without specifying the index, it will work.
wi1 <- plm(inv ~ value + capital,
data=Grunfeld, model="within", effect="twoways")
wi1$coe
# value capital
# 0.1177159 0.3579163
When the index columns are no longer in the first two positions, though, an error will occur.
## reorder columns so that year and firm come last
Grunfeld2 <- Grunfeld[c(3:5, 2,1)]
head(Grunfeld2, 3)
# inv value capital year firm
# 1 317.6 3078.5 2.8 1935 1
# 2 391.8 4661.7 52.6 1936 1
# 3 410.6 5387.1 156.9 1937 1
plm(inv ~ value + capital,
data=Grunfeld2, model="within", effect="twoways")
# Error in plm.fit [...]
We need to specify the index=c(<group>, <time>) either in the plm call,
wi2 <- plm(inv ~ value + capital, index=c("firm", "year"),
data=Grunfeld2, model="within", effect="twoways")
wi2$coe
# value capital
# 0.1177159 0.3579163
or by generating a "pdata.frame".
Grunfeld3 <- pdata.frame(Grunfeld2, index=c("firm", "year"))
class(Grunfeld3)
# [1] "pdata.frame" "data.frame"
The order of the columns is not changed; the index is instead stored in the attributes. You may want to compare attributes(Grunfeld2) and attributes(Grunfeld3).
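To see where the index ends up, the attribute can be inspected directly. The following is a minimal sketch, assuming the plm package is installed; it rebuilds Grunfeld2 and Grunfeld3 as above so that it runs on its own.

```r
library(plm)
data("Grunfeld", package = "plm")

# Same objects as above: the index columns are moved to the end
Grunfeld2 <- Grunfeld[c(3:5, 2, 1)]
Grunfeld3 <- pdata.frame(Grunfeld2, index = c("firm", "year"))

# The plain data.frame carries no "index" attribute ...
is.null(attr(Grunfeld2, "index"))

# ... whereas the pdata.frame stores firm and year there
head(index(Grunfeld3), 3)

# The column order is left untouched by pdata.frame()
identical(names(Grunfeld2), names(Grunfeld3))

# pdim() summarises the panel dimensions derived from the index
pdim(Grunfeld3)
```

index() is the accessor plm provides for this attribute, so it is usually preferable to reading attr(x, "index") by hand.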
wi3 <- plm(inv ~ value + capital,
data=Grunfeld3, model="within", effect="twoways")
wi3$coe
# value capital
# 0.1177159 0.3579163
The results wi1, wi2 and wi3 are the same. There are some consequences, though, because the row names of the "pdata.frame" correspond to group-time:
head(Grunfeld3, 3)
# inv value capital year firm
# 1-1935 317.6 3078.5 2.8 1935 1
# 1-1936 391.8 4661.7 52.6 1936 1
# 1-1937 410.6 5387.1 156.9 1937 1
Thus, all.equal reports string mismatches,
all.equal(wi2, wi3)
# [1] "Component “residuals”: Names: 200 string mismatches"
# [2] "Component “model”: Attributes: < Component “row.names”: 200 string mismatches >"
# [3] "Component “call”: target, current do not match when deparsed"
the values are the same, though:
head(wi2$residuals)
# 1 2 3 4 5 6
# 41.10980 -69.68476 -152.11391 -19.73566 -93.36168 -28.48560
head(wi3$residuals)
# 1-1935 1-1936 1-1937 1-1938 1-1939 1-1940
# 41.10980 -69.68476 -152.11391 -19.73566 -93.36168 -28.48560
head(wi2$model, 3)
# inv value capital
# 1 317.6 3078.5 2.8
# 2 391.8 4661.7 52.6
# 3 410.6 5387.1 156.9
head(wi3$model, 3)
# inv value capital
# 1-1935 317.6 3078.5 2.8
# 1-1936 391.8 4661.7 52.6
# 1-1937 410.6 5387.1 156.9
wi2$call
# plm(formula = inv ~ value + capital, data = Grunfeld2, effect = "twoways",
# model = "within", index = c("firm", "year"))
wi3$call
# plm(formula = inv ~ value + capital, data = Grunfeld3, effect = "twoways",
# model = "within")
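If the goal is only to confirm that the two fits agree numerically, the naming differences can be sidestepped by comparing the coefficients and the unnamed residuals. This is a sketch under the same assumptions as above (plm installed, objects rebuilt here so the block is self-contained).

```r
library(plm)
data("Grunfeld", package = "plm")

Grunfeld2 <- Grunfeld[c(3:5, 2, 1)]
Grunfeld3 <- pdata.frame(Grunfeld2, index = c("firm", "year"))

# Index given in the plm() call vs. stored in the pdata.frame
wi2 <- plm(inv ~ value + capital, index = c("firm", "year"),
           data = Grunfeld2, model = "within", effect = "twoways")
wi3 <- plm(inv ~ value + capital,
           data = Grunfeld3, model = "within", effect = "twoways")

# Coefficients carry the same names in both fits, so they compare directly
all.equal(coef(wi2), coef(wi3))

# as.numeric() drops names and other attributes, leaving only the values
all.equal(as.numeric(residuals(wi2)), as.numeric(residuals(wi3)))
```

Both comparisons should return TRUE, which matches the identical values shown in the outputs above.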
Upvotes: 2