Reputation: 93
I have a panel data set that looks like this:
df1 <- data.frame(date = c("2020-01-01", "2020-01-02", "2020-01-03", "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-01", "2020-01-02", "2020-01-03"),
ID = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
price = c(102, 103, 107, 95, 96, 98, 77, 76, 72),
dummy = c(0, 1, 0, 0, 1, 0, 0, 1, 0))
date ID price dummy
1 2020-01-01 A 102 0
2 2020-01-02 A 103 1
3 2020-01-03 A 107 0
4 2020-01-01 B 95 0
5 2020-01-02 B 96 1
6 2020-01-03 B 98 0
7 2020-01-01 C 77 0
8 2020-01-02 C 76 1
9 2020-01-03 C 72 0
I have turned it into panel data using the following code:
df1 <- pdata.frame(df1, index = c("price", "date")) #changed to panel data
df1 <- tibble::rownames_to_column(df1, "date2") #turned numbered row names into column
df1 <- df1 %>%
arrange(ID, date) #ordered first by ID, then date
I now want to run a fixed-effects linear regression, essentially mirroring the xtreg, fe command in Stata.
I have tried the following code, but keep receiving error messages:
fixed <- plm(price ~ dummy,
data = treated_panel,
model = "within")
Error in as.character.factor(x) : malformed factor
How can I run a fixed-effect regression on my panel data?
Upvotes: 1
Views: 393
Reputation: 72813
Unlike in Stata, with plm you can specify the unit and time variables directly in the index= argument, which saves you the tedious step of defining a pdata.frame first. Note that you need effect='twoways' if you want both unit and time fixed effects.
library(plm)
fit <- plm(price ~ dummy + X, data=df1, index=c('ID', 'date'), model="within", effect='twoways')
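For comparison, Stata's xtreg, fe absorbs only the unit effect unless time dummies are added by hand. A minimal sketch of that one-way model on the question's original toy data follows. The object name fit_unit is made up for illustration, and the data frame is rebuilt with rep() only to keep the example self-contained.

```r
library(plm)

# Toy panel from the question: 3 IDs, each observed on the same 3 dates
df1 <- data.frame(
  date  = rep(c("2020-01-01", "2020-01-02", "2020-01-03"), times = 3),
  ID    = rep(c("A", "B", "C"), each = 3),
  price = c(102, 103, 107, 95, 96, 98, 77, 76, 72),
  dummy = rep(c(0, 1, 0), times = 3)
)

# One-way (unit) fixed effects: effect = "individual" is plm's default,
# spelled out here to contrast with effect = "twoways"
fit_unit <- plm(price ~ dummy, data = df1, index = c("ID", "date"),
                model = "within", effect = "individual")

summary(fit_unit)
```

With only unit effects, dummy still varies within each ID over time, so its coefficient can be estimated without adding a further regressor.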
To get robust standard errors, the summary method for plm has a vcov= argument.
summary(fit, vcov=plm::vcovHC(fit))
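Note that I added an X variable to the toy data to make this work: dummy takes the same value for every ID on each date, so the time fixed effects absorb it completely.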
Data:
df1 <- structure(list(date = c("2020-01-01", "2020-01-02", "2020-01-03",
"2020-01-01", "2020-01-02", "2020-01-03", "2020-01-01", "2020-01-02",
"2020-01-03"), ID = c("A", "A", "A", "B", "B", "B", "C", "C",
"C"), price = c(102, 103, 107, 95, 96, 98, 77, 76, 72), X = c(0.391173265408725,
0.35144685767591, 0.0459533138200641, 0.626689063152298, 0.523385446285829,
0.945963381789625, 0.935278508113697, 0.289080709218979, 0.111053846077994
), dummy = c(0, 1, 0, 0, 1, 0, 0, 1, 0)), class = "data.frame", row.names = c(NA,
-9L))
Upvotes: 1