Reputation: 31
I'm working on a regression of how income is affected by the occurrence of natural disasters. After running autocorrelation and heteroskedasticity tests, I need to apply Newey-West (NW) standard errors to my model.
My model output from stargazer omits the coefficients on the interaction terms associated with the Flood dummy (see the regression output below the code).
I also tried the model with only the Flood dummy interaction terms, and these were all significant. When not interacted, both terms are also significant.
Any help on how to get the coefficients on these would be highly appreciated!
Here is my code:
model_pooled_ols_5e <- lm(
  log_Average_Wage_and_Salary_Income ~
    Lagged1y_Disaster_Dummy_Storm + Lagged1y_Disaster_Dummy_Tropical_Cyclone_Storm +
    Lagged1y_Disaster_Dummy_Wildfire + Lagged1y_Disaster_Dummy_Flood +
    Intensity_Score +
    Duration_Under_1_Week_Dummy + Duration_1_to_3_Weeks_Dummy +
    Duration_1_Month_Dummy + Duration_Over_1_Month_Dummy +
    Inner_Regional_Dummy + Major_Cities_Dummy +
    Outer_Regional_Dummy + Remote_Very_Remote_Dummy +
    Interaction_Storm_Inner_Regional + Interaction_Storm_Major_Cities +
    Interaction_Storm_Outer_Regional + Interaction_Storm_Remote_Very_Remote +
    Interaction_Tropical_Cyclone_Inner_Regional + Interaction_Tropical_Cyclone_Major_Cities +
    Interaction_Tropical_Cyclone_Outer_Regional + Interaction_Tropical_Cyclone_Remote_Very_Remote +
    Interaction_Wildfire_Inner_Regional + Interaction_Wildfire_Major_Cities +
    Interaction_Wildfire_Outer_Regional + Interaction_Wildfire_Remote_Very_Remote +
    Interaction_Flood_Inner_Regional + Interaction_Flood_Major_Cities +
    Interaction_Flood_Outer_Regional + Interaction_Flood_Remote_Very_Remote +
    Year,
  data = Merged_ABS_EMDAT_simple_duplicates_cleaned_v5
)
Lagged1y_Disaster_Dummy_Storm -0.038
(0.036)
Lagged1y_Disaster_Dummy_Tropical_Cyclone_Storm 0.162***
(0.032)
Lagged1y_Disaster_Dummy_Wildfire 0.017
(0.021)
Lagged1y_Disaster_Dummy_Flood -0.099***
(0.024)
Intensity_Score -0.035
(0.065)
Duration_Under_1_Week_Dummy -0.116***
(0.025)
Duration_1_to_3_Weeks_Dummy -0.063***
(0.023)
Duration_1_Month_Dummy -0.086***
(0.020)
Duration_Over_1_Month_Dummy
Inner_Regional_Dummy 0.019
(0.045)
Major_Cities_Dummy 0.194***
(0.044)
Outer_Regional_Dummy 0.008
(0.044)
Remote_Very_Remote_Dummy
Interaction_Storm_Inner_Regional -0.038
(0.024)
Interaction_Storm_Major_Cities -0.079***
(0.021)
Interaction_Storm_Outer_Regional 0.076**
(0.037)
Interaction_Storm_Remote_Very_Remote -0.158***
(0.048)
Interaction_Tropical_Cyclone_Inner_Regional 0.180***
(0.042)
Interaction_Tropical_Cyclone_Major_Cities 0.023
(0.025)
Interaction_Tropical_Cyclone_Outer_Regional 0.146***
(0.026)
Interaction_Tropical_Cyclone_Remote_Very_Remote 0.319***
(0.069)
Interaction_Wildfire_Inner_Regional 0.054**
(0.026)
Interaction_Wildfire_Major_Cities -0.036
(0.023)
Interaction_Wildfire_Outer_Regional 0.109***
(0.035)
Interaction_Wildfire_Remote_Very_Remote -0.208***
(0.055)
Interaction_Flood_Inner_Regional
Interaction_Flood_Major_Cities
Interaction_Flood_Outer_Regional
Interaction_Flood_Remote_Very_Remote
Year 0.030***
(0.002)
Constant -49.791***
(3.301)
Note: *p<0.1; **p<0.05; ***p<0.01
interaction_terms <- c("Interaction_Flood_Inner_Regional", "Interaction_Flood_Major_Cities", "Interaction_Flood_Outer_Regional", "Interaction_Flood_Remote_Very_Remote")
correlation_matrix_interactions <- correlation_matrix_all[interaction_terms, ]
print(correlation_matrix_interactions)
                                     log_Average_Wage_and_Salary_Income Lagged1y_Disaster_Dummy_Storm
Interaction_Flood_Inner_Regional                            -0.18158278                   -0.03057133
Interaction_Flood_Major_Cities                               0.09542833                   -0.05145906
Interaction_Flood_Outer_Regional                            -0.18727074                   -0.02185281
Interaction_Flood_Remote_Very_Remote                        -0.08479981                   -0.01270038
                                     Lagged1y_Disaster_Dummy_Tropical_Cyclone_Storm
Interaction_Flood_Inner_Regional                                        -0.08403034
Interaction_Flood_Major_Cities                                          -0.14144371
Interaction_Flood_Outer_Regional                                        -0.10310793
Interaction_Flood_Remote_Very_Remote                                    -0.03490910
                                     Lagged1y_Disaster_Dummy_Wildfire Lagged1y_Disaster_Dummy_Flood
Interaction_Flood_Inner_Regional                          -0.09474129                    0.04440241
Interaction_Flood_Major_Cities                            -0.11059196                   -0.08919554
Interaction_Flood_Outer_Regional                          -0.12563254                    0.30334249
Interaction_Flood_Remote_Very_Remote                      -0.04422148                    0.09880484
                                     Intensity_Score Duration_Under_1_Week_Dummy
Interaction_Flood_Inner_Regional         -0.13298999                 0.015142296
Interaction_Flood_Major_Cities           -0.36021593                -0.011496344
Interaction_Flood_Outer_Regional         -0.02933677                -0.055770679
Interaction_Flood_Remote_Very_Remote     -0.03530315                 0.006790861
                                     Duration_1_to_3_Weeks_Dummy Duration_1_Month_Dummy
Interaction_Flood_Inner_Regional                      0.02727324            0.008551952
Interaction_Flood_Major_Cities                        0.20987979           -0.168472347
Interaction_Flood_Outer_Regional                      0.03265276           -0.035494128
Interaction_Flood_Remote_Very_Remote                 -0.05184300            0.014811205
                                     Duration_Over_1_Month_Dummy Inner_Regional_Dummy
Interaction_Flood_Inner_Regional                     -0.07894394           0.70226950
Interaction_Flood_Major_Cities                       -0.15504594          -0.22967180
Interaction_Flood_Outer_Regional                      0.06066184          -0.16742337
Interaction_Flood_Remote_Very_Remote                  0.06464149          -0.05668429
                                     Major_Cities_Dummy Outer_Regional_Dummy Remote_Very_Remote_Dummy
Interaction_Flood_Inner_Regional             -0.3672568          -0.16640793              -0.05299021
Interaction_Flood_Major_Cities                0.4391790          -0.28010542              -0.08919554
Interaction_Flood_Outer_Regional             -0.4506359           0.70655484              -0.06502069
Interaction_Flood_Remote_Very_Remote         -0.1525711          -0.06913159               0.75122639
                                     Interaction_Storm_Inner_Regional Interaction_Storm_Major_Cities
Interaction_Flood_Inner_Regional                          -0.04402616                    -0.12742112
Interaction_Flood_Major_Cities                            -0.07410684                    -0.21448105
Interaction_Flood_Outer_Regional                          -0.05402151 -
Upvotes: 3
Views: 92
Reputation: 6887
Without a minimal working example it is very difficult to give an answer, but I suspect that you are working with categorical variables and that, for each of these, one of the levels is absent from the output. If that is the case, then this is normal, expected behaviour: the "missing" estimates form part of the intercept. Check out the package emmeans.
Here is an example of this in action:
library(ggplot2)
library(emmeans)
set.seed(123)
# Number of observations per group
n <- 30
# Create the 'colour' factor variable
colour <- factor(rep(c("Red", "Green", "Blue"), each = n))
# Simulate the response variable 'value' with different means for each group
value <- c(rnorm(n, mean = 5, sd = 1),  # Red
           rnorm(n, mean = 6, sd = 1),  # Green
           rnorm(n, mean = 7, sd = 1))  # Blue
data <- data.frame(colour, value)
So, we have created a dataset with one independent variable, "colour", which has three levels: Red, Green and Blue. First, we plot the data to visualise the differences (it's always a good idea to visualise your data):
ggplot(data, aes(x = colour, y = value, fill = colour)) +
  geom_boxplot() +
  labs(title = "Boxplot of Simulated Data by Colour",
       x = "Colour",
       y = "Value") +
  theme_minimal()
And now we fit the model:
m0 <- lm(value ~ colour, data = data)
summary(m0)
which produces:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.0244 0.1639 42.869 < 2e-16 ***
colourGreen -0.8461 0.2317 -3.651 0.000445 ***
colourRed -2.0715 0.2317 -8.939 5.98e-14 ***
...and we see that the estimate for Blue is missing.
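To make it concrete that Blue has not been lost, here is a minimal sketch in base R that compares the fitted coefficients with the raw group means. It re-creates the simulated data so it runs on its own, and it assumes R's default treatment contrasts, under which the alphabetically first level (Blue) is the reference. The name `group_means` is introduced only for this illustration.

```r
# Re-create the simulated data so this block is self-contained
set.seed(123)
n <- 30
colour <- factor(rep(c("Red", "Green", "Blue"), each = n))
value <- c(rnorm(n, mean = 5, sd = 1),  # Red
           rnorm(n, mean = 6, sd = 1),  # Green
           rnorm(n, mean = 7, sd = 1))  # Blue
data <- data.frame(colour, value)

m0 <- lm(value ~ colour, data = data)

# Raw mean of the response within each level of colour
group_means <- tapply(data$value, data$colour, mean)

# The intercept is the mean of the reference level (Blue)
intercept_is_blue <- isTRUE(all.equal(coef(m0)[["(Intercept)"]],
                                      group_means[["Blue"]]))

# The other coefficients are differences from the reference level
green_is_diff <- isTRUE(all.equal(coef(m0)[["colourGreen"]],
                                  group_means[["Green"]] - group_means[["Blue"]]))

print(c(intercept_is_blue = intercept_is_blue, green_is_diff = green_is_diff))
```

Both checks should hold for any one-factor model fitted this way, because with a single categorical predictor the least-squares fit reproduces the group means exactly.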
Now we estimate marginal means:
em_means <- emmeans(m0, ~ colour)
summary(em_means)
which gives us:
colour emmean SE df lower.CL upper.CL
Blue 7.02 0.164 87 6.70 7.35
Green 6.18 0.164 87 5.85 6.50
Red 4.95 0.164 87 4.63 5.28
which recovers the "missing" estimate for Blue.
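The same mechanism can show up when the dummies and interactions are built by hand rather than through a factor, which may be what is happening in the question, although I cannot confirm that without the data. The sketch below uses simulated data with hypothetical names (`region`, `flood`, `d_*`, `fi_*`) that only mimic the structure of the question. It includes a dummy for every region and a flood interaction for every region, so some columns are exact linear combinations of others. `lm()` then reports `NA` for the coefficients it cannot estimate, and `alias()` from base R shows which columns they depend on.

```r
set.seed(1)
n <- 200

# Simulated stand-ins for the variables in the question (assumed structure)
region <- sample(c("Inner", "Major", "Outer", "Remote"), n, replace = TRUE)
flood  <- rbinom(n, 1, 0.4)
y      <- 1 + 0.5 * flood + rnorm(n)

# Hand-made dummies for EVERY region: together they sum to the intercept column
d_inner  <- as.numeric(region == "Inner")
d_major  <- as.numeric(region == "Major")
d_outer  <- as.numeric(region == "Outer")
d_remote <- as.numeric(region == "Remote")

# Hand-made flood interactions for EVERY region: together they sum to flood
fi_inner  <- flood * d_inner
fi_major  <- flood * d_major
fi_outer  <- flood * d_outer
fi_remote <- flood * d_remote

m <- lm(y ~ flood + d_inner + d_major + d_outer + d_remote +
          fi_inner + fi_major + fi_outer + fi_remote)

# Coefficients that lm() could not estimate are returned as NA
dropped <- names(coef(m))[is.na(coef(m))]
print(dropped)

# alias() lists each dropped term as a linear combination of the retained ones
print(alias(m)$Complete)
```

If `alias()` on the real model lists the Flood interactions, then dropping one redundant dummy per group (or letting R build the dummies from factors with `Flood * Region`) is one way to get a full set of estimable coefficients.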
Upvotes: 0