I need to determine the influence of soil properties (predictors) on soluble heavy metals (response variables) and quantify the proportion of deviance explained by each predictor.
I am new to Stack Overflow and to this statistical approach, so I apologize in advance if my questions are basic.
The simple Spearman correlation provides information about the potential correlation between predictors and the response variable. However, is it true that the results from GAM or GLM are more robust than those from a Spearman correlation?
I attempted to use GAM with a Tweedie distribution. I'm wondering if I can extract the deviance explained by each predictor by simply running the GAM with a single predictor as follows:
> model <- gam(ZnmgKg ~ s(PJRC, k=10), data = lucas.dat, method = 'REML', family= tw(link = "log"))
> summary(model)
Family: Tweedie(p=1.99)
Link function: log
Formula:
ZnmgKg ~ s(PJRC, k = 10)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4782 0.0139 34.39 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(PJRC) 4.795 5.857 179.3 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.244 **Deviance explained = 31.5%**
-REML = 3145.5 Scale est. = 0.45613 n = 2348
> model <- gam(ZnmgKg ~ s(clay, k=10), data = lucas.dat, method = 'REML', family= tw(link = "log"))
> summary(model)
Family: Tweedie(p=1.99)
Link function: log
Formula:
ZnmgKg ~ s(clay, k = 10)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.57645 0.01622 35.55 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(clay) 1.512 1.872 62.03 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.0249 **Deviance explained = 4.38%**
-REML = 3556 Scale est. = 0.61891 n = 2340
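The two single-predictor fits above could be collected in a loop rather than run one by one. This is a minimal sketch on simulated stand-in data, since lucas.dat is not shown here: `dat`, `x1`, `x2` and `y` are hypothetical names, and the value is read from the `dev.expl` element of `summary(model)`.

```r
library(mgcv)

set.seed(1)
n <- 300
# Simulated stand-in for lucas.dat; x1, x2 and y are hypothetical columns
dat <- data.frame(x1 = runif(n), x2 = runif(n))
mu <- exp(0.5 + sin(2 * pi * dat$x1) + 0.3 * dat$x2)
dat$y <- rgamma(n, shape = 2, rate = 2 / mu)  # positive, right-skewed response

predictors <- c("x1", "x2")

# Fit one single-smooth Tweedie GAM per predictor and keep its deviance explained
single_dev <- sapply(predictors, function(v) {
  f <- reformulate(sprintf("s(%s, k = 10)", v), response = "y")
  m <- gam(f, data = dat, method = "REML", family = tw(link = "log"))
  summary(m)$dev.expl
})

print(round(100 * single_dev, 1))  # percent deviance explained, one value per predictor
```

The two fits above report different sample sizes (n = 2348 and n = 2340), so restricting the data to rows that are complete for all predictors would keep the single-predictor values on the same observations. With correlated predictors these values overlap and need not add up to the value from a joint model.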
Is it statistically correct if I simply report in the manuscript that 31.5% of the deviance in ZnmgKg is explained by PJRC and 4.38% by clay, and similarly for the rest of the predictors?
I ran gam.check(model, rep=1000). The tails of the QQ plot (against the "theoretical quantiles") fit poorly, and the "Response vs. Fitted Values" plot does not follow the 1:1 line (see figure below for clay). Does this indicate that the model fits poorly? Does it matter when my goal is not to build a predictive model but to understand the importance of each predictor?
[Figure: gam.check(model, rep=1000) output for the clay model]
While there is an option to include all predictors in the model, it does not provide separate deviance values for each predictor (or I do not know how to extract them!). Is there a proper way to identify how much of the deviance is explained by each predictor?
> model <- gam(ZnmgKg ~ Plant + s(clay, k = 10) + s(pH_CaCl2, k = 10) + s(OC, k = 10) + s(PJRC, k = 10), data = lucas.dat, method = 'REML', family= tw(link = "log"))
> #EVALUATING THE FITTED MODEL
> summary(model)
Family: Tweedie(p=1.99)
Link function: log
Formula:
ZnmgKg ~ Plant + s(clay, k = 10) + s(pH_CaCl2, k = 10) + s(OC,
k = 10) + s(PJRC, k = 10)
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.342545 0.023584 14.524 < 2e-16 ***
PlantB20 0.136277 0.059755 2.281 0.022663 *
PlantB30 -0.001029 0.051469 -0.020 0.984058
PlantB40 0.198377 0.064552 3.073 0.002143 **
PlantB50 0.126916 0.051508 2.464 0.013812 *
PlantB70 0.430686 0.060898 7.072 2.01e-12 ***
PlantB80 0.202794 0.067635 2.998 0.002743 **
PlantE00 0.137403 0.038144 3.602 0.000322 ***
PlantF40 -0.102245 0.066300 -1.542 0.123172
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(clay) 5.962 7.041 8.748 <2e-16 ***
s(pH_CaCl2) 7.244 8.108 39.145 <2e-16 ***
s(OC) 5.870 7.023 9.410 <2e-16 ***
s(PJRC) 5.093 6.194 117.888 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.296 **Deviance explained = 44.1%**
-REML = 2942.1 Scale est. = 0.37912 n = 2340
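One possible way to get a per-term figure out of a joint model like the one above is to refit it with each term dropped and record how much the deviance explained falls. This is only a sketch under stated assumptions: the data are simulated (`dat`, `x1`, `x2`, `y` are hypothetical, not columns of lucas.dat), the helper `dev_expl()` is introduced here for illustration, and both the smoothing parameters and the Tweedie power are re-estimated in every refit, so the differences are approximate.

```r
library(mgcv)

set.seed(1)
n <- 300
# Simulated stand-in for lucas.dat; x1, x2 and y are hypothetical columns
dat <- data.frame(x1 = runif(n), x2 = runif(n))
mu <- exp(0.5 + sin(2 * pi * dat$x1) + 0.3 * dat$x2)
dat$y <- rgamma(n, shape = 2, rate = 2 / mu)

# Hypothetical helper: proportion of the null deviance explained by a fitted gam
dev_expl <- function(m) 1 - m$deviance / m$null.deviance

fit <- function(f) gam(f, data = dat, method = "REML", family = tw(link = "log"))

full  <- fit(y ~ s(x1, k = 10) + s(x2, k = 10))
no_x1 <- fit(y ~ s(x2, k = 10))  # full model without s(x1)
no_x2 <- fit(y ~ s(x1, k = 10))  # full model without s(x2)

# Drop in deviance explained when each term is removed from the full model
drop_contrib <- c(x1 = dev_expl(full) - dev_expl(no_x1),
                  x2 = dev_expl(full) - dev_expl(no_x2))

print(round(100 * dev_expl(full), 1))  # percent explained by the full model
print(round(100 * drop_contrib, 1))    # percent lost when each term is dropped
```

These drop-one values measure what each term adds on top of the others, so with correlated predictors they will generally differ from the single-predictor values and will not sum to the full model's deviance explained. An alternative is to hold the smoothing parameters of the reduced fits at the full model's values through the `sp` argument of `gam()`.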