Reputation: 73
I am running tests on missing data using the finalfit package.
I have a dataset with 11,046 observations and 27 variables. I have more than one dependent variable because I later need to fit a confirmatory factor analysis with lavaan. The dataset can be found here.
explanatory_edu <- c("ch_edu", "a4g_4")
dependent <- "br_logical"
sl_cfa %>%
missing_compare(dependent, explanatory_edu)
I get the following error message:
Error in factor(g, levels = unique(g)) : object 'g' not found
What is this g object the error is referring to?
This is the output of ff_glimpse
> sl_cfa %>%
+ ff_glimpse(dependent, explanatory_edu)
Continuous
# A tibble: 11,046 x 0

Categorical
           label      var_type n     missing_n missing_percent levels_n levels levels_count levels_percent
br_logical br_logical <lgl>    6398  4648      42.1            2        -      -            -
ch_edu     ch_edu     <lgl>    11046 0         0.0             2        -      -            -
a4g_4      a4g_4      <lgl>    8723  2323      21.0            2        -      -            -
Not sure it is helpful, but I do not get any error with missing_pairs
sl_cfa %>%
missing_pairs(dependent, explanatory_edu, position = "fill")
which gives me this plot, where we can see that ch_edu seems to be MAR and a4g_4 seems to be MCAR.
PS: Would anyone with adequate reputation please create tags for finalfit and the missing_compare function? Many thanks.
Upvotes: 1
Views: 340
Reputation: 1381
Thanks.
An underlying function used here is being re-written and the problems caused by tibbles will likely go away.
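As a rough illustration of why a tibble can trip up code written for plain data frames, here is a minimal sketch. The `toy` object and its values are hypothetical stand-ins for `sl_cfa`. The subsetting difference shown is one well-known tibble/data frame discrepancy, not a confirmed diagnosis of the `g` error.

```r
library(tibble)

# Hypothetical toy stand-in for sl_cfa (not the real data)
toy <- tibble(
  br_logical = c(TRUE, NA, FALSE, NA),
  ch_edu     = c(TRUE, FALSE, TRUE, FALSE)
)

# A tibble carries extra classes on top of "data.frame"
print(class(toy))

# Converting drops those classes, leaving a plain data frame
toy_df <- as.data.frame(toy)
print(class(toy_df))

# Single-column extraction with [ , ] behaves differently:
# a tibble returns a one-column tibble, a data frame returns a vector
print(is.data.frame(toy[, "ch_edu"]))
print(is.logical(toy_df[, "ch_edu"]))
```

Running `class(sl_cfa)` on the real object should show whether it is a tibble in the first place.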
For now you can:
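library(finalfit)
explanatory_edu <- c("ch_edu", "a4g_4")
dependent <- "br_logical"
# Convert the tibble to a plain data frame before calling missing_compare()
sl_cfa %>%
  data.frame() %>%
  missing_compare(dependent, explanatory_edu)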
#> Missing data analysis: br_logical        Not missing  Missing      p
#>                            ch_edu FALSE  4203 (65.7)  1730 (37.2)  <0.001
#>                                   TRUE   2195 (34.3)  2918 (62.8)
#>                             a4g_4 FALSE  2548 (50.5)  1774 (48.2)  0.034
#>                                   TRUE   2496 (49.5)  1905 (51.8)
As you mention, missingness in br_logical is strongly associated with ch_edu (p < 0.001).
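To make the table easier to read, here is a base R sketch that rebuilds the ch_edu rows from the counts printed above. It cross-tabulates ch_edu against missingness in br_logical, takes column percentages, and runs a chi-squared test. The object names are illustrative, and the choice of `correct = FALSE` is an assumption about how the printed p-value was computed, not something confirmed from the finalfit source.

```r
# Counts copied from the ch_edu rows of the missing_compare output.
# matrix() fills by column, so each pair below is one column.
ch_edu_counts <- matrix(
  c(4203, 2195,   # br_logical not missing: ch_edu FALSE, TRUE
    1730, 2918),  # br_logical missing:     ch_edu FALSE, TRUE
  nrow = 2,
  dimnames = list(ch_edu = c("FALSE", "TRUE"),
                  br_logical = c("Not missing", "Missing"))
)

# Column percentages: the figures shown in brackets in the table
col_pct <- round(100 * prop.table(ch_edu_counts, margin = 2), 1)
print(col_pct)

# Chi-squared test of independence between ch_edu and missingness
# in br_logical. correct = FALSE (no continuity correction) is an assumption.
test <- chisq.test(ch_edu_counts, correct = FALSE)
print(test$p.value)
```

On the raw data, the same cross-tabulation could be built with `table(sl_cfa$ch_edu, is.na(sl_cfa$br_logical))`.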
Upvotes: 1