nzhanggh

Reputation: 131

MICE imputation troubleshooting in R with Categorical Variables

I have been experimenting with MICE on data from Kaggle but am having trouble imputing a categorical variable. I am working with the animal bites dataset - https://www.kaggle.com/rtatman/animal-bites - and trying to impute the species (SpeciesIDDesc). However, none of the NA values change after I run MICE. Below is the code I have right now.

library(tidyverse)
library(lubridate)
library(mice)

#kaggle link with data - https://www.kaggle.com/rtatman/animal-bites
data <- read_csv("Health_AnimalBites.csv", 
                 col_types = list(BreedIDDesc = col_character(), 
                                  release_date = col_datetime()))

data_mice_one <- data %>%
  filter(!is.na(victim_zip), 
         !is.na(bite_date), 
         !is.na(WhereBittenIDDesc)) %>%
  mutate(month = month(bite_date, label = TRUE)) %>%
  select(SpeciesIDDesc, 
         victim_zip, 
         month)

imputed_data_one <- mice(data_mice_one, diagnostics = FALSE, remove_collinear = FALSE, meth="polyreg")
imputed_data_one <- complete(imputed_data_one)
view(imputed_data_one)

sum(is.na(imputed_data_one$SpeciesIDDesc))

I also get a warning after running the mice() call: "Warning message: Number of logged events: 2". When I look at the logged events, this is what I get:
  it im dep     meth           out
1  0  0     constant SpeciesIDDesc
2  0  0     constant    victim_zip

How do I fix my code? Am I using MICE incorrectly?

Upvotes: 1

Views: 1394

Answers (1)

nzhanggh

Reputation: 131

I just realized I forgot to convert SpeciesIDDesc and month into factors. The code works now.
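For anyone hitting the same thing, here is a minimal sketch of the fix on made-up data. The toy data frame, its values, and the small m/maxit settings are assumptions for illustration only, not the real Kaggle data or my exact settings. As far as I can tell, mice logs character columns as "constant" and leaves them out of the imputation, which would explain why the NAs stayed; converting them to factors first lets polyreg do its job.

```r
library(mice)

set.seed(1)
n <- 200

# Hypothetical stand-in for the three columns used in the question;
# everything starts out as character, as read_csv would produce.
toy <- data.frame(
  SpeciesIDDesc = sample(c("DOG", "CAT", "BAT"), n, replace = TRUE),
  victim_zip    = sample(c("40202", "40211", "40229"), n, replace = TRUE),
  month         = sample(month.abb[1:4], n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Knock out some species values so there is something to impute
toy$SpeciesIDDesc[sample(n, 30)] <- NA

# The actual fix: turn the character columns into factors before mice()
toy[] <- lapply(toy, factor)

# Small m and maxit only to keep the example quick
imp <- mice(toy, method = "polyreg", m = 2, maxit = 2,
            printFlag = FALSE, seed = 1)
completed <- complete(imp)

# Count the NAs left in the imputed column
sum(is.na(completed$SpeciesIDDesc))
```

In the original pipeline the same conversion could go inside the existing mutate() step, for example with as.factor() on SpeciesIDDesc and month, before the data frame is passed to mice().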

Upvotes: 3
