이찬중

Reputation: 1

How can I fill NAs with the group-wise median or mode, using multiple grouping columns, in R?

I need to fill the NAs in a data frame, within groups defined by two or three columns, using the median or the mode in R.

Specifically, I want to impute NAs with the group median for numeric variables and with the group mode for factor variables.

I searched the site but could not find a suitable answer.

Some of the answers suggested imputing all NAs at once or handling only one variable at a time. My data frame has more than 40 columns.

If anybody can show a clear way to do this, I would be very grateful.

Here's my rough code, which does not work.

fillna_cols <- c(d,e,f,g,h...)

df %>% 
  group_by(a,b,c) %>% 
  mutate_at(fillna_cols, na.aggregate(df,FUN = median))

Upvotes: 0

Views: 227

Answers (1)

MatthewR

Reputation: 2770

Fabricating some data

mtcars[ c(4,5,9) , "wt" ] <- NA

Take a look

head( mtcars)

Overwrite the missing values with the overall mean

mtcars[ is.na( mtcars$wt) , "wt"] <- mean( mtcars$wt , na.rm=TRUE)

Or, instead, with the median within each group (here, each level of am)

mtcars[ is.na( mtcars$wt) & mtcars$am %in% 0 , "wt"] <- median( mtcars[ mtcars$am %in% 0 , "wt"] , na.rm=TRUE)

mtcars[ is.na( mtcars$wt) & mtcars$am %in% 1 , "wt"] <- median( mtcars[ mtcars$am %in% 1 , "wt"] , na.rm=TRUE)
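Writing one line per group level gets tedious once there are several grouping columns. A minimal base R sketch of the same idea, assuming the goal is a per-cell median over `am` and `cyl`, uses `ave()` to compute the group median for every row in one call. The names `cars` and `grp_median` are made up for this illustration, and the copy is there only so that `mtcars` itself is left untouched.

```r
# Work on a copy so the built-in data set is not modified
cars <- mtcars
cars[c(4, 5, 9), "wt"] <- NA

# Median of wt within each am/cyl cell, repeated for every row of that cell
grp_median <- ave(cars$wt, cars$am, cars$cyl,
                  FUN = function(x) median(x, na.rm = TRUE))

# Keep observed values; use the cell median only where wt is missing
cars$wt <- ifelse(is.na(cars$wt), grp_median, cars$wt)
```

If a cell contains only NAs, its median is NA as well, so those rows stay missing.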

Or a data.table solution

library( data.table)
mtcars <- data.table( mtcars)
# median of wt within each cyl/am cell
mtcars[ , wt_median := median( wt , na.rm=TRUE) , by= .(cyl, am)]
# keep observed wt, use the cell median where wt is missing
mtcars[ , impwt := ifelse( is.na( wt) , wt_median , wt) ]
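The question asks for many columns at once, with the median for numeric columns and the mode for factor columns. The following base R sketch shows one way that could look. It is not taken from the answer above: `mode_value()` and `impute_by_group()` are hypothetical helpers, the toy data frame is invented, and ties for the mode are broken by first occurrence, which is an assumption rather than something the question specifies.

```r
# Hypothetical helper: most frequent non-missing value (first one wins on ties)
mode_value <- function(x) {
  x_ok <- x[!is.na(x)]
  ux <- unique(x_ok)
  ux[which.max(tabulate(match(x_ok, ux)))]
}

# Hypothetical helper: fill NAs in fill_cols within each combination of group_cols
impute_by_group <- function(df, group_cols, fill_cols) {
  # Row indices for each observed combination of the grouping columns
  groups <- split(seq_len(nrow(df)), df[group_cols], drop = TRUE)
  for (col in fill_cols) {
    for (rows in groups) {
      x <- df[[col]][rows]
      # Skip groups with nothing to fill, or with no observed value to fill from
      if (!anyNA(x) || all(is.na(x))) next
      fill <- if (is.numeric(x)) median(x, na.rm = TRUE) else mode_value(x)
      df[[col]][rows[is.na(x)]] <- fill
    }
  }
  df
}

# Toy data: a and b are grouping columns, d is numeric, e is a factor
df <- data.frame(
  a = c("x", "x", "x", "y", "y", "y"),
  b = c(1, 1, 1, 2, 2, 2),
  d = c(1, NA, 3, 10, 20, NA),
  e = factor(c("lo", "lo", NA, "hi", NA, "hi"))
)

filled <- impute_by_group(df, group_cols = c("a", "b"), fill_cols = c("d", "e"))
```

Rows with an NA in one of the grouping columns belong to no group here and are left as they are. For a data frame with 40 or more columns, `fill_cols` can be built from the column names instead of being typed out, for example with `setdiff(names(df), group_cols)`.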

Upvotes: 2
