Ophelia

Reputation: 29

Replace NA with the minimum group value in R

I'm struggling with transforming my data and would appreciate some help.

year name start
2010 Emma 1998
2011 Emma 1998
2012 Emma 1998
2009 John na
2010 John na
2012 John na
2007 Louis na
2012 Louis na

The aim is to replace every NA in start with the minimum value of year within the same name group, so that the data looks like this:

year name start
2010 Emma 1998
2011 Emma 1998
2012 Emma 1998
2009 John 2009
2010 John 2009
2012 John 2009
2007 Louis 2007
2012 Louis 2007

Note: within each name group, either all start values are NA or none are.

I tried to use

mydf %>%
  group_by(name) %>%
  mutate(start = ifelse(is.na(start), min(year, na.rm = TRUE), start))

but got this error

x `start` must return compatible vectors across groups

There are a lot of similar questions here. Some of the answers use the ave function or data.table, neither of which seems to fit my problem.

My base function must be something like

df$A <- ifelse(is.na(df$A), df$B, df$A)

However, I can't seem to combine it properly with min() and group_by().

Thank you for any help

Upvotes: 0

Views: 925

Answers (2)

akrun

Reputation: 886938

We can use na.aggregate

library(dplyr)
library(zoo)
dat %>%
   group_by(name) %>%
   mutate(start = na.aggregate(na_if(start, "na"), FUN = min))
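As a small illustration of the na_if() step used above, the sketch below shows how the string "na" becomes a real NA before any conversion to numbers. The vector start_chr is a made-up example, not the asker's actual column.

```r
library(dplyr)

# Hypothetical character vector in which missing values are stored as the string "na"
start_chr <- c("1998", "na", "na")

# na_if() replaces every element equal to "na" with a real NA
start_na <- na_if(start_chr, "na")

# With real NAs in place, the column can be converted to numeric without warnings
start_num <- as.numeric(start_na)
```

After this step, is.na(start_num) is TRUE exactly where the original vector held "na".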

Upvotes: 1

user438383

Reputation: 6206

I changed the colname to 'Year' because it was colliding to

dat %>% 
    dplyr::group_by(name) %>% 
    dplyr::mutate(start = dplyr::if_else(start == "na", min(Year), start))
# A tibble: 8 x 3
# Groups:   name [3]
  Year  name  start
  <chr> <chr> <chr>
1 2010  Emma  1998 
2 2011  Emma  1998 
3 2012  Emma  1998 
4 2009  John  2009 
5 2010  John  2009 
6 2012  John  2009 
7 2007  Louis 2007 
8 2012  Louis 2007 
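
If the columns are numeric and the missing values are real NA rather than the string "na" (an assumption, since the question does not show the column types), a minimal sketch with dplyr::coalesce() could look like the following. The tibble below is a hypothetical reconstruction of the question's data, and it keeps the original column name year.

```r
library(dplyr)

# Hypothetical reconstruction of the question's data, with numeric columns
# and real NA values instead of the string "na"
dat <- tibble(
  year  = c(2010, 2011, 2012, 2009, 2010, 2012, 2007, 2012),
  name  = c("Emma", "Emma", "Emma", "John", "John", "John", "Louis", "Louis"),
  start = c(1998, 1998, 1998, NA, NA, NA, NA, NA)
)

result <- dat %>%
  group_by(name) %>%
  # coalesce() keeps start where it is present and otherwise
  # falls back to the minimum year of the current group
  mutate(start = coalesce(start, min(year))) %>%
  ungroup()
```

Because start and min(year) are both doubles here, every group returns the same type, which is the condition that dplyr checks across groups.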

Upvotes: 1
