Reputation: 29
I'm struggling to transform my data and would appreciate some help.
year | name | start |
---|---|---|
2010 | Emma | 1998 |
2011 | Emma | 1998 |
2012 | Emma | 1998 |
2009 | John | na |
2010 | John | na |
2012 | John | na |
2007 | Louis | na |
2012 | Louis | na |
The aim is to replace all NAs in start with the minimum value of year within each name group, so that the data looks like this:
year | name | start |
---|---|---|
2010 | Emma | 1998 |
2011 | Emma | 1998 |
2012 | Emma | 1998 |
2009 | John | 2009 |
2010 | John | 2009 |
2012 | John | 2009 |
2007 | Louis | 2007 |
2012 | Louis | 2007 |
Note: within any one name group, either all start values are NA or none are.
I tried to use
mydf %>% group_by(name) %>% mutate(start= ifelse(is.na(start), min(year, na.rm = T), start))
but got this error
x `start` must return compatible vectors across groups
There are a lot of similar questions here. Some people used the ave function or worked with data.table, neither of which seems to fit my problem.
My base approach would be something like
df$A <- ifelse(is.na(df$A), df$B, df$A)
However, I can't seem to combine it properly with min() and group_by().
Thank you for any help
Upvotes: 0
Views: 925
Reputation: 886938
We can use na.aggregate from the zoo package:
library(dplyr)
library(zoo)
dat %>%
  group_by(name) %>%
  mutate(start = na.aggregate(na_if(start, "na"), FUN = min))
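For comparison, here is a minimal sketch that fills from the year column using dplyr alone. It assumes that start holds the literal string "na" (as in the question's table) and that year is numeric; the data frame dat is reconstructed from the question and is not the asker's actual object.

```r
library(dplyr)

# Example data reconstructed from the question; `start` uses the string "na"
dat <- data.frame(
  year  = c(2010, 2011, 2012, 2009, 2010, 2012, 2007, 2012),
  name  = c("Emma", "Emma", "Emma", "John", "John", "John", "Louis", "Louis"),
  start = c("1998", "1998", "1998", "na", "na", "na", "na", "na"),
  stringsAsFactors = FALSE
)

result <- dat %>%
  # Turn the "na" strings into real NA values, then make the column numeric
  mutate(start = as.numeric(na_if(start, "na"))) %>%
  group_by(name) %>%
  # Fill each missing start with the minimum year of its name group
  mutate(start = coalesce(start, min(year))) %>%
  ungroup()
```

Converting start to numeric first keeps both inputs to coalesce() the same type, which may also avoid type-compatibility errors like the one quoted in the question.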
Upvotes: 1
Reputation: 6206
I renamed the column to 'Year' because the original name was causing a collision:
dat %>%
  dplyr::group_by(name) %>%
  dplyr::mutate(start = dplyr::if_else(start == "na", min(Year), start))
# A tibble: 8 x 3
# Groups: name [3]
Year name start
<chr> <chr> <chr>
1 2010 Emma 1998
2 2011 Emma 1998
3 2012 Emma 1998
4 2009 John 2009
5 2010 John 2009
6 2012 John 2009
7 2007 Louis 2007
8 2012 Louis 2007
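Note that every column in the output above is character (<chr>), so min(Year) compares strings. That happens to work for four-digit years, but converting afterwards is safer if the values are used numerically. A minimal sketch, assuming an all-character input like the tibble shown (the dat below is a hypothetical reconstruction):

```r
library(dplyr)

# Hypothetical input mirroring the tibble above: every column is character
dat <- tibble(
  Year  = c("2010", "2011", "2012", "2009", "2010", "2012", "2007", "2012"),
  name  = c("Emma", "Emma", "Emma", "John", "John", "John", "Louis", "Louis"),
  start = c("1998", "1998", "1998", "na", "na", "na", "na", "na")
)

result <- dat %>%
  group_by(name) %>%
  # Same replacement as above, still operating on strings
  mutate(start = if_else(start == "na", min(Year), start)) %>%
  ungroup() %>%
  # Convert the year-like columns to integer once the "na" strings are gone
  mutate(across(c(Year, start), as.integer))
```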
Upvotes: 1