Reputation: 939
I have a dataset like this in R:
Date | ID | Age |
2019-11-22 | 1 | 5 |
2018-12-21 | 1 | 4 |
2018-05-09 | 1 | 4 |
2018-05-01 | 2 | 5 |
2017-10-10 | 2 | 4 |
2017-07-21 | 1 | 3 |
How do I change the Age values of each group of ID to the most recent Age record?
Results should look like this:
Date | ID | Age |
2019-11-22 | 1 | 5 |
2018-12-21 | 1 | 5 |
2018-05-09 | 1 | 5 |
2018-05-01 | 2 | 5 |
2017-10-10 | 2 | 5 |
2017-07-21 | 1 | 5 |
I tried group_by(ID)%>% mutate(Age = max(Date, Age))
but it seems to be giving strange huge numbers for certain cases when I try it on a v huge dataset. What could be going wrong?
Upvotes: 0
Views: 433
Reputation: 7385
I think the issue is in your mutate function:
Try this:
group_by(ID) %>%
arrange(as.date(Date) %>%
mutate(Age = max(Age))
Upvotes: 0
Reputation: 51582
Try sorting first,
df %>%
arrange(as.Date(Date)) %>%
group_by(ID) %>%
mutate(Age = last(Age))
which gives,
# A tibble: 6 x 3 # Groups: ID [2] Date ID Age <fct> <int> <int> 1 2017-07-21 1 5 2 2017-10-10 2 5 3 2018-05-01 2 5 4 2018-05-09 1 5 5 2018-12-21 1 5 6 2019-11-22 1 5
Upvotes: 1