spidermarn
spidermarn

Reputation: 939

Fill columns with most recent value

I have a dataset like this in R:

Date       | ID  | Age |
2019-11-22 | 1   | 5   |
2018-12-21 | 1   | 4   |
2018-05-09 | 1   | 4   |
2018-05-01 | 2   | 5   |
2017-10-10 | 2   | 4   |
2017-07-21 | 1   | 3   |

How do I change the Age values of each group of ID to the most recent Age record?

Results should look like this:

Date       | ID  | Age |
2019-11-22 | 1   | 5   |
2018-12-21 | 1   | 5   |
2018-05-09 | 1   | 5   |
2018-05-01 | 2   | 5   |
2017-10-10 | 2   | 5   |
2017-07-21 | 1   | 5   |

I tried group_by(ID)%>% mutate(Age = max(Date, Age)) but it seems to be giving strange huge numbers for certain cases when I try it on a v huge dataset. What could be going wrong?

Upvotes: 0

Views: 433

Answers (2)

Matt
Matt

Reputation: 7385

I think the issue is in your mutate function:

Try this:

    group_by(ID) %>% 
arrange(as.date(Date) %>%
mutate(Age = max(Age))

Upvotes: 0

Sotos
Sotos

Reputation: 51582

Try sorting first,

df %>% 
 arrange(as.Date(Date)) %>% 
 group_by(ID) %>% 
 mutate(Age = last(Age))

which gives,

# A tibble: 6 x 3
# Groups:   ID [2]
  Date          ID   Age
  <fct>      <int> <int>
1 2017-07-21     1     5
2 2017-10-10     2     5
3 2018-05-01     2     5
4 2018-05-09     1     5
5 2018-12-21     1     5
6 2019-11-22     1     5

Upvotes: 1

Related Questions