Nick M.

Reputation: 27

Flag first year within a group in R

I have a data frame structured as follows:

+----------+------+
| ID       | year |
+----------+------+
| 1        | 2002 |
| 1        | 2003 |
| 1        | 2004 |
| 2        | 2015 |
| 2        | 2016 |
| 2        | 2017 |
| 2        | 2018 |
| 3        | 2004 |
| 3        | 2005 |
+----------+------+

I would like to add a variable which flags the first (or earliest) occurrence within ID to get the following:

+----------+------+------+
| ID       | year | flag | 
+----------+------+------+
| 1        | 2002 | 1    |
| 1        | 2003 | 0    | 
| 1        | 2004 | 0    |
| 2        | 2015 | 1    |
| 2        | 2016 | 0    |
| 2        | 2017 | 0    |
| 2        | 2018 | 0    |
| 3        | 2004 | 1    | 
| 3        | 2005 | 0    |
+----------+------+------+

Is there an easy way to do this in dplyr?

Upvotes: 0

Views: 192

Answers (2)

ThomasIsCoding

Reputation: 101628

Another base R option uses ave to flag the first row of each ID. This assumes the rows are already sorted by year within each ID.

transform(
  df,
  # seq_len() is safe even when df has zero rows; ave() returns the flag as 1/0
  flag = ave(seq_len(nrow(df)), ID, FUN = function(x) seq_along(x) == 1)
)
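
Since the ave call above depends on row order, here is a minimal sketch of an order-independent variant that compares each year with its group minimum. The column names come from the question; the sample data frame below is made up for illustration, with rows deliberately shuffled.

```r
# Hypothetical sample data, shuffled so the earliest year
# is not the first row of each ID
df <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 3, 3),
  year = c(2003, 2002, 2004, 2016, 2015, 2005, 2004)
)

# Compare each year with the minimum year of its ID group;
# ave() writes the logical result back into a numeric vector, giving 1/0
df$flag <- ave(df$year, df$ID, FUN = function(x) x == min(x))
df
```

If an ID has its minimum year on more than one row, every such row is flagged.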

Upvotes: 0

akrun

Reputation: 887153

With dplyr, we can group by 'ID', build a logical vector by comparing 'year' with its group minimum, and coerce it to binary (1/0) with unary +

library(dplyr)

df1 %>%
   group_by(ID) %>%
   mutate(flag = +(year == min(year)))

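If the data is already ordered by 'year' within each 'ID', flagging the first row of each 'ID' is enough (this gives TRUE/FALSE; wrap it in + to get 1/0)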

df1 %>%
    mutate(flag = !duplicated(ID))

Or the base R equivalent, under the same assumption that 'year' is already ordered within each 'ID'

df1$flag <- !duplicated(df1$ID)
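
To try the approaches above end to end, here is a minimal self-contained sketch. It rebuilds the question's data as `df1` (the name used in this answer) and checks that the grouped-minimum and duplicated approaches agree on it. The intermediate names `by_min` and `by_dup` are only for illustration.

```r
library(dplyr)

# Data from the question, already ordered by year within ID
df1 <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  year = c(2002, 2003, 2004, 2015, 2016, 2017, 2018, 2004, 2005)
)

# Approach 1: flag rows whose year equals the group minimum
by_min <- df1 %>%
  group_by(ID) %>%
  mutate(flag = +(year == min(year))) %>%
  ungroup()

# Approach 2: flag the first row of each ID (relies on the ordering)
by_dup <- df1 %>%
  mutate(flag = +(!duplicated(ID)))

# Both flags should match on ordered data without repeated years
identical(by_min$flag, by_dup$flag)
```

The two differ when rows are unordered, or when an ID repeats its earliest year: the minimum comparison flags every tied row, while duplicated flags only the first row of the ID.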

Upvotes: 4
