Reputation: 335
I have a dataset like this:
ID DATE NUMBER
1 01-01-20 1
1 15-01-20 2
1 24-02-20 3
2 15-03-20 1
2 20-04-20 2
2 01-05-20 3
and I want to add a new column that replicate the value of the first date for each ID, like:
ID DATE NUMBER NEW_DATE
1 01-01-20 1 01-01-20
1 15-01-20 2 01-01-20
1 24-02-20 3 01-01-20
2 15-03-20 1 15-03-20
2 20-04-20 2 15-03-20
2 01-05-20 3 15-03-20
How can I do it? Thanks everyone
Upvotes: 1
Views: 233
Reputation: 886948
We can use first
in dplyr
df1 %>%
group_by(ID) %>%
mutate(NEW_DATE = first(DATE))
Or another option in duplicated
df1$NEW_DATE <- df1$DATE[!duplicated(df1$ID)][cumsum(!duplicated(df1$ID))]
df1 <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L), DATE = structure(c(1L,
3L, 6L, 4L, 5L, 2L), .Label = c("01-01-20", "01-05-20", "15-01-20",
"15-03-20", "20-04-20", "24-02-20"), class = "factor"), NUMBER = c(1L,
2L, 3L, 1L, 2L, 3L)), class = "data.frame", row.names = c(NA,
-6L))
Upvotes: 0
Reputation: 76402
Here is a base R solution. ave
groups column DATE
by ID
and extracts the first element of each grouped vector.
df1$NEW_DATE <- with(df1, ave(DATE, ID, FUN = '[', 1))
# ID DATE NUMBER NEW_DATE
#1 1 01-01-20 1 01-01-20
#2 1 15-01-20 2 01-01-20
#3 1 24-02-20 3 01-01-20
#4 2 15-03-20 1 15-03-20
#5 2 20-04-20 2 15-03-20
#6 2 01-05-20 3 15-03-20
Data
df1 <- read.table(text = "
ID DATE NUMBER
1 01-01-20 1
1 15-01-20 2
1 24-02-20 3
2 15-03-20 1
2 20-04-20 2
2 01-05-20 3
", header = TRUE)
Upvotes: 2