jeff
jeff

Reputation: 335

Replicate a given value for each ID in R

I have a dataset like this:

ID     DATE     NUMBER
1    01-01-20     1
1    15-01-20     2
1    24-02-20     3
2    15-03-20     1
2    20-04-20     2
2    01-05-20     3

and I want to add a new column that replicate the value of the first date for each ID, like:

ID     DATE     NUMBER   NEW_DATE
1    01-01-20     1      01-01-20
1    15-01-20     2      01-01-20
1    24-02-20     3      01-01-20
2    15-03-20     1      15-03-20
2    20-04-20     2      15-03-20
2    01-05-20     3      15-03-20

How can I do it? Thanks everyone

Upvotes: 1

Views: 233

Answers (2)

akrun
akrun

Reputation: 886948

We can use first in dplyr

df1 %>%
     group_by(ID) %>%
     mutate(NEW_DATE = first(DATE))

Or another option in duplicated

df1$NEW_DATE <- df1$DATE[!duplicated(df1$ID)][cumsum(!duplicated(df1$ID))]

data

df1 <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L), DATE = structure(c(1L, 
3L, 6L, 4L, 5L, 2L), .Label = c("01-01-20", "01-05-20", "15-01-20", 
"15-03-20", "20-04-20", "24-02-20"), class = "factor"), NUMBER = c(1L, 
2L, 3L, 1L, 2L, 3L)), class = "data.frame", row.names = c(NA, 
-6L))

Upvotes: 0

Rui Barradas
Rui Barradas

Reputation: 76402

Here is a base R solution. ave groups column DATE by ID and extracts the first element of each grouped vector.

df1$NEW_DATE <- with(df1, ave(DATE, ID, FUN = '[', 1))

#  ID     DATE NUMBER NEW_DATE
#1  1 01-01-20      1 01-01-20
#2  1 15-01-20      2 01-01-20
#3  1 24-02-20      3 01-01-20
#4  2 15-03-20      1 15-03-20
#5  2 20-04-20      2 15-03-20
#6  2 01-05-20      3 15-03-20

Data

df1 <- read.table(text = "
ID     DATE     NUMBER
1    01-01-20     1
1    15-01-20     2
1    24-02-20     3
2    15-03-20     1
2    20-04-20     2
2    01-05-20     3
", header = TRUE)

Upvotes: 2

Related Questions