Reputation: 1637
I have a large dataframe containing IDs and a start date of intervention for each ID:
ID Date
1 1 17228
2 2 17226
3 3 17230
And I would like to add 2 rows to each ID with subsequent dates as the values in those rows:
ID Date
1 1 17228
2 1 17229
3 1 17230
4 2 17226
5 2 17227
6 2 17228
7 3 17230
8 3 17231
9 3 17232
Is there any way using dplyr if possible? Other ways are also fine!
Upvotes: 3
Views: 659
Reputation: 887118
We expand the data by uncount
ing, then grouped by 'ID', get the seq
uence from the first
'Date' to the number of rows (n()
) while incrementing by
1
library(tidyverse)
df1 %>%
uncount(3) %>%
group_by(ID) %>%
mutate(Date = seq(Date[1], length.out = n(), by = 1))
# A tibble: 9 x 2
# Groups: ID [3]
# ID Date
# <int> <dbl>
#1 1 17228
#2 1 17229
#3 1 17230
#4 2 17226
#5 2 17227
#6 2 17228
#7 3 17230
#8 3 17231
#9 3 17232
Or another option is unnest
a list
column
df1 %>%
group_by(ID) %>%
mutate(Date = list(Date[1] + 0:2)) %>%
unnest
Or with complete
df1 %>%
group_by(ID) %>%
complete(Date = first(Date) + 0:2)
Or using base R
(pasteing from the comments)
within(df1[rep(seq_len(nrow(df1)), each = 3),], Date <- Date + 0:2)
Or more compactly in data.table
library(data.table)
setDT(df1)[, .(Date = Date + 0:2), ID]
Upvotes: 4
Reputation: 388982
Using dplyr
, we can repeat every row 3 times, group_by
ID
and increment every date from 0 to n() - 1
for each ID
.
library(dplyr)
df %>%
slice(rep(seq_len(n()), each = 3)) %>%
group_by(ID) %>%
mutate(Date = Date + 0: (n() - 1))
# ID Date
# <int> <int>
#1 1 17228
#2 1 17229
#3 1 17230
#4 2 17226
#5 2 17227
#6 2 17228
#7 3 17230
#8 3 17231
#9 3 17232
A base R one-liner using the same logic above would be
transform(df[rep(seq_len(nrow(df)), each = 3),], Date = Date + 0:2)
Upvotes: 0
Reputation: 32548
do.call(rbind, lapply(split(d, d$ID), function(x){
rbind(x, data.frame(ID = rep(tail(x$ID, 1), 2),
Date = tail(x$Date, 1) + 1:2))
}))
# ID Date
#1.1 1 17228
#1.11 1 17229
#1.2 1 17230
#2.2 2 17226
#2.1 2 17227
#2.21 2 17228
#3.3 3 17230
#3.1 3 17231
#3.2 3 17232
Data
d = structure(list(ID = 1:3, Date = c(17228L, 17226L, 17230L)),
class = "data.frame",
row.names = c("1", "2", "3"))
Upvotes: 1