Reputation: 463
I am currently working in R in data table and have two date columns (first_day and last_day) that I need to merge together into one date column (date). This needs to be done so that the column has the dates in order from smallest to largest by group (id).
Here is the data I'm working with:
ID first_day last_day
1 1/12/2005 1/15/2005
2 2/15/2006 2/19/2006
2 3/8/2006 3/12/2006
3 1/9/2008 1/13/2008
Here is what I'm trying to get the result to look like:
ID first_day last_day Date
1 1/12/2005 1/15/2005 1/12/2005
1 1/12/2005 1/15/2005 1/15/2005
2 2/15/2006 2/19/2006 2/15/2006
2 2/15/2006 2/19/2006 2/19/2006
2 3/8/2006 3/12/2006 3/8/2006
2 3/8/2006 3/12/2006 3/12/2006
3 1/9/2008 1/13/2008 1/9/2008
3 1/9/2008 1/13/2008 1/13/2008
Any assistance is greatly appreciated!
Upvotes: 1
Views: 832
Reputation: 887851
We can group by row, concatenate the 'first_day', 'last_day' into a list
column and do unnest
library(dplyr)
library(purrr)
library(tidyr)
df1 %>%
rowwise() %>%
mutate(Date = list(c(first_day, last_day))) %>%
unnest(Date)
Or with map2
, create the list
column and then do the unnest
df1 %>%
mutate(Date = map2(first_day, last_day, c)) %>%
unnest(Date)
# A tibble: 8 x 4
# ID first_day last_day Date
# <int> <chr> <chr> <chr>
#1 1 1/12/2005 1/15/2005 1/12/2005
#2 1 1/12/2005 1/15/2005 1/15/2005
#3 2 2/15/2006 2/19/2006 2/15/2006
#4 2 2/15/2006 2/19/2006 2/19/2006
#5 2 3/8/2006 3/12/2006 3/8/2006
#6 2 3/8/2006 3/12/2006 3/12/2006
#7 3 1/9/2008 1/13/2008 1/9/2008
#8 3 1/9/2008 1/13/2008 1/13/2008
Or using uncount
uncount(df1, 2) %>%
mutate(Date = c(t(df1[-1])))
df1 <- structure(list(ID = c(1L, 2L, 2L, 3L), first_day = c("1/12/2005",
"2/15/2006", "3/8/2006", "1/9/2008"), last_day = c("1/15/2005",
"2/19/2006", "3/12/2006", "1/13/2008")), class = "data.frame",
row.names = c(NA,
-4L))
Upvotes: 3
Reputation: 102625
Here is another base R solution:
dfout <- Reduce(rbind,
lapply(split(df,seq(nrow(df))),
function(x) cbind(x,Date = unlist(x[-1]),row.names = NULL)))
or a faster way using do.call()
dfout <- do.call(rbind,
c(make.row.names = FALSE,
lapply(split(df,seq(nrow(df))),
function(x) cbind(x,Date = unlist(x[-1]),row.names = NULL))))
such that
> dfout
ID first_day last_day Date
1 1 1/12/2005 1/15/2005 1/12/2005
2 1 1/12/2005 1/15/2005 1/15/2005
3 2 2/15/2006 2/19/2006 2/15/2006
4 2 2/15/2006 2/19/2006 2/19/2006
5 2 3/8/2006 3/12/2006 3/8/2006
6 2 3/8/2006 3/12/2006 3/12/2006
7 3 1/9/2008 1/13/2008 1/9/2008
8 3 1/9/2008 1/13/2008 1/13/2008
Upvotes: 2
Reputation: 28705
out <- df[rep(1:nrow(df), each = 2),] # repeat each row
out$Date <- with(df, c(rbind(first_day, last_day))) # interlace two columns
# or out$Date <- with(out, ifelse(seq_along(first_day) %% 2, first_day, last_day))
out
# ID first_day last_day Date
# 1 1 1/12/2005 1/15/2005 1/12/2005
# 1.1 1 1/12/2005 1/15/2005 1/15/2005
# 2 2 2/15/2006 2/19/2006 2/15/2006
# 2.1 2 2/15/2006 2/19/2006 2/19/2006
# 3 2 3/8/2006 3/12/2006 3/8/2006
# 3.1 2 3/8/2006 3/12/2006 3/12/2006
# 4 3 1/9/2008 1/13/2008 1/9/2008
# 4.1 3 1/9/2008 1/13/2008 1/13/2008
or
library(data.table)
setDT(df)
df[, .(Date = c(first_day, last_day)), by = .(ID, first_day, last_day)]
# ID first_day last_day Date
# 1: 1 1/12/2005 1/15/2005 1/12/2005
# 2: 1 1/12/2005 1/15/2005 1/15/2005
# 3: 2 2/15/2006 2/19/2006 2/15/2006
# 4: 2 2/15/2006 2/19/2006 2/19/2006
# 5: 2 3/8/2006 3/12/2006 3/8/2006
# 6: 2 3/8/2006 3/12/2006 3/12/2006
# 7: 3 1/9/2008 1/13/2008 1/9/2008
# 8: 3 1/9/2008 1/13/2008 1/13/2008
Upvotes: 6