Reputation: 607
I need to calculate the number of days elapsed between multiple dates in two ways and output the results to new columns: i) the number of days that have elapsed since the first date (e.g., RESULTS$FIRST) and ii) the number of days between sequential dates (e.g., RESULTS$BETWEEN). Here is an example with the desired results. Thanks in advance.
library(lubridate)
DATA = data.frame(DATE = mdy(c("7/8/2013", "8/1/2013", "8/30/2013", "10/23/2013",
"12/16/2013", "12/16/2015")))
RESULTS = data.frame(DATE = mdy(c("7/8/2013", "8/1/2013", "8/30/2013", "10/23/2013",
"12/16/2013", "12/16/2015")),
FIRST = c(0, 24, 53, 107, 161, 891), BETWEEN = c(0, 24, 29, 54, 54, 730))
Upvotes: 9
Views: 28604
Reputation: 15458
#Using dplyr package
library(dplyr)
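df1 %>% # your dataframe
  mutate(BETWEEN0 = as.numeric(difftime(DATE, lag(DATE, 1), units = "days")), # days since the previous row
         BETWEEN = ifelse(is.na(BETWEEN0), 0, BETWEEN0), # the first row has no previous date
         FIRST = cumsum(BETWEEN)) %>% # running total = days since the first date
  select(-BETWEEN0)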
DATE BETWEEN FIRST
1 2013-07-08 0 0
2 2013-08-01 24 24
3 2013-08-30 29 53
4 2013-10-23 54 107
5 2013-12-16 54 161
6 2015-12-16 730 891
Upvotes: 17
Reputation: 19950
You can just add each column with the simple difftime and lagged diff calculations.
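# Days since the first date; the first row is 0 by definition
DATA$FIRST <- c(0,
  with(DATA,
    difftime(DATE[2:length(DATE)], DATE[1], units="days")
  )
)
# Days since the previous date; diff() needs the full vector to return n - 1 values
DATA$BETWEEN <- c(0,
  with(DATA,
    diff(DATE)
  )
)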
identical(DATA, RESULTS)
[1] TRUE
Upvotes: 0
Reputation: 5586
This will get you what you want:
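d <- as.Date(DATA$DATE, format="%m/%d/%Y")
# Days since the first date, filled in one element at a time
first <- numeric(length(d))
for (i in seq_along(d))
  first[i] <- d[i] - d[1]
# Days since the previous date; the first row has no predecessor, so prepend 0
between <- c(0, diff(d))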
This uses the as.Date() function in the base package to cast the vector of string dates to date values using the given format. Since you have dates as month/day/year, you specify format="%m/%d/%Y" to make sure it's interpreted correctly.
diff() is the lagged difference. Since it's lagged, it doesn't include the difference between element 1 and itself, so you can concatenate a 0.
Differences between Date objects are given in days by default.
Then constructing the output dataframe is simple:
RESULTS <- data.frame(DATE=DATA$DATE, FIRST=first, BETWEEN=between)
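The loop above can also be written without explicit iteration, since subtracting a single Date from a Date vector is vectorized. Here is a minimal sketch, assuming the dates start out as month/day/year strings as described; the name dates_chr is illustrative and not part of the question.

```r
# Assumed input: the question's dates as month/day/year strings
dates_chr <- c("7/8/2013", "8/1/2013", "8/30/2013", "10/23/2013",
               "12/16/2013", "12/16/2015")
d <- as.Date(dates_chr, format = "%m/%d/%Y")

# Days elapsed since the first date (d[1] is recycled across the vector)
first <- as.numeric(d - d[1])

# Days elapsed since the previous date; prepend 0 for the first row
between <- c(0, as.numeric(diff(d)))

RESULTS <- data.frame(DATE = d, FIRST = first, BETWEEN = between)
print(RESULTS)
```

Wrapping the differences in as.numeric() drops the difftime class, so both columns come out as plain numbers like the ones in the desired output.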
Upvotes: 2
Reputation: 5169
For the first part:
DATA = data.frame((c("7/8/2013", "8/1/2013", "8/30/2013", "10/23/2013","12/16/2013", "12/16/2015")))
names(DATA)[1] = "V1"
date = as.Date(DATA$V1, format="%m/%d/%Y")
print(date-date[1])
Result:
[1] 0 24 53 107 161 891
For the second part, simply use a for loop.
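The answer does not show the loop, so what follows is only one possible sketch of it, assuming the same month/day/year strings as in the first part. The names dates and between are illustrative.

```r
# Assumed input: the same month/day/year strings used in the first part
dates <- as.Date(c("7/8/2013", "8/1/2013", "8/30/2013", "10/23/2013",
                   "12/16/2013", "12/16/2015"), format = "%m/%d/%Y")

# Preallocate; the first element stays 0 because it has no previous date
between <- numeric(length(dates))

# Start at the second element and subtract the date just before it
for (i in seq_along(dates)[-1]) {
  between[i] <- as.numeric(dates[i] - dates[i - 1])
}

print(between)
```

For the example dates this gives 0, 24, 29, 54, 54, 730, which matches the BETWEEN column in the question. Iterating over seq_along(dates)[-1] rather than 2:length(dates) keeps the loop from running when there is only one date.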
Upvotes: 0