Lin

Reputation: 93

Conditional calculation based on values in other column

Suppose I have a data.frame and want to add a new column called duration. It should be calculated only for records where status is "Active", using 2016-12-10 as today's date, so that duration = today - start_date.

What's the best approach for this conditional calculation?

status <- c("Active", "Inactive", "Active")    
date <- c("2016-10-25", "2015-05-11", "2015-3-18")    
start_date <- as.Date(date, format = "%Y-%m-%d")    
data.frame(status, start_date)

Upvotes: 0

Views: 1691

Answers (2)

Aramis7d

Reputation: 2496

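Using dplyr, you can try:

library(dplyr)
today <- as.Date("2016-12-10")
dft %>%
  mutate(duration = ifelse(status == "Active", as.numeric(today - start_date), NA))

where dft is your initial data frame. The result is the number of days as a plain number, with NA for rows that are not Active.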
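
One thing to be aware of with the ifelse() approach: ifelse() builds its result from the logical test vector, so the difftime class (and its "days" unit) is not carried over and you get a plain numeric vector. A minimal base R sketch of that behaviour, using the question's data; the names diff_days and duration are just illustrative:

```r
today      <- as.Date("2016-12-10")
status     <- c("Active", "Inactive", "Active")
start_date <- as.Date(c("2016-10-25", "2015-05-11", "2015-03-18"))

# Date - Date gives a difftime measured in days
diff_days <- today - start_date

# ifelse() keeps only the values, not the difftime class
duration <- ifelse(status == "Active", diff_days, NA)

class(diff_days)
class(duration)
```

If you need the unit later, it may be simpler to keep the column numeric and treat it as "days" by convention.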

Upvotes: 0

akrun

Reputation: 886938

We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1), put the logical condition in 'i', and assign (:=) the difference between 'today' and 'start_date' to the 'duration' column. This is efficient because := assigns in place, and rows that do not match the condition are left as NA.

library(data.table)
setDT(df1)[status == "Active", duration := today - start_date]
df1
#     status start_date duration
#1:   Active 2016-10-25  46 days
#2: Inactive 2015-05-11  NA days
#3:   Active 2015-03-18 633 days

Or a base R option is

i1 <- df1$status == "Active"
df1[i1, "duration"] <- today - df1$start_date[i1]

where

today <- as.Date("2016-12-10")
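
For completeness, here is a self-contained base R sketch that puts the pieces together so it can be run as is. It assumes df1 is the data.frame from the question (the question does not assign it to a name) and stores duration as a plain number of days rather than a difftime; the helper name active is only for illustration:

```r
# Rebuild the question's data; the name df1 is assumed, as in the answer
status     <- c("Active", "Inactive", "Active")
start_date <- as.Date(c("2016-10-25", "2015-05-11", "2015-03-18"))
df1        <- data.frame(status, start_date)

today <- as.Date("2016-12-10")

# Start with NA everywhere, then fill in only the Active rows
df1$duration <- NA_real_
active <- df1$status == "Active"
df1$duration[active] <- as.numeric(today - df1$start_date[active])

df1
```

Creating the column with NA first makes it explicit what the non-Active rows contain.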

Upvotes: 3
