slauw
slauw

Reputation: 11

R - Create a new date column based on current date column if it matches with other column

I have a dataframe like below on users campaign with below details:

Columns are: email_address, response_date, campaign_name, state, suburb, postcode, magazine_subs_title, response_type

 email_address  response_date   campaign_name   state   suburb  postcode    magazine_subs_title response_type
[email protected] 1/02/2018 18:01   2018_Beauty_Acq NSW Sydney  2000    AGL opened
[email protected] 03/10/2017  14:00:00 2017_Fashion_Show   QLD Brisbane    4000    MHI delivered
[email protected] 03/10/2017  17:00:00 2017_Fashion_Show   QLD Brisbane    4000    MHI opened
[email protected] 25/01/2018 9:00 2018_Beauty_Acq   NSW Sydney  2000    AGL delivered
[email protected] 14/07/2017 11:00  2017_Fashion_Show   NSW Sydney  2000    AGL delivered

From here, I want to extract the response_date where response_type = 'delivered' and specific for each campaigns and end up with below table:

email_address   response_date   campaign_name   state   suburb  postcode    magazine_subs_title response_type   delivered_date
[email protected]   1/02/2018 18:01 2018_Beauty_Acq NSW Sydney  2000    AGL opened  25/01/2018 9:00
[email protected]  03/10/2017  14:00:00 PM 2017_Fashion_Show   QLD Brisbane    4000    MHI delivered   03/10/2017  14:00:00 PM
[email protected]  03/10/2017  17:00:00 PM 2017_Fashion_Show   QLD Brisbane    4000    MHI opened  03/10/2017  14:00:00 PM
[email protected]   25/01/2018 9:00 2018_Beauty_Acq NSW Sydney  2000    AGL delivered   25/01/2018 9:00
[email protected]   14/07/2017 11:00    2017_Fashion_Show   NSW Sydney  2000    AGL delivered   14/07/2017 11:00

Does that makes sense?

Any one know how to perform that sort of operation in R? Thank you

Upvotes: 0

Views: 250

Answers (1)

MKR
MKR

Reputation: 20095

One approach could be using lubridate, tidyr and dplyr .

The approach is to first prepare data. Read response_date and Time separately and then unite those to column to make response_date. Then use parse_date_time to convert those two columns in to datetime format which is optional(as OP is not taking any decision on this date). Finally apply ifelse to populate delivered_date.

#Data

df <- read.table(text = "
email_address  response_date Time  campaign_name   state   suburb  postcode    magazine_subs_title response_type
[email protected] 1/02/2018 18:01   2018_Beauty_Acq NSW Sydney  2000    AGL opened
[email protected] 03/10/2017  14:00:00 2017_Fashion_Show   QLD Brisbane    4000    MHI delivered
[email protected] 03/10/2017  17:00:00 2017_Fashion_Show   QLD Brisbane    4000    MHI opened
[email protected] 25/01/2018 9:00 2018_Beauty_Acq   NSW Sydney  2000    AGL delivered
[email protected] 14/07/2017 11:00  2017_Fashion_Show   NSW Sydney  2000    AGL delivered", header=T, stringsAsFactor = F)

library(lubridate)
library(dplyr)
library(tidyr)

df %>%
   unite("response_date", c("response_date", "Time"), sep= " ") %>%
   mutate(response_date = parse_date_time(response_date, c("dmy HMS", "dmy HM"))) %>%
   mutate(delivered_date = ifelse(grepl("delivered",response_type), as.character(response_date), NA)) %>%
  group_by(campaign_name, state, suburb, postcode) %>% 
  fill(delivered_date) %>% ungroup() %>% 
  as.data.frame()    
Result:
   email_address       response_date     campaign_name state   suburb postcode magazine_subs_title response_type      delivered_date
#1  [email protected] 2017-07-14 11:00:00 2017_Fashion_Show   NSW   Sydney     2000                 AGL     delivered 2017-07-14 11:00:00
#2 [email protected] 2017-10-03 14:00:00 2017_Fashion_Show   QLD Brisbane     4000                 MHI     delivered 2017-10-03 14:00:00
#3 [email protected] 2017-10-03 17:00:00 2017_Fashion_Show   QLD Brisbane     4000                 MHI        opened 2017-10-03 14:00:00
#4  [email protected] 2018-02-01 18:01:00   2018_Beauty_Acq   NSW   Sydney     2000                 AGL        opened                <NA>
#5  [email protected] 2018-01-25 09:00:00   2018_Beauty_Acq   NSW   Sydney     2000                 AGL     delivered 2018-01-25 09:00:00

Upvotes: 1

Related Questions