Reputation: 73
I have the following data frame df
CustID Mode_Payment Payment Expiry Amount
100 ECS 2015-01-01 2015-03-01 1000
200 Online 2015-01-01 2015-05-01 2000
100 ECS 2015-01-01 2015-10-01 3000
300 Cash 2015-01-01 2015-05-01 5000
I want to calculate a new field subscription period which is period<-as.numberic(expiry-payment).
But, when the Mode of payment is ECS then period should be calculated by the following formula :
group_by(CustID)
period<-max(expiry)-min(payment)
ugroup()
So for the above data set output should be
CustID Mode_Payment Payment Expiry Amount Period
100 ECS 2015-01-01 2015-03-01 1000 273
200 Online 2015-01-01 2015-05-01 2000 120
100 ECS 2015-01-01 2015-10-01 3000 273
300 Cash 2015-01-01 2015-05-01 5000 120
Unfortunately, I'm getting all kind of errors.
df<-df %>%
group_by(custid) %>%
if(mode_payement=='ECS') {mutate(period=(as.numeric(max(expiry)-min(payement))))
} else mutate(period=as.numeric((expiry-payment))) %>%
ungroup()
Upvotes: 0
Views: 165
Reputation: 4024
Or dplyr with ifelse:
data %>%
group_by(CustID) %>%
mutate_each(funs(as.Date), Expiry, Payment) %>%
mutate(period =
(Mode_Payment == "ECS") %>%
ifelse(
max(Expiry) - min(Payment),
Expiry - Payment) )
Upvotes: 0
Reputation: 23574
I modified your data a bit in case you have ECS and something else for a customer ID. I chose to use subsetting rather an using ifelse in my approach. You have one operation for data with ECS only and the other for the rest.
DATA & CODE
mydf <- read.table(text = "CustID Mode_Payment Payment Expiry Amount
100 ECS 2015-01-01 2015-03-01 1000
200 Online 2015-01-01 2015-05-01 2000
100 ECS 2015-01-01 2015-10-01 3000
300 Cash 2015-01-01 2015-05-01 5000
100 Online 2015-01-01 2015-07-01 7000", header = T, stringsAsFactors = FALSE)
CustID Mode_Payment Payment Expiry Amount
1 100 ECS 2015-01-01 2015-03-01 1000
2 200 Online 2015-01-01 2015-05-01 2000
3 100 ECS 2015-01-01 2015-10-01 3000
4 300 Cash 2015-01-01 2015-05-01 5000
5 100 Online 2015-01-01 2015-07-01 7000
library(dplyr)
library(data.table)
#Set Payment and Expiry as Date.
setDT(mydf)[, c("Payment", "Expiry") := lapply(.SD, as.IDate), .SDcols = 3:4]
x <- mydf[Mode_Payment == "ECS"][, period := max(Expiry) - min(Payment), by = CustID]
y <- mydf[Mode_Payment != "ECS"][, period := Expiry - Payment, by = CustID]
rbindlist(list(x, y))
# CustID Mode_Payment Payment Expiry Amount period
#1: 100 ECS 2015-01-01 2015-03-01 1000 273 days
#2: 100 ECS 2015-01-01 2015-10-01 3000 273 days
#3: 200 Online 2015-01-01 2015-05-01 2000 120 days
#4: 300 Cash 2015-01-01 2015-05-01 5000 120 days
#5: 100 Online 2015-01-01 2015-07-01 7000 181 days
### dplyr way
filter(mydf, Mode_Payment == "ECS") %>%
group_by(CustID) %>%
mutate(period = max(Expiry) - min(Payment)) -> x
filter(mydf, Mode_Payment != "ECS") %>%
mutate(period = Expiry - Payment) -> y
bind_rows(x, y)
Upvotes: 1