Reputation: 57
I have a data.table which contains the following values
structure(list(Item = c("10040", "10040", "10110", "10190", "10190",
"12020", "12970", "7010040"), Customer = c("CLUB RETAILER - 106",
"WHOLESALER - 112", "NATIONAL RETAILER - 102", "MID WEST 1 - 120",
"WHOLESALER - 112", "WHOLESALER - 112", "WHOLESALER - 112", "DIAMLER BENZ - 108"
), DemandID = c("NetBaseForecast", "NetBaseForecast", "NetBaseForecast",
"NetBaseForecast", "NetBaseForecast", "NetBaseForecast", "NetBaseForecast",
"NetBaseForecast"), Forecasts = c(5, 158, 212, 12, 3, 3, 3, 2776
), RTF = c(1, 98, 139, 8, 0, 0, 0, 2356)), row.names = c(NA,
-8L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x00000000025c1ef0>, sorted = c("Item", "Customer"))
I need to create 2 new columns based on the forecast and rtf columns based on 2 conditions of each row for the Forecasts and RTF columns. I am using a data.table
I am currently using a for loop like this
for(i in 1:nrow(temp1NoOrders)){
if (temp1NoOrders$Forecasts[i] >= temp1NoOrders$RTF[i]){
temp1NoOrders$Com_F[i] <- temp1NoOrders$RTF[i]
temp1NoOrders$New_F[i] <- temp1NoOrders$Forecasts[i] - temp1NoOrders$RTF[i]
}
else if (temp1NoOrders$Forecasts[i] < temp1NoOrders$RTF[i]){
temp1NoOrders$Com_F[i] <- temp1NoOrders$Forecast[i]
temp1NoOrders$New_F[i] <- 0
}
}
Is there a faster way of doing this with data.table? I am assuming that if I use a 100k row dataset, this might become slower because of the loop and the if-else statements.
Upvotes: 1
Views: 59
Reputation: 388797
You don't need to do this line by line. Use fifelse
in data.table
to check for conditions.
library(data.table)
temp1NoOrders[, c("New_F","Com_F") :=
.(fifelse(Forecasts > RTF, Forecasts - RTF, 0),
fifelse(Forecasts > RTF, RTF, Forecasts))]
temp1NoOrders
# Item Customer DemandID Forecasts RTF New_F Com_F
#1: 10040 CLUB RETAILER - 106 NetBaseForecast 5 1 4 1
#2: 10040 WHOLESALER - 112 NetBaseForecast 158 98 60 98
#3: 10110 NATIONAL RETAILER - 102 NetBaseForecast 212 139 73 139
#4: 10190 MID WEST 1 - 120 NetBaseForecast 12 8 4 8
#5: 10190 WHOLESALER - 112 NetBaseForecast 3 0 3 0
#6: 12020 WHOLESALER - 112 NetBaseForecast 3 0 3 0
#7: 12970 WHOLESALER - 112 NetBaseForecast 3 0 3 0
#8: 7010040 DIAMLER BENZ - 108 NetBaseForecast 2776 2356 420 2356
fifelse
is easy to understand but more efficient approach would be to use pmax
and pmin
:
temp1NoOrders[, c("New_F","Com_F") := .(pmax(Forecasts - RTF, 0),
pmin(Forecasts, RTF))]
Upvotes: 5