JaapieS
JaapieS

Reputation: 45

Matching intervals with values in another table in R

Currently, I have one df and a price table.

Order Number  wgt           wgt_intvl        price
-------------------         ---------------  -----
 01            22           0-15             50
 02            5            15-25            75
 03            35           25-50            135

What I'd like is to match the weight from the df into an interval of the price table in R. For example, the first order (Order Number 01) corresponds with a price of 75. Therefore, I want to add a column in the first df, say df$cost that corresponds with the appropriate price according to wgt_intvl in the price table.

The way I see to do it is with an if-else statement, but this is highly inefficient and I was wondering if there is a better way to do it. In reality these tables are much longer - there is no logical "buildup" in price or weight interval. I have 15 weight intervals in this table. My current solution would look like this:

If(wgt < 15){
  df$cost <- 50
} else if (wgt > 15 & wgt < 25){ 
  df$cost <- 75
} else if(wgt > 25 & wgt < 50){ 
  df$cost <- 135
} 

This times fifteen, using the corresponding prices of the price table. I'd love a more efficient solution. Thanks in advance!

Upvotes: 1

Views: 793

Answers (2)

G. Grothendieck
G. Grothendieck

Reputation: 269451

Using the data shown reproducibly in the Note at the end, form the vector of cutpoints (i.e. the first number in each interval) and then use findInterval to find the interval corresponding to the weight.

cutpoints <- as.numeric(sub("-.*", "", dfprice$wgt_intvl))
transform(dfmain, price = dfprice$price[findInterval(wgt, cutpoints)])

giving:

  Order wgt price
1    01  22    75
2    02   5    50
3    03  35   135
4    04  25   135

Note

dfmain <- data.frame(Order = c("01", "02", "03", "04"), wgt = c(22, 5, 35, 25), 
 stringsAsFactors = FALSE)

dfprice <- data.frame(wgt_intvl = c("0-15", "15-25", "25-50"), 
 price = c(50, 75, 135), stringsAsFactors = FALSE)

Upvotes: 1

Fnguyen
Fnguyen

Reputation: 1177

Instead of an if-statement you could use a more efficient case_when operation:

library(dplyr)
 df %>%
mutate(cost = case_when(
    wgt < 15 ~ 50,
    wgt > 15 & wgt <25 ~ 75,
    TRUE ~ 135))

Alternatively you could use cut() to transform wgt to wgt_intvl and match via left_join().

Upvotes: 0

Related Questions