btaek
btaek

Reputation: 35

Multiply across dataframes by column values

I am looking to multiply data across two different dataframes. The input is dynamic so the number of IDs per team is not constant. I've included a sample set of data below.

Branch  ID-1  Time-1  ID-2  Time-2  ID-3  Time-3
Texas   BKP   5.5     LMG   2.8     DDP   8.9
Maine   BQQ   11      BKP   8.1     OLW   3.0
NYork   DDP   2.0     ADD   6.5     BQQ   0.4

There are hundreds of branches that could utilize thousands of IDs, so the below is just a small subset I've created. We also receive a cost dataframe for all IDs similar to below:

ID     Cost
ADD    4.50
BKP    11.99
BQQ    1.50
DDP    8.99
LMG    24.99
OLW    29.99

I am trying to find the cost by ID per branch and I'm able to go by Substrings, but joining the second dataframe is where I'm running into trouble. The output that I need would look like this:

Branch  ID-1  Time-1  ID-2  Time-2  ID-3  Time-3  Cost-1  Cost-2  Cost-3
Texas   BKP   5.5     LMG   2.8     DDP   8.9     65.945  69.972  80.011
Maine   BQQ   11      BKP   8.1     OLW   3.0     16.50   97.119  89.97
NYork   DDP   2.0     ADD   6.5     BQQ   0.4     17.98   29.25   0.6

I know this isn't the prettiest output, but unfortunately it's the output needed. I'm very grateful for any help you are able to provide with this.

Upvotes: 1

Views: 117

Answers (2)

Jaap
Jaap

Reputation: 83215

A possible solution with :

library(data.table)

melt(setDT(df1), id = 1,
     measure.vars = patterns(c("^ID","^Time")),
     value.name = c("ID","Time")
     )[df2, on = .(ID), Cost := Time * i.Cost
       ][, dcast(.SD, Branch ~ variable, value.var = c("ID","Time","Cost"))]

which gives:

   Branch ID_1 ID_2 ID_3 Time_1 Time_2 Time_3 Cost_1 Cost_2 Cost_3
1:  Maine  BQQ  BKP  OLW   11.0    8.1    3.0 16.500 97.119 89.970
2:  NYork  DDP  ADD  BQQ    2.0    6.5    0.4 17.980 29.250  0.600
3:  Texas  BKP  LMG  DDP    5.5    2.8    8.9 65.945 69.972 80.011

Used data:

df1 <- structure(list(Branch = c("Texas", "Maine", "NYork"), ID.1 = c("BKP", "BQQ", "DDP"), Time.1 = c(5.5, 11, 2), 
                      ID.2 = c("LMG", "BKP", "ADD"), Time.2 = c(2.8, 8.1, 6.5), ID.3 = c("DDP", "OLW", "BQQ"), Time.3 = c(8.9, 3, 0.4)), 
                 .Names = c("Branch", "ID.1", "Time.1", "ID.2", "Time.2", "ID.3", "Time.3"), class = "data.frame", row.names = c(NA, -3L))
df2 <- structure(list(ID = c("ADD", "BKP", "BQQ", "DDP", "LMG", "OLW"), Cost = c(4.5, 11.99, 1.5, 8.99, 24.99, 29.99)),
                 .Names = c("ID", "Cost"), class = "data.frame", row.names = c(NA, -6L))

Upvotes: 2

Onyambu
Onyambu

Reputation: 79188

using Base R:

dat3 = merge(reshape(dat1,matrix(2:ncol(dat1),2),idv=1,dir="long",ids=dat1$Branch),dat2,by.x="ID.1",by.y="ID")

 reshape(transform(dat3,Cost=Cost * Time.1)[order(dat3$time),],idvar = "Branch",dir="wide")

  Branch ID.1.1 Time.1.1 Cost.1 ID.1.2 Time.1.2 Cost.2 ID.1.3 Time.1.3 Cost.3
2  Texas    BKP      5.5 65.945    LMG      2.8 69.972    DDP      8.9 80.011
4  Maine    BQQ     11.0 16.500    BKP      8.1 97.119    OLW      3.0 89.970
6  NYork    DDP      2.0 17.980    ADD      6.5 29.250    BQQ      0.4  0.600

Upvotes: 0

Related Questions