Reputation: 35
I am looking to multiply data across two different dataframes. The input is dynamic so the number of IDs per team is not constant. I've included a sample set of data below.
Branch ID-1 Time-1 ID-2 Time-2 ID-3 Time-3
Texas BKP 5.5 LMG 2.8 DDP 8.9
Maine BQQ 11 BKP 8.1 OLW 3.0
NYork DDP 2.0 ADD 6.5 BQQ 0.4
There are hundreds of branches that could utilize thousands of IDs, so the below is just a small subset I've created. We also receive a cost dataframe for all IDs similar to below:
ID Cost
ADD 4.50
BKP 11.99
BQQ 1.50
DDP 8.99
LMG 24.99
OLW 29.99
I am trying to find the cost by ID per branch and I'm able to go by Substrings, but joining the second dataframe is where I'm running into trouble. The output that I need would look like this:
Branch ID-1 Time-1 ID-2 Time-2 ID-3 Time-3 Cost-1 Cost-2 Cost-3
Texas BKP 5.5 LMG 2.8 DDP 8.9 65.945 69.972 80.011
Maine BQQ 11 BKP 8.1 OLW 3.0 16.50 97.119 89.97
NYork DDP 2.0 ADD 6.5 BQQ 0.4 17.98 29.25 0.6
I know this isn't the prettiest output, but unfortunately it's the output needed. I'm very grateful for any help you are able to provide with this.
Upvotes: 1
Views: 117
Reputation: 83215
A possible solution with data.table:
library(data.table)
melt(setDT(df1), id = 1,
measure.vars = patterns(c("^ID","^Time")),
value.name = c("ID","Time")
)[df2, on = .(ID), Cost := Time * i.Cost
][, dcast(.SD, Branch ~ variable, value.var = c("ID","Time","Cost"))]
which gives:
Branch ID_1 ID_2 ID_3 Time_1 Time_2 Time_3 Cost_1 Cost_2 Cost_3 1: Maine BQQ BKP OLW 11.0 8.1 3.0 16.500 97.119 89.970 2: NYork DDP ADD BQQ 2.0 6.5 0.4 17.980 29.250 0.600 3: Texas BKP LMG DDP 5.5 2.8 8.9 65.945 69.972 80.011
Used data:
df1 <- structure(list(Branch = c("Texas", "Maine", "NYork"), ID.1 = c("BKP", "BQQ", "DDP"), Time.1 = c(5.5, 11, 2),
ID.2 = c("LMG", "BKP", "ADD"), Time.2 = c(2.8, 8.1, 6.5), ID.3 = c("DDP", "OLW", "BQQ"), Time.3 = c(8.9, 3, 0.4)),
.Names = c("Branch", "ID.1", "Time.1", "ID.2", "Time.2", "ID.3", "Time.3"), class = "data.frame", row.names = c(NA, -3L))
df2 <- structure(list(ID = c("ADD", "BKP", "BQQ", "DDP", "LMG", "OLW"), Cost = c(4.5, 11.99, 1.5, 8.99, 24.99, 29.99)),
.Names = c("ID", "Cost"), class = "data.frame", row.names = c(NA, -6L))
Upvotes: 2
Reputation: 79188
using Base R:
dat3 = merge(reshape(dat1,matrix(2:ncol(dat1),2),idv=1,dir="long",ids=dat1$Branch),dat2,by.x="ID.1",by.y="ID")
reshape(transform(dat3,Cost=Cost * Time.1)[order(dat3$time),],idvar = "Branch",dir="wide")
Branch ID.1.1 Time.1.1 Cost.1 ID.1.2 Time.1.2 Cost.2 ID.1.3 Time.1.3 Cost.3
2 Texas BKP 5.5 65.945 LMG 2.8 69.972 DDP 8.9 80.011
4 Maine BQQ 11.0 16.500 BKP 8.1 97.119 OLW 3.0 89.970
6 NYork DDP 2.0 17.980 ADD 6.5 29.250 BQQ 0.4 0.600
Upvotes: 0