Reputation: 601
I have a huge dataset and I have to compute "Monthly Child Cost% and Monthly Parent cost%". I am new to R and tried my best. But not much luch. Please help.
In my original dataset, I have Prent/Child/Item/Month/Cost data. I have to compute 2 new columns...
Monthly child cost% = 100/(total Items cost in that particular month for that child) * Item cost
Example for 1st row: 100/100 * 70 = 70)
Monthly Parent cost % = 100/total Items cost in that particular month for that Parent) * Item cost
Example for first row: 100/345 * 215 (Total Milk cost for that parent) = 62.3
Please note: It is ok to have duplicate in Monthly_Parent_Cost%. I can get only distinct values by Parent and Item.
Parent Child Item Month Cost Monthly_Child_Cost% Monthly_Parent_Cost%
1001 22 Milk Jan 70 70 62.32
1001 22 Bread Jan 20 20 31.88
1001 22 Eggs Jan 10 10 5.8
1001 22 Milk Feb 60 60 62.32
1001 22 Bread Feb 40 40 31.88
1001 11 Milk Mar 40 40 62.32
1001 11 Bread Mar 50 50 31.88
1001 11 Eggs Mar 10 10 5.8
1001 11 Milk Apr 45 100 62.32
1002 44 Milk Jan 20 20 40.3
1002 44 Bread Jan 40 40 33.2
1002 44 Eggs Jan 40 40 26.3
1002 44 Milk Feb 34 34 40.3
1002 44 Bread Feb 66 66 33.2
1002 55 Milk Mar 20 20 40.3
1002 55 Bread Mar 20 20 33.2
1002 55 Eggs Mar 60 60 26.3
1002 55 Milk Apr 79 100 40.3
Upvotes: 1
Views: 62
Reputation: 581
You can use aggregate
function to aggregate cost values by Child + Month + Item
and also Parent + Month + Item
. After this, you can join the merge the results and add the resultant vector as a new one.
# Aggregate
childCosts <- aggregate(x = ds$Cost, by=list(ds$Child, ds$Month, ds$Item), FUN=sum)
# modify column names for easy merge
colnames(childCosts) <- c("Child", "Month", "Item", "Monthly_child_total")
ds2 <- merge(ds, childCosts)
# Compute desired result
ds2$Monthly_Child_Cost_Pct <- ds2$Cost*100/(ds2$Monthly_child_total)
P.S. my formulaes might not be correct as I am unclear about what do you want to aggreagte for the two columns. Adjust your code accordingly.
Upvotes: 1