Murali
Murali

Reputation: 601

R How to compute percentages and append as new columns? (last two columns in example)

I have a huge dataset and I have to compute "Monthly Child Cost% and Monthly Parent cost%". I am new to R and tried my best. But not much luch. Please help.

In my original dataset, I have Prent/Child/Item/Month/Cost data. I have to compute 2 new columns...

Monthly child cost% = 100/(total Items cost in that particular month for that child) * Item cost

Example for 1st row: 100/100 * 70 = 70)

Monthly Parent cost % = 100/total Items cost in that particular month for that Parent) * Item cost

Example for first row: 100/345 * 215 (Total Milk cost for that parent) = 62.3

Please note: It is ok to have duplicate in Monthly_Parent_Cost%. I can get only distinct values by Parent and Item.

Parent  Child   Item    Month   Cost    Monthly_Child_Cost%     Monthly_Parent_Cost%
    1001    22  Milk    Jan     70      70      62.32
    1001    22  Bread   Jan     20      20      31.88
    1001    22  Eggs    Jan     10      10      5.8
    1001    22  Milk    Feb     60      60      62.32
    1001    22  Bread   Feb     40      40      31.88
    1001    11  Milk    Mar     40      40      62.32
    1001    11  Bread   Mar     50      50      31.88
    1001    11  Eggs    Mar     10      10      5.8
    1001    11  Milk    Apr     45      100     62.32
    1002    44  Milk    Jan     20      20      40.3
    1002    44  Bread   Jan     40      40      33.2
    1002    44  Eggs    Jan     40      40      26.3
    1002    44  Milk    Feb     34      34      40.3
    1002    44  Bread   Feb     66      66      33.2
    1002    55  Milk    Mar     20      20      40.3
    1002    55  Bread   Mar     20      20      33.2
    1002    55  Eggs    Mar     60      60      26.3
    1002    55  Milk    Apr     79      100     40.3

Upvotes: 1

Views: 62

Answers (1)

Ujjwal Kumar
Ujjwal Kumar

Reputation: 581

You can use aggregate function to aggregate cost values by Child + Month + Item and also Parent + Month + Item. After this, you can join the merge the results and add the resultant vector as a new one.

# Aggregate
childCosts <- aggregate(x = ds$Cost, by=list(ds$Child, ds$Month, ds$Item), FUN=sum)

# modify column names for easy merge
colnames(childCosts) <- c("Child", "Month", "Item", "Monthly_child_total")
ds2 <- merge(ds, childCosts)

# Compute desired result
ds2$Monthly_Child_Cost_Pct <- ds2$Cost*100/(ds2$Monthly_child_total)

P.S. my formulaes might not be correct as I am unclear about what do you want to aggreagte for the two columns. Adjust your code accordingly.

Upvotes: 1

Related Questions