Reputation: 353
I have the following dataset with information on the sales figures for two companies over a period of five years. I want to use the first year's figure as a baseline and calculate the percentage change in sales for each subsequent year for each company. I use the following:
transform(dataset, SalesD = unlist(aggregate(Sales ~ Company, function(x) ((x - x[1]) / x[1]) * 100, data=Dataset)$Sales))
yet I do not get the correct values for the second company (I expect the value at row 6 to be zero, as this is the base year). Here are the results:
+----+---------+------+--------+--------+
| ID | Company | Year | Sales | SalesD |
+----+---------+------+--------+--------+
| 1 | LSL | 2015 | 100000 | 0 |
| 2 | LSL | 2016 | 120000 | 20 |
| 3 | LSL | 2017 | 150000 | 50 |
| 4 | LSL | 2018 | 100000 | 0 |
| 5 | LSL | 2019 | 50000 | -50 |
| 6 | IDA | 2015 | 150000 | 50 |
| 7 | IDA | 2016 | 180000 | 80 |
| 8 | IDA | 2017 | 200000 | 100 |
| 9 | IDA | 2018 | 180000 | 80 |
| 10 | IDA | 2019 | 160000 | 60 |
+----+---------+------+--------+--------+
Could you help me point out what is wrong in the code? Many thanks!
Upvotes: 2
Views: 331
Reputation: 887183
We can use data.table
. Convert the 'data.frame' to 'data.table' (setDT(df1)
), grouped by "Company", we get the percentage change by subtracting the "Sales" from the value of "Sales" that corresponds to min
value of "Year", multiply by 100, round
and assign (:=
) to create "SalesD".
library(data.table)
setDT(df1)[, SalesD := round(100*(Sales-
Sales[which.min(Year)])/Sales[which.min(Year)]) , Company]
df1
# ID Company Year Sales SalesD
# 1: 1 LSL 2015 100000 0
# 2: 2 LSL 2016 120000 20
# 3: 3 LSL 2017 150000 50
# 4: 4 LSL 2018 100000 0
# 5: 5 LSL 2019 50000 -50
# 6: 6 IDA 2015 150000 0
# 7: 7 IDA 2016 180000 20
# 8: 8 IDA 2017 200000 33
# 9: 9 IDA 2018 180000 20
#10: 10 IDA 2019 160000 7
Upvotes: 1