Reputation: 83
I'm having troubles to aggregate differents shops by date and SKU.
My sample data is like this:
Date SKU Quantity Price
2010/01/01 52144 1 4.00
2010/01/01 87548 5 0.50
2010/01/01 47852 3 1.99
2010/01/01 52144 4 3.80
2010/01/02 87548 1 0.55
And I'm trying to use the function agreggate to sty like this:
Date SKU Quantity Price
2010/01/01 52144 5 3.80
2010/01/01 87548 5 0.50
2010/01/01 47852 3 1.99
2010/01/02 87548 1 0.55
Thanks!
Upvotes: 0
Views: 243
Reputation: 99321
You can do this easily with dplyr
library(dplyr)
group_by(df, Date, SKU) %>%
summarize(Quantity = sum(Quantity), Price = min(Price))
# Date SKU Quantity Price
# 1 2010/01/01 47852 3 1.99
# 2 2010/01/01 52144 5 3.80
# 3 2010/01/01 87548 5 0.50
# 4 2010/01/02 87548 1 0.55
And also with data.table
library(data.table)
setDT(df)[, .(Quantity = sum(Quantity), Price = min(Price)), by = .(Date, SKU)]
# Date SKU Quantity Price
# 1: 2010/01/01 52144 5 3.80
# 2: 2010/01/01 87548 5 0.50
# 3: 2010/01/01 47852 3 1.99
# 4: 2010/01/02 87548 1 0.55
Upvotes: 8
Reputation: 20811
cough base cough
dat <- read.table(header = TRUE, text = "Date SKU Quantity Price
2010/01/01 52144 1 4.00
2010/01/01 87548 5 0.50
2010/01/01 47852 3 1.99
2010/01/01 52144 4 3.80
2010/01/02 87548 1 0.55")
tmp <- within(dat, {
sums <- ave(Quantity, list(Date, SKU), FUN = sum)
mins <- ave(Price, list(Date, SKU), FUN = min)
})
tmp[!with(tmp, duplicated(cbind(Date, SKU))), ]
# Date SKU Quantity Price mins sums
# 1 2010/01/01 52144 1 4.00 3.80 5
# 2 2010/01/01 87548 5 0.50 0.50 5
# 3 2010/01/01 47852 3 1.99 1.99 3
# 5 2010/01/02 87548 1 0.55 0.55 1
Upvotes: 4
Reputation: 21047
When dealing with this kind of things, I've found that the sqldf
package is a great tool. Let's say your data is stored in a data frame called df
. You can use a SELECT
instruction to get what you need:
sqldf("select Date, SKU, sum(a.Quantity) as Quantity, min(a.Price) as price
from df as a
group by Date, SKU")
You can use standard SQL instructions to manipulate, filter or aggregate data stored in data frames (which sqldf
reads as if they were tables)
Upvotes: 4