Summing duplicate purchases

Question

I am trying to write a query that selects from a list of employee_id and find duplicate book purchases (book_id) and associated cost savings (list_price). If a duplicate exists, it needs sum the prices of the amount of duplicate book_id's.

So if someone has a book costing $10 associated to their employee_id and the book is offered to them again, they don't have to buy it and there is a savings of $10. If that happens again, there's a savings of $20.

I tried a having>1 but I can't seem to get the query correct to accurately sum the savings.

Any help is appreciated.

jwolf · Accepted Answer

To start,

select employee_id, book_id, count(*)
from book_purchases
group by employee_id, book_id
having count(*) > 1

gets you the list you need.

If we don't have to worry about the price changing, then we just add a column or two more to get:

select employee_id, book_id, 
    count(*) as copies_purchased, 
    sum(list_price) as total_spent,
    count(*) - 1 as copies_unnecessarily_purchased, 
    (count(*) - 1) * avg(list_price) as amount_overspent
from book_purchases
group by employee_id, book_id
having count(*) > 1

Of course you can join to the employee and book tables to get names and titles to fat out the results a bit.

To get the total amount overspent by each employee, you could wrap the above query thusly:

select a.employee_id, sum(a.amount_overspent) as total_amount_overspent
from (
        select employee_id, book_id, 
            count(*) as copies_purchased, 
            sum(list_price) as total_spent,
            count(*) - 1 as copies_unnecessarily_purchased, 
            (count(*) - 1) * avg(list_price) as amount_overspent
        from book_purchases
        group by employee_id, book_id
        having count(*) > 1    
    ) as a 
group by a.employee_id

Lastly, I went ahead and joined to an employee table that I presumed you have while I was at it:

select a.employee_id, emp.employee_name, sum(a.amount_overspent) as total_amount_overspent
from (
        select employee_id, book_id, 
            count(*) as copies_purchased, 
            sum(list_price) as total_spent,
            count(*) - 1 as copies_unnecessarily_purchased, 
            (count(*) - 1) * avg(list_price) as amount_overspent
        from book_purchases
        group by employee_id, book_id
        having count(*) > 1    
    ) as a 
inner join employee as emp on emp.employee_id = a.employee_id
group by a.employee_id, emp.employee_name

To be clear, these aren't four separate queries; they're just intermediate stages in building the single query you see at the end.

I hope this helps.

Summing duplicate purchases

Answers (1)

Related Questions