Reputation: 21
What is a good algorithm to use when you want to keep top ten best sellers and keep adjusting it as products are sold? I just got curious while studying for the interview.
A good example is how Amazon keeps most sold products ranking.
I thought they might use a sorting algorithm, but given that there are tons of products, it might be way too slow to sort every time when there is a change in how many times each product was sold because sorting takes average O(N log N). Or maybe they use a linked list to keep the order? If one product exceeded previous best seller, just put it in front of the linked list.
Upvotes: 1
Views: 476
Reputation: 18556
You could keep the best products in a balanced binary search tree (like std::set
or std::map
). Whenever a product is sold, you can increment it's number of sales (the way the product are mapped to the number of sales depends on the access pattern, but a hash table would work well in most cases) and if it's not in the tree insert it (you can remove it and reinsert into the tree if it's already there). After the insertion, you need to delete the product if there are more than k
(10 in this case) of them. The advantage of this approach is that we keep only k
best items in the tree, so the time complexity is O(log k)
, not O(log n)
per update.
However, a simple data structure (like a sorted vector or a sorted list) that contains the best k
items can also work well for small k
.
Upvotes: 1
Reputation: 1554
This is typically done with a priority queue implemented with a heap.
Upvotes: 4