Reputation: 33223
Not sure whether this is the right place, but I have a question related to algorithm and I cant think of an efficient algorithm. So thought of sharing my problem statement.. :) To ease up what I am trying to explain, let me create a hypothetical example.
Suppose, I have a list which contains an object whcih contains two things..
lets say product id and price
Now, this is a long long list..sort of like an inventory.. out of this I have defined three price segments.. lowprice, midprice and highprice and then k1,k2,k3 where k1,k2 and k3 are ratios. So, the job is now,, I have to gather products from this huge inventory in such a way that there is n1 products from lowprice range, n2 products from midprice range and n3 products from high price range... where n1:n2:n3 == k1:k2:k3
Now, how do I efficiently achieve the following. I target the low price point is 100 dollars and I have to gather 20 products from this range.. mid price range is probably 500 dollars and so on
So I start with 100 dollars.. and then look for items between 90 and 100 and also between 100 and 110 Let say I found 5 products in interval 1 low (90,100) and 2 products in interval 1 high (100,110) Then, I go to next low interval and next high interval. I keep on doing this until I get the number of products in this interval.
How do I do this?? Also there might be case, when the number of products in a particular price range is less than what I need.. (maybe mid price range is 105 dollars...).. so what should I do in that case.. Please pardon me, if this is not the right platform.. as from the question you can make out that this is more like a debative question rather than the "I am getting this error" type of question. Thanks
Upvotes: 3
Views: 217
Reputation: 32502
Approach 1
If the assignment what products belong to the three price segments never changes, why don't you simply build 3 lists, one for the products in each price segment (assuming these sets are disjoint). Then you may pick from these lists randomly (either with or without replacement - as you like). The number of items for each class is given by the ratios.
Approach 2
If the product-price-segment assignment is intended to be pre-specified, e.g., by passing corresponding price values for each segment on function call, you may want to have the products sorted by price and use a binary search to select the m-nearest-neighbors (for example). The parameter m
could be specified according to the ratios. If you specify a maximum distance you may reject products that are outside the desired price range.
Approach 3
If the product-price-segment assignment needs to be determined autonomously, you could apply your clustering algorithm of choice, e.g., k-means, to assign your products to, say, k = 3
price segments. For the actual product selection you may proceed similarly as described above.
Upvotes: 2
Reputation: 178431
You are probably looking for selection algorithm.
First find the n1
'th smallest element, let it be e1
, and the lower bound list is all elements such that element <= e1
.
Do the same for the other ranges.
pseudo code for lower bound list:
getLowerRange(list,n):
e <- select(list,n)
result <- []
for each element in list:
if element <= e:
result.append(element)
return result
Note that this solution fails if there are many "identical" items [result will be a bigger list], but finding those items, and removing it from result list is not hard.
Note that selection algorithm is O(n)
, so this algorithm will consume linear time related to your list's size.
Upvotes: 2
Reputation: 5901
It's seems like you should try a database solution rather then using a list. Check out sqlite. It's in Python by default
Upvotes: 0