Reputation: 1108
I have some solutions to this problem, but I am aware that somewhere out there lies an elegant solution, maybe a two-liner.
I have a huge selection (M) of items, basically dictionaries with numerical features like: ItemOne = {width:5, height:10, cost:200,...}
I would like to split this set of dictionaries/items into groups of N (2, 3, ...) so that the differences between e.g. width, height or other features within each group are kept to a minimum according to a criterion (I was thinking of a sum of squared differences). The criterion itself isn't a problem; I just have trouble figuring out the nicest way to split the dataset and get all the combinations without repeating any subset.
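To make the criterion concrete, here is a minimal sketch of what a sum of squared differences over one group could look like. The function name `squared_difference_score` and the choice of `width` and `height` as the compared features are assumptions made for illustration only.

```python
from itertools import combinations

def squared_difference_score(group, features=("width", "height")):
    """Sum of squared feature differences over every pair of items in the group."""
    return sum(
        (a[f] - b[f]) ** 2
        for a, b in combinations(group, 2)  # each unordered pair exactly once
        for f in features
    )

# Hypothetical sample group of N = 3 items
group = [
    {"width": 5, "height": 10, "cost": 200},
    {"width": 6, "height": 9, "cost": 2},
    {"width": 5, "height": 12, "cost": 50},
]
score = squared_difference_score(group)
```

A lower score means the items in the group are more alike on the chosen features.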
Upvotes: 1
Views: 190
Reputation: 5830
It's not entirely clear what you are asking, but I'll give it a shot:
# example items
items = [{'width': 5, 'height': 10, 'cost': 200}, {'width': 6, 'height': 9, 'cost': 2}]

# whatever you want your criterion to be
def calculate_criteria(item):
    return item['width'] + item['height'] + item['cost']

# create subsets based on criterion: items with the same value share a subset
subsets = {}
for item in items:
    criteria = calculate_criteria(item)
    subset = subsets.get(criteria, list())
    subset.append(item)
    subsets[criteria] = subset

print(subsets)
Output (key order may differ between Python versions):
{17: [{'width': 6, 'cost': 2, 'height': 9}], 215: [{'width': 5, 'cost': 200, 'height': 10}]}
Or, even better, using collections.defaultdict:
import collections

# create subsets based on criterion
subsets = collections.defaultdict(list)
for item in items:
    subsets[calculate_criteria(item)].append(item)
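If what you actually want is every group of N items, each listed once, ranked by your criterion, `itertools.combinations` may be closer to the mark. This is only a rough sketch: the `score` function, the sample items and the choice of `width` and `height` as features are assumptions for illustration, not something taken from your data.

```python
from itertools import combinations

# Hypothetical sample data
items = [
    {"width": 5, "height": 10, "cost": 200},
    {"width": 6, "height": 9, "cost": 2},
    {"width": 20, "height": 30, "cost": 15},
    {"width": 21, "height": 29, "cost": 40},
]

def score(group, features=("width", "height")):
    # Assumed criterion: squared differences summed over all pairs in the group
    return sum(
        (a[f] - b[f]) ** 2
        for a, b in combinations(group, 2)
        for f in features
    )

N = 2
# combinations() yields every N-item subset exactly once, so no group repeats
ranked = sorted(combinations(items, N), key=score)
best = ranked[0]
```

Note that the number of groups grows as M choose N, so for a huge M this brute-force enumeration is only practical with small N or after pre-filtering the items.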
Upvotes: 1