Reputation: 2543
I have a Dict
whs = {
'ID1' : ['code1', 'code2', 'code3'],
'ID2' : ['code2', 'code5', 'code3'],
'ID3' : ['code6', 'code7', 'code8'],
'ID4' : ['code3', 'code5', 'code6'],
}
What I need to do is build a new list that will look like
submit = [
{
'codes' : ['code3', ],
'ids' : ['ID1', 'ID2', 'ID4'],
},
{
'codes' : ['code6', 'code7', 'code8'],
'ids' : ['ID3', ],
}
]
What I have so far
def ParseAvailable(self, whs):
separate = whs.keys()
submit = []
while len(separate) > 0:
avail = {
'codes' : [],
'ids' : [],
}
for num, item in enumerate(separate):
if len(avail['codes']) == 0:
avail['codes'] = whs[item]
avail['ids'].append(item)
else:
avail_all = list(set(avail['codes']) & set(whs[item]))
print '%s : %s' % (item, avail_all)
if len(avail_all) > 0:
avail['codes'] = avail_all
avail['ids'].append(item)
if len(avail['codes']) > 0:
del separate[num]
submit.append(avail)
return submit
Which returns:
[
{
'ids': ['ID4', 'ID3'],
'codes': ['code6']
},
{
'ids': ['ID2'],
'codes': ['code2', 'code5', 'code3']
},
{
'ids': ['ID1'],
'codes': ['code1', 'code2', 'code3']
}
]
which COULD work except that ID1 & ID2 should be combined as
{
'ids' : ['ID1', 'ID2',],
'codes' : ['code2', 'code3', ]
}
Curious if there is an easier approach that I've not thought of, figure I could setup a couple more nested loops to compare everything piece by piece though it seems rather unpythonic
Thank You in Advance
Upvotes: 1
Views: 243
Reputation: 3956
As others said, this is an optimization problem that isn't trivial for large datasets. But as a first step, it's at least easy to invert the list of warehouses per product to a list of products per warehouse.
whs = {
'ID1' : ['code1', 'code2', 'code3'],
'ID2' : ['code2', 'code5', 'code3'],
'ID3' : ['code6', 'code7', 'code8'],
'ID4' : ['code3', 'code5', 'code6'],
}
a = []
b = []
for k,v in whs.items():
for j in v:
a.append((j,k))
b.append((j,[]))
inv = dict(b)
for j in a:
inv[j[0]].append(j[1])
print("Inventory:")
for k,v in inv.items():
print(k,v)
When run, this prints:
Inventory:
code1 ['ID1']
code2 ['ID2', 'ID1']
code3 ['ID4', 'ID2', 'ID1']
code5 ['ID4', 'ID2']
code6 ['ID4', 'ID3']
code7 ['ID3']
code8 ['ID3']
As you can see from the trivial example, there are no solutions that will get all products in a single order, and several solutions (code2+code6, code3+code6, code3+code7, code3+code8) that will get them in two orders. Picking the "best" would require additional information, such as pricing, to optimize on.
Upvotes: 0
Reputation: 8427
I attacked it by building a tree of all the potential additions, and then finding the cheapest option among those. Here is a working (albeit ugly and unoptimized) example:
https://gist.github.com/1288835
The tree will end up with p*w nodes, where p is the number of products and w(p) is the average number of warehouses per product.
Upvotes: 2
Reputation: 348
So it looks to me like you have a set cover problem, which is NP-complete. Not only that, but it appears that you want all the possible set covers of the minimum size. Your list of ids creates the universe set, and each of your codes creates a sub set of the universe.
For your simple iterative loops you don't really have a choice other then to try all possible combinations of codes. If you have large data sets this isn't going to work.
The wikipedia article mentions an efficient algorithm (greedy algorithm) that isn't guaranteed to find the best solution, but the error in the solution is bounded. This algorithm is to iteratively add the subset the contains the most uncovered elements of the universe.
If you need to get the absolutely correct answer you may want to try the Integer Programming approach (as in Linear Programming). There are software packages and libraries for this algorithm.
Upvotes: 0