Reputation: 33
I'm trying to compare two large dictionaries that describe the contents of product catalogs. Each dictionary consists of a unique, coded key and a list of terms for each key.
dict1 = {
"SKU001": ["Plumbing", "Pumps"],
"SKU002": ["Motors"],
"SKU003": ["Snow", "Blowers"],
"SKU004": ["Pnuematic", "Hose", "Pumps"],
...
}
dict2 = {
"FAS001": ["Pnuematic", "Pumps"],
"GRA001": ["Lawn", "Mowers"],
"FAS002": ["Servo", "Motors"],
"FAS003": ["Hose"],
"GRA002": ["Snow", "Shovels"],
"GRA003": ["Water", "Pumps"]
...
}
I want to create a new dictionary that borrows the keys from dict1 and whose values are a list of keys from dict2 where at least one of their term values match. The ideal end result may resemble this:
match_dict = {
"SKU001": ["FAS001", "GRA003"],
"SKU002": ["FAS002"],
"SKU003": ["GRA002"],
"SKU004": ["FAS001", "FAS003", "GRA003],
...
}
I'm having issues creating this output though. Is it possible to create a list of keys and assign it as a value to another key? I've made a few attempts using nested loops like below, but the output isn't as desired and I'm unsure if it's even working properly. Any help is appreciated!
matches = {}
for key, values in dict1.items():
for value in values:
if value in dict2.values():
matches[key] = value
print(matches)
Upvotes: 2
Views: 956
Reputation: 42143
Assuming that dict1 and dict2 can have duplicate value entries, you would need to build an intermediate multi-map dictionary and also handle uniqueness of the expanded value list for each SKU:
mapDict = dict()
for prod,attributes in dict2.items():
for attribute in attributes:
mapDict.setdefault(attribute,[]).append(prod)
matchDict = dict()
for sku,attributes in dict1.items():
for attribute in attributes:
matchDict.setdefault(sku,set()).update(mapDict.get(attribute,[]))
matchDict = { sku:sorted(prods) for sku,prods in matchDict.items() }
print(matchDict)
{'SKU001': ['FAS001', 'GRA003'], 'SKU002': ['FAS002'], 'SKU003': ['GRA002'], 'SKU004': ['FAS001', 'FAS003', 'GRA003']}
Upvotes: 0
Reputation: 59711
This is one possible implementation:
dict1 = {
"SKU001": ["Plumbing", "Pumps"],
"SKU002": ["Motors"],
"SKU003": ["Snow", "Blowers"],
"SKU004": ["Pnuematic", "Hose", "Pumps"],
}
dict2 = {
"FAS001": ["Pnuematic", "Pumps"],
"GRA001": ["Lawn", "Mowers"],
"FAS002": ["Servo", "Motors"],
"FAS003": ["Hose"],
"GRA002": ["Snow", "Shovels"],
"GRA003": ["Water", "Pumps"]
}
match_dict_test = {
"SKU001": ["FAS001", "GRA003"],
"SKU002": ["FAS002"],
"SKU003": ["GRA002"],
"SKU004": ["FAS001", "FAS003", "GRA003"],
}
# Find keys for each item in dict2
dict2_reverse = {}
for k, v in dict2.items():
for item in v:
dict2_reverse.setdefault(item, []).append(k)
# Build dict of matches
match_dict = {}
for k, v in dict1.items():
# Keys in dict2 associated to each item
keys2 = (dict2_reverse.get(item, []) for item in v)
# Save sorted list of keys from dict2 without repetitions
match_dict[k] = sorted(set(k2i for k2 in keys2 for k2i in k2))
# Check result
print(match_dict == match_dict_test)
# True
Upvotes: 2