kabtron

Reputation: 33

Create dictionary based on matching terms from two other dictionaries - Python

I'm trying to compare two large dictionaries that describe the contents of product catalogs. Each dictionary maps a unique, coded key to a list of terms.

dict1 = {
"SKU001": ["Plumbing", "Pumps"], 
"SKU002": ["Motors"], 
"SKU003": ["Snow", "Blowers"], 
"SKU004": ["Pnuematic", "Hose", "Pumps"],
...
}

dict2 = {
"FAS001": ["Pnuematic", "Pumps"], 
"GRA001": ["Lawn", "Mowers"], 
"FAS002": ["Servo", "Motors"], 
"FAS003": ["Hose"], 
"GRA002": ["Snow", "Shovels"], 
"GRA003": ["Water", "Pumps"]
...
}

I want to create a new dictionary that borrows its keys from dict1 and whose values are lists of the keys from dict2 that share at least one term with that dict1 key. The ideal end result may resemble this:

match_dict = {
"SKU001": ["FAS001", "GRA003"], 
"SKU002": ["FAS002"], 
"SKU003": ["GRA002"], 
"SKU004": ["FAS001", "FAS003", "GRA003"], 
...
}

I'm having trouble creating this output, though. Is it possible to build a list of keys and assign it as the value of another key? I've made a few attempts using nested loops like the one below, but the output isn't what I want and I'm not sure it's even working properly. Any help is appreciated!

matches = {}
for key, values in dict1.items():
    for value in values:
        if value in dict2.values():
            matches[key] = value
print(matches)

Upvotes: 2

Views: 956

Answers (2)

Alain T.

Reputation: 42143

Assuming that the same term can appear under several keys in both dict1 and dict2, you need to build an intermediate multi-map dictionary (term to list of dict2 keys) and also de-duplicate the combined key list for each SKU:

# Multi-map: term -> list of dict2 keys that contain the term
mapDict = dict()
for prod, attributes in dict2.items():
    for attribute in attributes:
        mapDict.setdefault(attribute, []).append(prod)
# Per SKU, collect the set of dict2 keys sharing at least one term
matchDict = dict()
for sku, attributes in dict1.items():
    for attribute in attributes:
        matchDict.setdefault(sku, set()).update(mapDict.get(attribute, []))
# Convert each set to a sorted list
matchDict = {sku: sorted(prods) for sku, prods in matchDict.items()}

print(matchDict)

{'SKU001': ['FAS001', 'GRA003'], 'SKU002': ['FAS002'], 'SKU003': ['GRA002'], 'SKU004': ['FAS001', 'FAS003', 'GRA003']}
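
As a side note on why the attempt in the question produces nothing useful: `dict2.values()` yields whole lists, so testing a single string against it never succeeds, and `matches[key] = value` would overwrite rather than accumulate anyway. The following is a minimal sketch of that diagnosis on a cut-down, illustrative pair of dictionaries (one entry each, taken from the question's sample data); the loop variable names are my own.

```python
dict1 = {"SKU002": ["Motors"]}
dict2 = {"FAS002": ["Servo", "Motors"]}

# dict2.values() contains lists, so a plain string is never "in" it
print("Motors" in dict2.values())  # False

# Testing the terms against each list individually does find the match,
# and setdefault(...).append(...) accumulates keys instead of overwriting
matches = {}
for key1, terms1 in dict1.items():
    for key2, terms2 in dict2.items():
        if any(term in terms2 for term in terms1):
            matches.setdefault(key1, []).append(key2)
print(matches)  # {'SKU002': ['FAS002']}
```

This nested loop compares every dict1 key with every dict2 key, so for large catalogs the multi-map approach above is likely to scale better.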

Upvotes: 0

javidcf

Reputation: 59711

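This is one possible implementation. It first builds a reverse index of dict2 (term to keys) and then looks up every term of each dict1 entry in that index: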

dict1 = {
    "SKU001": ["Plumbing", "Pumps"], 
    "SKU002": ["Motors"], 
    "SKU003": ["Snow", "Blowers"], 
    "SKU004": ["Pnuematic", "Hose", "Pumps"],
}
dict2 = {
    "FAS001": ["Pnuematic", "Pumps"], 
    "GRA001": ["Lawn", "Mowers"], 
    "FAS002": ["Servo", "Motors"], 
    "FAS003": ["Hose"], 
    "GRA002": ["Snow", "Shovels"], 
    "GRA003": ["Water", "Pumps"]
}
match_dict_test = {
    "SKU001": ["FAS001", "GRA003"], 
    "SKU002": ["FAS002"], 
    "SKU003": ["GRA002"], 
    "SKU004": ["FAS001", "FAS003", "GRA003"], 
}

# Find keys for each item in dict2
dict2_reverse = {}
for k, v in dict2.items():
    for item in v:
        dict2_reverse.setdefault(item, []).append(k)
# Build dict of matches
match_dict = {}
for k, v in dict1.items():
    # Keys in dict2 associated to each item
    keys2 = (dict2_reverse.get(item, []) for item in v)
    # Save sorted list of keys from dict2 without repetitions
    match_dict[k] = sorted(set(k2i for k2 in keys2 for k2i in k2))
# Check result
print(match_dict == match_dict_test)
# True
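
If you want an independent check of the result, a brute-force version based on set intersection can serve as a cross-check. This is only a sketch: the function name `match_keys_bruteforce` is hypothetical, and it compares every pair of keys, so it is meant for verifying the reverse-index result on small samples rather than for large catalogs.

```python
def match_keys_bruteforce(dict1, dict2):
    """For each key in dict1, list the dict2 keys sharing at least one term."""
    # Convert dict2's term lists to sets once, so overlap tests are cheap
    sets2 = {k2: set(terms) for k2, terms in dict2.items()}
    return {
        k1: sorted(
            k2 for k2, terms2 in sets2.items()
            # isdisjoint is False as soon as one term is shared
            if not terms2.isdisjoint(terms1)
        )
        for k1, terms1 in dict1.items()
    }


sample1 = {"SKU003": ["Snow", "Blowers"]}
sample2 = {"GRA001": ["Lawn", "Mowers"], "GRA002": ["Snow", "Shovels"]}
print(match_keys_bruteforce(sample1, sample2))  # {'SKU003': ['GRA002']}
```

Unlike the reverse-index code, this keeps a dict1 key with an empty list when nothing matches, which may or may not be what you want.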

Upvotes: 2
