adohertyd

Reputation: 2689

Merging different dictionaries in Python

This is a long question, so please bear with me. I start out with 3 dicts obtained from 3 APIs. The dicts are structured like so:

API1 = {'results':[{'url':'www.site.com','title':'A great site','snippet':'This is a great site'},
{'url':'www.othersite.com','title':'Another site','snippet':'This is another site'},
{'url':'www.wiki.com','title':'A wiki site','snippet':'This is a wiki site'}]}

API2 = {'hits':[{'url':'www.dol.com','title':'The DOL site','snippet':'This is the dol site'},
{'url':'www.othersite.com','title':'Another site','snippet':'This is another site'},
{'url':'www.whatever.com','title':'Whatever site','snippet':'This is a site about whatever'}]}

API3 = {'output':[{'url':'www.dol.com','title':'The DOL site','snippet':'This is the dol site'},
{'url':'www.whatever.com','title':'Whatever site','snippet':'This is a site about whatever'},
{'url':'www.wiki.com','title':'A wiki site','snippet':'This is a wiki site'}]}

I extract the URLs from API1, API2 and API3 to do some processing. I do this because there is quite a bit of processing to be done and only the URLs are needed. When finished, I have a list of the URLs with the duplicates removed, and another list of scores matched to the URLs by position in the list:

URLlist = ['www.site.com','www.wiki.com','www.othersite.com','www.dol.com','www.whatever.com']

Results = [1.2, 6.5, 3.5, 2.1, 4.0]

I then created a new dictionary from these 2 lists using the zip() function:

ScoredResults = dict(zip(URLlist,Results))

{'www.site.com':1.2,'www.wiki.com':6.5, 'www.othersite.com':3.5, 'www.dol.com':2.1, 'www.whatever.com':4.0}

Now I need to link the URLs from ScoredResults with API1, API2 or API3 so that I end up with a new dictionary like so:

FullResults =
{'www.site.com':{'title':'A great site','snippet':'This is a great site','score':1.2},
 'www.othersite.com':{'title':'Another site','snippet':'This is another site','score':3.5},
...}

This is too difficult for me to do. If you look back on my question history I have been asking numerous dictionary questions but no implementation has worked so far. If anyone could please point me in the right direction I would very much appreciate it.

Upvotes: 1

Views: 144

Answers (4)

Pierre GM

Reputation: 20339

Would something like this work for you? It's rather basic: it constructs your final dictionary by looping over URLlist.

API1r = API1['results']
API2r = API2['hits']
API3r = API3['output']

FullResults = {}
for (U, R) in zip(URLlist, Results):
    FullResults[U] = {}
    for api in (API1r, API2r, API3r):
        for v in api:
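            k = dict(v)  # copy, so the original API entry keeps its 'url'
            if k.pop('url') == U:
                # In Python 3, k.items() is a view and cannot be added to a list,
                # so set the score on the copy before merging it in.
                k['score'] = R
                FullResults[U].update(k)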

Note that the same URL may be present in several of your APIs with different information, so we need to create the corresponding entry in FullResults beforehand. That makes it a bit tricky to simplify the loops. Let me know how it works.
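The loops above rescan every API list for each URL. A minimal sketch of one alternative, assuming the same list-of-dicts layout, is to index the entries by URL in a single pass and merge fields as they arrive. The data is abbreviated from the question, and the 'rank' field is made up purely to show two APIs contributing different information for the same URL.

```python
# Abbreviated sample data in the same shape as API1r/API2r above.
API1r = [{'url': 'www.site.com', 'title': 'A great site',
          'snippet': 'This is a great site'}]
API2r = [{'url': 'www.site.com', 'title': 'A great site',
          'rank': 3}]  # 'rank' is a hypothetical extra field
URLlist = ['www.site.com']
Results = [1.2]

# Index every entry by URL once; fields from later APIs are merged in.
by_url = {}
for api in (API1r, API2r):
    for entry in api:
        fields = {k: v for k, v in entry.items() if k != 'url'}
        by_url.setdefault(entry['url'], {}).update(fields)

# Attach the score to each scored URL.
FullResults = {}
for url, score in zip(URLlist, Results):
    FullResults[url] = dict(by_url.get(url, {}), score=score)
```

With this layout each URL costs one dict lookup instead of a scan over all three lists, and the original API entries are left untouched.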

Upvotes: 1

mgilson

Reputation: 309841

I would transform the API responses into something that is more meaningful for you. A dict keyed by URL is probably more appropriate:

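def transform_API(API):
    # Each API stores its list of results under a different key.
    list_of_dict = API.get('results', API.get('hits', API.get('output')))
    if list_of_dict is None:
        raise KeyError("results, hits or output not in API")
    d = {}
    for dct in list_of_dict:
        d[dct['url']] = dct
        # Note: this also removes 'url' from the original entry.
        dct.pop('url')
    return d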

API1=transform_API(API1)
API2=transform_API(API2)
API3=transform_API(API3)

master={}
for d in (API1,API2,API3):
    master.update(d)

urls = list(master.keys())
# get_scores_from_urls stands in for your own scoring step; it is not defined here.
scores = get_scores_from_urls(urls)

for k,score in zip(urls,scores):
    master[k]['score']=score
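Since transform_API pops 'url' out of the original entries, here is a rough sketch of a variant that leaves the inputs alone. The helper name index_by_url is made up for this example, the data is abbreviated from the question, and the scores are taken from a ScoredResults dict like the one in the question rather than from a scoring function.

```python
def index_by_url(entries):
    """Return {url: other fields} without modifying the input dicts."""
    return {e['url']: {k: v for k, v in e.items() if k != 'url'}
            for e in entries}

# Abbreviated sample data from the question.
API1 = {'results': [{'url': 'www.site.com', 'title': 'A great site',
                     'snippet': 'This is a great site'}]}
API2 = {'hits': [{'url': 'www.dol.com', 'title': 'The DOL site',
                  'snippet': 'This is the dol site'}]}
ScoredResults = {'www.site.com': 1.2, 'www.dol.com': 2.1}

# Merge the per-API indexes; a later API replaces an earlier entry for the same URL.
master = {}
for api, key in ((API1, 'results'), (API2, 'hits')):
    master.update(index_by_url(api[key]))

# Attach the scores by URL.
for url, score in ScoredResults.items():
    master[url]['score'] = score
```

Passing the list key explicitly avoids the nested get() calls, at the cost of having to know which key each API uses.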

Upvotes: 1

Jon Clements

Reputation: 142126

A quick attempt:

from itertools import chain

full_result = {}

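# Each API dict has a single key whose value is the list of result entries.
# (On Python 2, use d.itervalues() and the print statement instead.)
for blah in chain.from_iterable(d.values() for d in (API1, API2, API3)):
    for d in blah:
        full_result[d['url']] = {
            'title': d['title'],
            'snippet': d['snippet'],
            'score': ScoredResults[d['url']]
        }

print(full_result)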
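In the question every URL has a score, so the direct lookup in ScoredResults is fine. If that might not hold for your real data, a small sketch using dict.get avoids a KeyError; the unscored URL below is invented for the example and the data is otherwise abbreviated from the question.

```python
from itertools import chain

API1 = {'results': [{'url': 'www.site.com', 'title': 'A great site',
                     'snippet': 'This is a great site'}]}
API2 = {'hits': [{'url': 'www.unscored.com', 'title': 'No score',
                  'snippet': 'A made-up entry without a score'}]}
ScoredResults = {'www.site.com': 1.2}

full_result = {}
for entry in chain(API1['results'], API2['hits']):
    full_result[entry['url']] = {
        'title': entry['title'],
        'snippet': entry['snippet'],
        # .get() returns None instead of raising when the URL was never scored
        'score': ScoredResults.get(entry['url']),
    }
```

Whether None, 0 or skipping the entry is the right fallback depends on how the scores are used later.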

Upvotes: 1

eumiro

Reputation: 212835

With the given data…

Full_Results = {
    d['url']: {'title': d['title'],
               'snippet': d['snippet'],
               'score': ScoredResults[d['url']]}
    for d in API1['results'] + API2['hits'] + API3['output']
}

resulting in:

{'www.dol.com': {'score': 2.1,
  'snippet': 'This is the dol site',
  'title': 'The DOL site'},
 'www.othersite.com': {'score': 3.5,
  'snippet': 'This is another site',
  'title': 'Another site'},
 'www.site.com': {'score': 1.2,
  'snippet': 'This is a great site',
  'title': 'A great site'},
 'www.whatever.com': {'score': 4.0,
  'snippet': 'This is a site about whatever',
  'title': 'Whatever site'},
 'www.wiki.com': {'score': 6.5,
  'snippet': 'This is a wiki site',
  'title': 'A wiki site'}}

Upvotes: 1
