bhux

Reputation: 43

Combine two lists of dicts, adding the values together

I want to combine two lists of dicts into a new list of dicts: a dict that appears in only one list should be appended as-is, and the 'views' values should be added together when the same entry appears in both lists.

a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]

b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]

And the desired output would be:

c = [{'title': 'Learning How to Program', 'views': 8,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 5,'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]

I found "Is there any pythonic way to combine two dicts (adding values for keys that appear in both)?", but I do not understand how to get the desired output in my situation, where I have two lists of multiple dicts.

Upvotes: 3

Views: 1702

Answers (7)

Mike

Reputation: 466

If you don't want to hardcode the key names "title" and "views", a more general way is to write it like this:

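def combing(x):
    result = {}
    for i in x:
        # Relies on each dict listing the identifier first and the count second
        # (dicts preserve insertion order on Python 3.7+).
        h = list(i.values())  # dict views are not indexable on Python 3
        result[h[0]] = result.get(h[0], 0) + h[1]
    return result

combing([{'item': 'item1', 'amount': 400}, {'item': 'item2', 'amount': 300},
         {'item': 'item1', 'amount': 750}])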

Upvotes: 0

Martijn Pieters

Reputation: 1121744

You need to convert your input dictionaries to (title: count) pairs, using them as keys and values in a Counter; then after summing, you can convert these back to your old format:

from collections import Counter

summed = sum((Counter({elem['title']: elem['views']}) for elem in a + b), Counter())
c = [{'title': title, 'views': counts} for title, counts in summed.items()]

Demo:

>>> from collections import Counter
>>> a = [{'title': 'Learning How to Program', 'views': 1},
...      {'title': 'Mastering Programming', 'views': 3}]
>>> b = [{'title': 'Learning How to Program', 'views': 7},
...      {'title': 'Mastering Programming', 'views': 2},
...      {'title': 'Programming Fundamentals', 'views': 1}]
>>> summed = sum((Counter({elem['title']: elem['views']}) for elem in a + b), Counter())
>>> summed
Counter({'Learning How to Program': 8, 'Mastering Programming': 5, 'Programming Fundamentals': 1})
>>> [{'title': title, 'views': counts} for title, counts in summed.items()]
[{'views': 8, 'title': 'Learning How to Program'}, {'views': 5, 'title': 'Mastering Programming'}, {'views': 1, 'title': 'Programming Fundamentals'}]

The goal here is to have a unique identifier per count. If your dictionaries are more complex, you either need to convert the whole dictionary (minus the count) to a unique identifier, or pick one of the values from the dictionary to be that identifier. Then sum the view counts per identifier.
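As a rough sketch of the first option (turning the whole dictionary, minus the count, into the identifier), something like the following could work. It assumes every value other than 'views' is hashable; the helper name sum_by_identity and its count_key parameter are made up for this illustration.

```python
from collections import Counter

def sum_by_identity(entries, count_key='views'):
    """Sum count_key over entries whose other key/value pairs are identical."""
    totals = Counter()
    for entry in entries:
        # Everything except the count, frozen into a hashable tuple.
        # Sorting by key makes the identifier independent of key order.
        identity = tuple(sorted((k, v) for k, v in entry.items() if k != count_key))
        totals[identity] += entry[count_key]
    # Rebuild each dict from its identifier and put the summed count back in.
    return [dict(identity, **{count_key: total}) for identity, total in totals.items()]

a = [{'title': 'Learning How to Program', 'views': 1, 'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3, 'url': '/7XqR', 'slug': 'mastering-programming'}]
b = [{'title': 'Learning How to Program', 'views': 7, 'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2, 'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1, 'url': '/93hB', 'slug': 'programming-fundamentals'}]

c = sum_by_identity(a + b)
```

The trade-off is that two entries are only merged when all of their other fields match exactly, so a title that differs in capitalisation would produce two separate results.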

From your updated example, the URL would be a good identifier. That'd let you collect the view count in place:

per_url = {}
for entry in a + b:
    key = entry['url']
    if key not in per_url:
        per_url[key] = entry.copy()
    else:
        per_url[key]['views'] += entry['views']

c = per_url.values()  # use list(per_url.values()) on Python 3

This simply uses the dictionaries themselves (or at least a copy of the first one encountered) to sum the view counts:

>>> from pprint import pprint
>>> a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]
>>> b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
...      {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]
>>> per_url = {}
>>> for entry in a + b:
...     key = entry['url']
...     if key not in per_url:
...         per_url[key] = entry.copy()
...     else:
...         per_url[key]['views'] += entry['views']
... 
>>> per_url
{'/93hB': {'url': '/93hB', 'title': 'Programming Fundamentals', 'slug': 'programming-fundamentals', 'views': 1}, '/4XvR': {'url': '/4XvR', 'title': 'Learning How to Program', 'slug': 'learning-how-to-program', 'views': 8}, '/7XqR': {'url': '/7XqR', 'title': 'Mastering Programming', 'slug': 'mastering-programming', 'views': 5}}
>>> pprint(per_url.values())
[{'slug': 'programming-fundamentals',
  'title': 'Programming Fundamentals',
  'url': '/93hB',
  'views': 1},
 {'slug': 'learning-how-to-program',
  'title': 'Learning How to Program',
  'url': '/4XvR',
  'views': 8},
 {'slug': 'mastering-programming',
  'title': 'Mastering Programming',
  'url': '/7XqR',
  'views': 5}]

Upvotes: 2

sirfz

Reputation: 4277

A simple function that does what you need for any given number of lists:

import itertools
from collections import Counter, OrderedDict

def sum_views(*lists):
    views = Counter()
    docs = OrderedDict()  # to preserve input order
    for doc in itertools.chain(*lists):
        slug = doc['slug']
        views[slug] += doc['views']
        docs[slug] = dict(doc)   # shallow copy of original dict
        docs[slug]['views'] = views[slug]
    return list(docs.values())  # a real list on Python 3 as well
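A quick usage sketch with three lists, to illustrate the "any given number of lists" part. The function is repeated so the snippet runs on its own, with the result wrapped in list() for Python 3; the third list, extra, is made-up data that is not in the question.

```python
import itertools
from collections import Counter, OrderedDict

def sum_views(*lists):
    views = Counter()
    docs = OrderedDict()  # keeps the order in which slugs are first seen
    for doc in itertools.chain(*lists):
        slug = doc['slug']
        views[slug] += doc['views']
        docs[slug] = dict(doc)             # shallow copy, inputs stay untouched
        docs[slug]['views'] = views[slug]  # overwrite with the running total
    return list(docs.values())

a = [{'title': 'Learning How to Program', 'views': 1, 'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3, 'url': '/7XqR', 'slug': 'mastering-programming'}]
b = [{'title': 'Learning How to Program', 'views': 7, 'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2, 'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1, 'url': '/93hB', 'slug': 'programming-fundamentals'}]
# Hypothetical third list, only here to show that more than two inputs work.
extra = [{'title': 'Mastering Programming', 'views': 10, 'url': '/7XqR', 'slug': 'mastering-programming'}]

result = sum_views(a, b, extra)
```

Because each stored entry is a copy, the dicts in a, b and extra keep their original view counts.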

Upvotes: 0

Stefan Pochmann

Reputation: 28596

Here's a simple one. Walks over all entries, copies an entry the first time it's encountered, and adds the views in subsequent encounters:

summary = {}    
for entry in a + b:
    key = entry['url']
    if key not in summary:
        summary[key] = entry.copy()
    else:
        summary[key]['views'] += entry['views']
c = list(summary.values())

Upvotes: 1

Ronoaldo Pereira

Reputation: 667

Non-optimal, but works:

>>> from collections import Counter
>>> from pprint import pprint
>>> a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]
>>> b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
...      {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]
>>> summed = sum((Counter({x['slug']: x['views']}) for x in a+b), Counter())
>>> c = dict()
>>> _ = [c.update({x['slug']: x}) for x in a + b]
>>> _ = [c[x].update({'views': summed[x]}) for x in c.keys()]
>>> pprint(c.values())
[{'slug': 'mastering-programming',
  'title': 'Mastering Programming',
  'url': '/7XqR',
  'views': 5},
 {'slug': 'programming-fundamentals',
  'title': 'Programming Fundamentals',
  'url': '/93hB',
  'views': 1},
 {'slug': 'learning-how-to-program',
  'title': 'Learning How to Program',
  'url': '/4XvR',
  'views': 8}]

Based on the Counter idea from Martijn with some more iterations to update the counter values with the other attributes, assuming they don't change.

Note that the list comprehensions are run only for their side effects, so they are really loops in disguise, and the input dicts stored in c are updated in place.

Upvotes: 0

Farmer Joe

Reputation: 6070

It might not be the most pythonic solution:

def coalesce(d1, d2):
    combined = [dict(i) for i in d1]  # copy each dict so d1 is not modified
    for d in d2:
        found = False
        for itr in combined:
            if itr['title'] == d['title']:
                itr['views'] += d['views']
                found = True
                break
        if not found:
            combined.append(dict(d))  # new title: append a copy
    return combined

Upvotes: 0

Tuan Anh Hoang-Vu

Reputation: 1995

First, you need to convert your inputs into dicts, for example

b = {'Learning How to Program': 7,
     'Mastering Programming': 2,
     'Programming Fundamentals': 1}

After that, apply the solution you found, then convert the result back to a list of dicts.
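Put together, the three steps could look roughly like this. This is only a sketch: it assumes the linked solution is the Counter addition that Martijn's answer is also based on, that titles are unique within each list, and that only 'title' and 'views' need to survive the round trip.

```python
from collections import Counter

a = [{'title': 'Learning How to Program', 'views': 1},
     {'title': 'Mastering Programming', 'views': 3}]
b = [{'title': 'Learning How to Program', 'views': 7},
     {'title': 'Mastering Programming', 'views': 2},
     {'title': 'Programming Fundamentals', 'views': 1}]

# Step 1: reduce each list to a {title: views} mapping.
views_a = Counter({entry['title']: entry['views'] for entry in a})
views_b = Counter({entry['title']: entry['views'] for entry in b})

# Step 2: adding two Counters sums the values key by key.
# (Counter addition drops totals that are zero or negative.)
total = views_a + views_b

# Step 3: convert the summed mapping back to a list of dicts.
c = [{'title': title, 'views': views} for title, views in total.items()]
```

Any other fields, such as 'url' and 'slug', are lost in step 1, so this fits best when the title and the count are all you need.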

Upvotes: 1
