Find duplicate from list and sum

Question

data = [(0, 0, {'product_id': 6, 'qty': 1.0}), (0, 0, {'product_id': 8, 'qty': 1.0}), (0, 0, {'product_id': 7, 'qty': 2.0}), (0, 0, {'product_id': 6, 'qty': 1.0}), (0, 0, {'product_id': 8, 'qty': 1.0}), (0, 0, {'product_id': 7, 'qty': 2.0})]

I have this list. What I want to do is find repeated product ids, sum their qty, and remove the duplicate product id elements from the list.

The output list should be:

 new_data = [(0, 0, {'product_id': 6, 'qty': 2.0}), (0, 0, {'product_id': 8, 'qty': 2.0}), (0, 0, {'product_id': 7, 'qty': 4.0})]

Josh Parnell · Accepted Answer

I think the simplest way is to build a dictionary (map) for your product ids, extract your data into that dictionary, then build your new data list. Example:

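from collections import defaultdict

def mergeQty(data):
  # Accumulate qty per (x, y, product_id) key; missing keys start at 0.0
  qtyMap = defaultdict(float)
  for x, y, product in data:
    pid = product['product_id']  # 'pid' avoids shadowing the built-in id()
    qty = product['qty']
    qtyMap[(x, y, pid)] += qty

  # Rebuild the list in the original tuple format; items() works on Python 2 and 3
  return [(x, y, {'product_id': pid, 'qty': qty}) for (x, y, pid), qty in qtyMap.items()]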

Note that this will not merge products whose first two values differ (in your example they are all 0's, and we can only guess at what those mean).
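
If you do want to merge on product_id alone, regardless of the first two values, something like the following could work. This is only a sketch under an assumption: since we don't know what those two values mean, it simply keeps the pair from the first occurrence of each product. The function name merge_qty_by_product is made up for illustration.

```python
from collections import defaultdict

def merge_qty_by_product(data):
    """Sum qty per product_id, ignoring the first two tuple values."""
    totals = defaultdict(float)  # product_id -> summed qty
    first_seen = {}              # product_id -> (x, y) of first occurrence (assumption)
    for x, y, product in data:
        pid = product['product_id']
        totals[pid] += product['qty']
        first_seen.setdefault(pid, (x, y))
    # Rebuild entries in the same (x, y, {...}) shape as the input
    return [(x, y, {'product_id': pid, 'qty': totals[pid]})
            for pid, (x, y) in first_seen.items()]

data = [(0, 0, {'product_id': 6, 'qty': 1.0}), (0, 0, {'product_id': 8, 'qty': 1.0}),
        (0, 0, {'product_id': 7, 'qty': 2.0}), (0, 0, {'product_id': 6, 'qty': 1.0}),
        (0, 0, {'product_id': 8, 'qty': 1.0}), (0, 0, {'product_id': 7, 'qty': 2.0})]

new_data = merge_qty_by_product(data)
print(new_data)
```

On Python 3.7+ dictionaries keep insertion order, so the products come out in the order they first appear in data; on older versions the order is not guaranteed.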

EDIT: Thanks to Azat for the defaultdict suggestion.

EDIT: Keeping unknown fields x and y intact as per kuro's suggestion.
