erp

Reputation: 3014

Aggregate unique values in list to new list

I'm trying to find the best way to aggregate values (value pairs) from a list in Python.

foo = [
    {'color': 'yellow', 'type': 'foo'},
    {'color': 'yellow', 'type': 'bar'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'green', 'type': 'foo'},
    {'color': 'red', 'type': 'bar'}
]

The end goal is something like:

newFoo = [
    {'color': 'yellow', 'type': 'foo', 'count': 1},
    {'color': 'yellow', 'type': 'bar', 'count': 1},
    {'color': 'red', 'type': 'foo', 'count': 2},
    {'color': 'red', 'type': 'bar', 'count': 1},
    {'color': 'green', 'type': 'foo', 'count': 1}
]

I'm not very good with Python. I have been trying to accomplish this, but this is about as far as I can get:

def loop(ar):
    dik = []
    for line in ar:
        blah = []

        for k,v in line.items():
            blah.append({k,v})
        blah.append({'count':'1'})
    dik.append(blah)
    print(dik)

Any help is appreciated.

Upvotes: 1

Views: 211

Answers (4)

Jeremy Kahan

Reputation: 3826

I tried to work with your original code as much as I could. What I added was to sort things and then track whether each item matched the preceding one.

    # list sort/count routine
    def loop(ar):
        dik = []
        # Sort by both keys so we need only check the preceding item for a repeat.
        # Dicts cannot be compared with < in Python 3, so an explicit key is required.
        # This sorts the caller's list in place, which we believe is harmless.
        ar.sort(key=lambda d: (d['color'], d['type']))
        blah = {'color': '', 'type': '', 'count': 0}  # initialize blah to something that will not match
        for line in ar:
            if blah['color'] == line['color'] and blah['type'] == line['type']:
                blah['count'] += 1  # still accumulating count in blah
            else:  # first of this one
                if blah['color'] != '':  # add previous one, if any
                    dik.append(blah)
                blah = {'color': line['color'], 'type': line['type'], 'count': 1}
        if blah['color'] != '':  # add the last one
            dik.append(blah)
        return dik

    foo = [
        {'color': 'yellow', 'type': 'foo'},
        {'color': 'yellow', 'type': 'bar'},
        {'color': 'red', 'type': 'foo'},
        {'color': 'red', 'type': 'foo'},
        {'color': 'green', 'type': 'foo'},
        {'color': 'red', 'type': 'bar'}
    ]

    newFoo = loop(foo)
    print(newFoo)
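
The sort-then-compare-with-the-previous-item idea is also what `itertools.groupby` does for adjacent equal keys. A minimal sketch, assuming every record has both a `color` and a `type` key (the function name `count_sorted` is made up for illustration):

```python
from itertools import groupby

def count_sorted(records):
    """Count identical (color, type) pairs by sorting, then grouping neighbours."""
    key = lambda d: (d['color'], d['type'])
    result = []
    # groupby only merges *adjacent* equal keys, so sort by the same key first.
    # sorted() returns a new list, leaving the caller's list untouched.
    for (color, type_), group in groupby(sorted(records, key=key), key=key):
        result.append({'color': color, 'type': type_, 'count': sum(1 for _ in group)})
    return result

foo = [
    {'color': 'yellow', 'type': 'foo'},
    {'color': 'yellow', 'type': 'bar'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'green', 'type': 'foo'},
    {'color': 'red', 'type': 'bar'},
]

new_foo = count_sorted(foo)
print(new_foo)
```

The result comes back ordered by color and then type, not in the order the pairs first appeared.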

Upvotes: 0

Saelyth

Reputation: 1734

Haha, this took me longer than I want to admit, and there are a lot of better answers, but I did this the old-fashioned way, and maybe it helps you understand how to achieve it without fancy libraries.

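# Copy the list (and each dict in it) before making any checks,
# so that adding 'count' keys does not modify the original foo.
# A plain `new_foo = foo` would only create a second name for the same list.
new_foo = [dict(item) for item in foo]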

for old in foo: # for each item in the old list
    for new in new_foo: # we make a check to find that item in the new one
        if old['type'] == new['type'] and old['color'] == new['color']: # and if those 2 keys match
            if 'count' not in new: # we try to find the count key
                new['count'] = 1 # add it if it wasn't found
            else:
                new['count'] = new['count'] + 1 # sum 1 if it was found
            break # and then stop looking, break the 2nd loop.

That should add counts on every item that we want to count. However, it leaves the repeated ones without a count key.

{'color': 'yellow', 'type': 'foo', 'count': 1}
{'color': 'yellow', 'type': 'bar', 'count': 1}
{'color': 'red', 'type': 'foo', 'count': 2}
{'color': 'red', 'type': 'foo'}
{'color': 'green', 'type': 'foo', 'count': 1}
{'color': 'red', 'type': 'bar', 'count': 1}

Because we copied the whole list at the start, those repeats still exist in our new list, so let's use the missing count key to filter them out.

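# Build a filtered list instead of calling remove() while iterating,
# which can skip the item that follows each removed one.
new_foo = [item for item in new_foo if 'count' in item]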

Result:

{'color': 'yellow', 'type': 'foo', 'count': 1}
{'color': 'yellow', 'type': 'bar', 'count': 1}
{'color': 'red', 'type': 'foo', 'count': 2}
{'color': 'green', 'type': 'foo', 'count': 1}
{'color': 'red', 'type': 'bar', 'count': 1}

I am aware that there are better answers, but I think understanding the basics is important before dealing with advanced techniques. We can check for a key in a dict and add a key to a dict easily this way:

if 'my_made_up_key' in my_dict: # check if exists

my_dict['my_made_up_key'] = my_value # add new key to a dict
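
Those two dict operations are enough to do the whole count in a single pass. A minimal sketch without any imports, assuming each record has `color` and `type` keys (the name `count_pairs` is hypothetical):

```python
def count_pairs(records):
    """Count (color, type) pairs in one pass using a plain dict."""
    counts = {}  # maps (color, type) -> number of times seen so far
    for record in records:
        key = (record['color'], record['type'])
        # get() returns 0 the first time a key is seen
        counts[key] = counts.get(key, 0) + 1
    # Dicts keep insertion order (Python 3.7+), so pairs come out in
    # first-seen order.
    return [{'color': c, 'type': t, 'count': n} for (c, t), n in counts.items()]

foo = [
    {'color': 'yellow', 'type': 'foo'},
    {'color': 'yellow', 'type': 'bar'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'green', 'type': 'foo'},
    {'color': 'red', 'type': 'bar'},
]

new_foo = count_pairs(foo)
print(new_foo)
```

Because the counts live in a separate dict, the original records are never modified and no cleanup step is needed afterwards.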

Upvotes: 0

Andrej Kesely

Reputation: 195448

You can use Counter from collections:

from collections import Counter
from pprint import pprint

foo = [
    {'color': 'yellow', 'type': 'foo'},
    {'color': 'yellow', 'type': 'bar'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'green', 'type': 'foo'},
    {'color': 'red', 'type': 'bar'}
]

c = Counter((i['color'], i['type']) for i in foo)
pprint([{'color': k[0], 'type': k[1], 'count': v} for k, v in c.items()])

Output:

[{'color': 'yellow', 'count': 1, 'type': 'foo'},
 {'color': 'yellow', 'count': 1, 'type': 'bar'},
 {'color': 'red', 'count': 2, 'type': 'foo'},
 {'color': 'green', 'count': 1, 'type': 'foo'},
 {'color': 'red', 'count': 1, 'type': 'bar'}]

Edit:

If you want to sort the new list, you can do something like this:

newFoo = [{'color': k[0], 'type': k[1], 'count': v} for k, v in c.items()]
l = sorted(newFoo, key=lambda v: (v['color'], v['type']), reverse=True)
pprint(l)

This will print:

[{'color': 'yellow', 'count': 1, 'type': 'foo'},
 {'color': 'yellow', 'count': 1, 'type': 'bar'},
 {'color': 'red', 'count': 2, 'type': 'foo'},
 {'color': 'red', 'count': 1, 'type': 'bar'},
 {'color': 'green', 'count': 1, 'type': 'foo'}]

Edit:

Thanks to @MadPhysicist, you can generalize the above example:

c = Counter(tuple(i.items()) for i in foo)
pprint([{**dict(k), 'count': v} for k, v in c.items()])
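
One caveat with using `i.items()` as the key: two dicts holding the same pairs in a different insertion order produce different tuples and would be counted separately. A possible variation, assuming all keys are strings and all values are hashable (the name `count_records` is made up for this sketch), sorts the items first:

```python
from collections import Counter

def count_records(records):
    """Count identical dicts regardless of the order their keys were inserted."""
    # Sorting each record's items gives the same tuple for equal dicts.
    counts = Counter(tuple(sorted(d.items())) for d in records)
    return [{**dict(key), 'count': n} for key, n in counts.items()]

records = [
    {'color': 'red', 'type': 'foo'},
    {'type': 'foo', 'color': 'red'},   # same pairs, different key order
    {'color': 'green', 'type': 'foo'},
]

result = count_records(records)
print(result)
```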

Upvotes: 2

bphi

Reputation: 3195

Here's an easy option if you don't mind duplicates. If you want only one record, Andrej's answer with Counter is great.

>>> newFoo = [dict(d, count=foo.count(d)) for d in foo]
>>> newFoo

[{'color': 'yellow', 'type': 'foo', 'count': 1}, 
 {'color': 'yellow', 'type': 'bar', 'count': 1}, 
 {'color': 'red', 'type': 'foo', 'count': 2}, 
 {'color': 'red', 'type': 'foo', 'count': 2}, 
 {'color': 'green', 'type': 'foo', 'count': 1},
 {'color': 'red', 'type': 'bar', 'count': 1}]
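
Note that `foo.count(d)` rescans the whole list for every record, which is quadratic in the list length. For longer lists, a sketch that keeps the same one-output-per-input shape but tallies the counts once up front (assuming the `color` and `type` keys from the question) could look like this:

```python
from collections import Counter

foo = [
    {'color': 'yellow', 'type': 'foo'},
    {'color': 'yellow', 'type': 'bar'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'red', 'type': 'foo'},
    {'color': 'green', 'type': 'foo'},
    {'color': 'red', 'type': 'bar'},
]

# Tally every (color, type) pair in a single pass over the list.
counts = Counter((d['color'], d['type']) for d in foo)

# Attach the tally to a copy of each record; duplicates are kept on purpose.
new_foo = [dict(d, count=counts[(d['color'], d['type'])]) for d in foo]
print(new_foo)
```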

Upvotes: 1
