konrad

Reputation: 3706

Grouping and sorting lists using Python

I am trying to use one list to sort another and keep them synchronized at the same time:

keys = [x,x,x,y,y,x,x,z,z,z,x,x]
data = [1,2,3,4,5,6,7,8,9,10,11,12]

I want to use the keys list to organize the data list into subgroups of the same keys.

result = [[1,2,3,6,7,11,12],[4,5],[8,9,10]]

I also want to make sure that the list is sorted within each subgroup.

So far I have been able to get it all sorted properly:

group = []

data = sorted(zip(data, keys), key=lambda x: (x[1]))
for i, grp in groupby(data, lambda x: x[1]):
    sub_group = [], []
    for j in grp:
        sub_group.append(j[1])
    group.extend(sub_group)

What else am I missing? Thanks!

Upvotes: 1

Views: 1686

Answers (4)

Jeremy Allen

Reputation: 6614

Beware that OrderedDict keeps keys in insertion order; it does not sort them. If the keys do not first appear in the 'keys' list in the required order, the subgroups will not come out in the intended order.
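
To illustrate the caveat, here is a minimal sketch with a made-up input in which 'y' appears before 'x'. The input values and the names by_insertion and by_sorted_key are hypothetical, chosen only for this example.

```python
from collections import OrderedDict

# Hypothetical input: 'y' is seen before 'x'
keys = ['y', 'x', 'y', 'x']
data = [4, 1, 5, 2]

dct = OrderedDict()
for key, val in zip(keys, data):
    dct.setdefault(key, []).append(val)

# Subgroups in the order the keys were first seen
by_insertion = list(dct.values())
# Subgroups ordered by sorted key instead
by_sorted_key = [dct[key] for key in sorted(dct)]

print(by_insertion)   # [[4, 5], [1, 2]]
print(by_sorted_key)  # [[1, 2], [4, 5]]
```

Sorting the keys when building the final list gives the same subgroup order no matter how the input is arranged.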

My solution:

from collections import defaultdict

keys = ['x', 'x', 'x', 'y', 'y', 'x', 'x', 'z', 'z', 'z', 'x', 'x']
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

# group 'data' values by 'key'
grouped = defaultdict(list)
for key, value in zip(keys, data):  # 'value' avoids rebinding the 'data' list
    grouped[key].append(value)

# construct the final list of subgroups
# contents of each subgroup must be sorted
# also sorting the keys so that the 'x' subgroup comes before the 'y' subgroup etc
grouped_and_ordered = [sorted(grouped[key]) for key in sorted(grouped.keys())]

Upvotes: 0

emesday

Reputation: 6186

You are almost there. Try this code:

group = []
data = sorted(zip(data, keys), key=lambda x: (x[1]))
for i, grp in groupby(data, lambda x: x[1]):
    group.append([item[0] for item in grp])

Each grp yields (data, key) pairs, so you need to pick the data out of each pair with [item[0] for item in grp].
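
To see what groupby actually hands back, a small sketch may help. It uses string keys and a shortened input, both made up for illustration. Note that groupby only merges adjacent items with equal keys, which is why the pairs are sorted by key first.

```python
from itertools import groupby

# Shortened, hypothetical input with string keys
keys = ['x', 'y', 'x', 'z']
data = [1, 2, 3, 4]

# Sort the (data, key) pairs by key so equal keys become adjacent
pairs = sorted(zip(data, keys), key=lambda pair: pair[1])

seen = []
for key, grp in groupby(pairs, key=lambda pair: pair[1]):
    members = list(grp)  # grp is a one-shot iterator of (data, key) pairs
    seen.append((key, members))
    print(key, members)

# Keep only the data element of each pair
groups = [[item[0] for item in members] for _, members in seen]
print(groups)  # [[1, 3], [2], [4]]
```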

UPDATED

This is the complete code I used for the answer:

from itertools import groupby

x, y, z = range(3)
keys = [x,x,x,y,y,x,x,z,z,z,x,x]
data = [1,2,3,4,5,6,7,8,9,10,11,12]

group = []
data = sorted(zip(data, keys), key=lambda x: (x[1]))
for i, grp in groupby(data, lambda x: x[1]):
    group.append([item[0] for item in grp])

print(group)

Upvotes: 2

wwii

Reputation: 23753

OrderedDict may well be a better option, but here is a groupby version:

import itertools as it
from operator import itemgetter
x = 1
y = 2
z = 3
keys = [x,x,x,y,y,x,x,z,z,z,x,x]
data = [1,2,3,4,5,6,7,8,9,10,11,12]
key = itemgetter(1)
value = itemgetter(0)

data = sorted(zip(data, keys), key=key)
print([list(map(value, grp)) for k, grp in it.groupby(data, key)])
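
One caveat: sorting by key alone keeps each subgroup sorted only because sorted is stable and data already happens to be in ascending order. If data were not pre-sorted, one option is to sort by key first and value second. A minimal sketch with a made-up, unsorted input, assuming the keys and values are each mutually comparable:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input where data is NOT already in ascending order
keys = ['x', 'y', 'x', 'y', 'x']
data = [9, 4, 1, 2, 5]

# Tuples compare element by element, so this sorts by key, then by value
pairs = sorted(zip(keys, data))

groups = [[value for _, value in grp]
          for _, grp in groupby(pairs, key=itemgetter(0))]
print(groups)  # [[1, 5, 9], [2, 4]]
```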

Upvotes: 1

user2555451

Reputation:

It would be much simpler if you used collections.OrderedDict and its setdefault method:

from collections import OrderedDict

# To demonstrate, I made the keys into strings
keys = ['x', 'x', 'x', 'y', 'y', 'x', 'x', 'z', 'z', 'z', 'x', 'x']
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

dct = OrderedDict()
for key,val in zip(keys, data):
    dct.setdefault(key, []).append(val)

print(dct)
print(list(dct.values()))

Output:

OrderedDict([('x', [1, 2, 3, 6, 7, 11, 12]), ('y', [4, 5]), ('z', [8, 9, 10])])
[[1, 2, 3, 6, 7, 11, 12], [4, 5], [8, 9, 10]]
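
On Python 3.7 and later the built-in dict also preserves insertion order, so the same idea can be sketched without OrderedDict. Sorting each subgroup is an addition here, included because the question asks for sorted subgroups; the names groups and result are just for this example.

```python
keys = ['x', 'x', 'x', 'y', 'y', 'x', 'x', 'z', 'z', 'z', 'x', 'x']
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

# A plain dict keeps keys in first-seen order on Python 3.7+
groups = {}
for key, val in zip(keys, data):
    groups.setdefault(key, []).append(val)

# Sort inside each subgroup; subgroup order follows first appearance of each key
result = [sorted(vals) for vals in groups.values()]
print(result)  # [[1, 2, 3, 6, 7, 11, 12], [4, 5], [8, 9, 10]]
```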

Upvotes: 2
