Reputation: 4482
I found several posts about flattening/collapsing lists in Python, but none which cover this case:
Input:
[a_key_1, a_key_2, a_value_1, a_value_2]
[b_key_1, b_key_2, b_value_1, b_value_2]
[a_key_1, a_key_2, a_value_3, a_value_4]
[a_key_1, a_key_3, a_value_5, a_value_6]
Output:
[a_key_1, a_key_2, [a_value_1, a_value_3], [a_value_2, a_value_4]]
[b_key_1, b_key_2, [b_value_1], [b_value_2]]
[a_key_1, a_key_3, [a_value_5], [a_value_6]]
I want to flatten the lists so there is only one entry per unique set of keys and the remaining values are combined into nested lists next to those unique keys.
EDIT: The first two elements in the input will always be the keys; the last two elements will always be the values.
Is this possible?
Upvotes: 0
Views: 160
Reputation: 198334
from itertools import groupby

data = [
    ["a_key_1", "a_key_2", "a_value_1", "a_value_2"],
    ["b_key_1", "b_key_2", "b_value_1", "b_value_2"],
    ["a_key_1", "a_key_2", "a_value_3", "a_value_4"],
    ["a_key_1", "a_key_3", "a_value_5", "a_value_6"],
]

# The first two elements of each row form the grouping key.
keyfunc = lambda row: (row[0], row[1])

# groupby only merges adjacent rows, so sort by the key first.
# zip(*group) transposes each group; [2:] drops the key columns.
print([
    list(key) + [list(zipped) for zipped in list(zip(*group))[2:]]
    for key, group
    in groupby(sorted(data, key=keyfunc), keyfunc)
])
# => [['a_key_1', 'a_key_2', ['a_value_1', 'a_value_3'], ['a_value_2', 'a_value_4']],
#     ['a_key_1', 'a_key_3', ['a_value_5'], ['a_value_6']],
#     ['b_key_1', 'b_key_2', ['b_value_1'], ['b_value_2']]]
For more information check the Python Docs
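The sorted() call in the snippet above matters because groupby only merges rows whose keys are adjacent. Here is a minimal sketch of the difference, assuming Python 3; the names rows, unsorted_keys and sorted_keys are illustrative and not part of the answer:

```python
from itertools import groupby

rows = [
    ["a_key_1", "a_key_2", "a_value_1", "a_value_2"],
    ["b_key_1", "b_key_2", "b_value_1", "b_value_2"],
    ["a_key_1", "a_key_2", "a_value_3", "a_value_4"],
    ["a_key_1", "a_key_3", "a_value_5", "a_value_6"],
]


def keyfunc(row):
    # The first two elements of each row are the keys.
    return (row[0], row[1])


# Without sorting, the two ("a_key_1", "a_key_2") rows are not adjacent,
# so groupby reports that key twice.
unsorted_keys = [key for key, _ in groupby(rows, keyfunc)]

# After sorting by the same key function, equal keys sit next to each
# other and each key appears exactly once.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=keyfunc), keyfunc)]

print(len(unsorted_keys), len(sorted_keys))
```

One side effect is that the result follows sorted key order rather than the order in which keys first appear in the input.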
Upvotes: 1
Reputation: 424
Yes, it's possible. Here's a function (with doctest from your input/output) that performs the task:
#!/usr/bin/env python
"""Flatten lists as per http://stackoverflow.com/q/30387083/253599."""

from collections import OrderedDict


def flatten(key_length, *args):
    """
    Take lists having key elements and collect remainder into result.

    >>> flatten(1,
    ...         ['A', 'a1', 'a2'],
    ...         ['B', 'b1', 'b2'],
    ...         ['A', 'a3', 'a4'])
    [['A', ['a1', 'a2'], ['a3', 'a4']], ['B', ['b1', 'b2']]]

    >>> flatten(2,
    ...         ['A1', 'A2', 'a1', 'a2'],
    ...         ['B1', 'B2', 'b1', 'b2'],
    ...         ['A1', 'A2', 'a3', 'a4'],
    ...         ['A1', 'A3', 'a5', 'a6'])
    [['A1', 'A2', ['a1', 'a2'], ['a3', 'a4']], ['B1', 'B2', ['b1', 'b2']], ['A1', 'A3', ['a5', 'a6']]]
    """
    result = OrderedDict()
    for vals in args:
        result.setdefault(
            tuple(vals[:key_length]), [],
        ).append(vals[key_length:])

    return [
        list(key) + list(vals)
        for key, vals
        in result.items()
    ]


if __name__ == '__main__':
    import doctest
    doctest.testmod()
(Edited to work with both your original question and the edited question)
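Note that flatten keeps each input row's values together (['a1', 'a2'], ['a3', 'a4']), whereas the output in the question collects values by position. If the positional layout is what you need, here is a rough sketch of a variant, assuming every row under a key has the same number of values; flatten_columns is a hypothetical name, not part of the function above:

```python
from collections import OrderedDict


def flatten_columns(key_length, *rows):
    """Group rows by their key elements and collect values per position."""
    grouped = OrderedDict()
    for row in rows:
        # Same grouping step as flatten: the key tuple maps to value rows.
        grouped.setdefault(tuple(row[:key_length]), []).append(row[key_length:])

    # zip(*value_rows) transposes the value rows, so each value position
    # ends up in its own list.
    return [
        list(key) + [list(column) for column in zip(*value_rows)]
        for key, value_rows in grouped.items()
    ]


result = flatten_columns(
    2,
    ["a_key_1", "a_key_2", "a_value_1", "a_value_2"],
    ["b_key_1", "b_key_2", "b_value_1", "b_value_2"],
    ["a_key_1", "a_key_2", "a_value_3", "a_value_4"],
    ["a_key_1", "a_key_3", "a_value_5", "a_value_6"],
)
print(result)
```

Because the grouping uses an OrderedDict rather than a sort, keys come out in the order they first appear in the input, which matches the ordering shown in the question.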
Upvotes: 3