lte__

Reputation: 7576

Python - Group lists by one element

I'm trying to process data from an input file. Each line contains three values separated by whitespace. I want to add them to a list, grouped by the second value. For example, given the input:

qwe rty 12
asd fgh 34
zxc rty 96

and I want it to be stored in a variable like this:

variable =
[[[qwe, rty, 12], [zxc, rty, 96]],
[[asd, fgh, 34]]]

This is so that I can access it like this:

variable[0] # this should be [[qwe, rty, 12], [zxc, rty, 96]]
variable[1] # this should be [[asd, fgh, 34]]

This is what I've tried:

f = open('input_1.txt')
values = [] # to keep track of which values have occurred before
data = []
for line in f:
    ldata = lineprocess(line) #this transforms the raw data to [qwe, rty, 12] etc.
    if ldata[1] in values:
        data[values.index(ldata[1])].append(ldata)
    elif ldata[1] not in values :
        values.append(ldata[1])
        data.append(ldata)

This, however, returns a list like this:

[['qwe', 'rty', 12, ['zxc', 'rty', 96]],
 ['asd', 'fgh', 34]]

What should I do to get

[[['qwe', 'rty', 12], ['zxc', 'rty', 96]],
 [['asd', 'fgh', 34]]] 

instead?

Upvotes: 0

Views: 56

Answers (2)

Eli Korvigo

Reputation: 10493

If you don't want dictionaries, you can use itertools.groupby:

from itertools import groupby
from operator import itemgetter

with open(...) as lines: 
    parsed_lines = map(lineprocess, lines) # I'm using your `lineprocess`
    second_item = itemgetter(1)
    groups = groupby(sorted(parsed_lines, key=second_item), second_item)
    result = [list(group) for key, group in groups]  # groupby yields (key, group) pairs

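This runs in O(n log n) because of the sort, which is better than your O(n^2) (every `in` and `index` call scans `values`). Still, a dictionary-based solution would be O(n). Note that the groups come out sorted by the second item, not in order of first appearance.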
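For comparison, the dictionary-based approach mentioned above could look roughly like the sketch below. The `lineprocess` here is only a hypothetical stand-in for the asker's helper (assumed to split on whitespace and convert the third field to an int), and `group_by_second` is a name made up for illustration.

from collections import defaultdict

def lineprocess(line):
    # Hypothetical stand-in for the asker's helper:
    # split on whitespace and convert the third field to int.
    first, second, third = line.split()
    return [first, second, int(third)]

def group_by_second(lines):
    # Map each distinct second value to the list of rows that share it.
    groups = defaultdict(list)
    for line in lines:
        row = lineprocess(line)
        groups[row[1]].append(row)
    # Dicts keep insertion order (Python 3.7+), so groups appear
    # in the order their key was first seen.
    return list(groups.values())

sample = ["qwe rty 12", "asd fgh 34", "zxc rty 96"]
result = group_by_second(sample)
print(result)  # [[['qwe', 'rty', 12], ['zxc', 'rty', 96]], [['asd', 'fgh', 34]]]

Each row costs one dictionary lookup and one append, so the whole pass is O(n) on average. To read from a file, pass the open file object instead of `sample`.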

Upvotes: 2

TheoretiCAL

Reputation: 20571

Each element of data should itself be a list of rows, not a single row. When a value is seen for the first time, append [ldata] rather than ldata:

f = open('input_1.txt')
values = [] # to keep track of which values have occurred before
data = []
for line in f:
    ldata = lineprocess(line) #this transforms the raw data to [qwe, rty, 12] etc.
    if ldata[1] in values:
        data[values.index(ldata[1])].append(ldata)
    else:
        values.append(ldata[1])
        data.append([ldata])

Consider:

a = [1, 2, 3]
b = [4, 5, 6]
a.append(b)   # b is nested inside the flat list
print(a) # [1, 2, 3, [4, 5, 6]]
c = [[1, 2, 3]]
c.append(b)   # b becomes a sibling of the first inner list
print(c) # [[1, 2, 3], [4, 5, 6]]

Upvotes: 0
