Andrius
Andrius

Reputation: 21118

Python creating tuple groups in list from another list

Let's say I have this data:

data = [1, 2, 3, -4, -5, 3, 2, 4, -2, 5, 6, -5, -1, 1]

I need it to be grouped in another list by tuples. One tuple consists of two lists. One for positive numbers, another for negative. And tuples should be created by checking what kind of number it is. Last negative number (I mean in a row that between negative numbers there were no positive ones) means, other numbers must go into another tuple and when it finds another last negative number, it should create another tuple.

So rules are these: All found numbers are being added into first tuple, when it finds negative number, it still adds it to that tuple, till it finds positive number (it means new tuple must be created).

I think it is easier to show, than to explain. After parsing data, the list should look like this:

l = [([1, 2, 3], [-4, -5]), ([3, 2, 4], [-2]), ([5, 6], [-5, -1]), ([1], [])]

I created a solution, but I wonder if it's quite optimal. Maybe it is possible to write a more elegant one (and I wonder about performance, is there some better way to write such parser with best possible performance:))?

def neighborhood(iterable):
    iterator = iter(iterable)
    prev = None
    item = iterator.next()  # throws StopIteration if empty.
    for next in iterator:
        yield (prev,item,next)
        prev = item
        item = next
    yield (prev,item,None)

l = []    
pos = []
neg = []
for prev, item, next in neighborhood(data):
    if item > 0:
        pos.append(item)
        if not next:
            l.append((pos, neg))
    else:
        neg.append(item)
        if next > 0:
            l.append((pos, neg))
            pos = []
            neg = []
        elif not next:
            l.append((pos, neg))

print l

P.S. if not next part I think can be used only once after main check.

Upvotes: 4

Views: 253

Answers (2)

Alex Riley
Alex Riley

Reputation: 176750

I'd use itertools.groupby to make a list of consecutive tuples containing positive/negative lists first, and then group into consecutive pairs. This can still be done in one pass through the list by taking advantage of generators:

from itertools import groupby, zip_longest

x = (list(v) for k,v in groupby(data, lambda x: x < 0))
l = list(zip_longest(x, x, fillvalue=[]))

This gives l as:

[([1, 2, 3], [-4, -5]), ([3, 2, 4], [-2]), ([5, 6], [-5, -1]), ([1], [])]

A couple of notes on the code above:

  • The initial grouping into positive/negative values is handed to groupby which should be reasonably performant (it's compiled code).

  • The zipping-a-generator method for grouping into pairs is a reasonably common idiom in Python. It's guaranteed to work since zip guarantees than an iterable is consumed from left to right.

  • In Python 2, use izip_longest.

Upvotes: 8

fl00r
fl00r

Reputation: 83680

You could go with O(n) solution which is much less beautiful than @ajcr one but should be more efficient.

def pos_neg(data):
  split = []
  for r in data:
    if len(split) == 0 or (r > 0 and len(split[-1][-1]) > 0):
      split.append(([], []))

    if r < 0:
      split[-1][-1].append(r)
    else:
      split[-1][-2].append(r)

  return split

data = [1, 2, 3, -4, -5, 3, 2, 4, -2, 5, 6, -5, -1, 1]
print pos_neg(data)
#=> [([1, 2, 3], [-4, -5]), ([3, 2, 4], [-2]), ([5, 6], [-5, -1]), ([1], [])]

Upvotes: 0

Related Questions