Amistad

Reputation: 7410

Adding extra statements in a python list comprehension

I need to find the first line in a text file that contains a specific string, and then append that line and all lines that follow it to a list. This is how I did it:

with open("input.txt", "r") as file1:  # the with block closes the file
    lines = file1.readlines()
always_print = False
output = []
for line in lines:
    if always_print or "def set" in line:  # "def set" is the string I want
        output.append(line)
        always_print = True

While this works fine, I tried doing the same with a list comprehension. This is what I got:

lines = [ item.strip() for item in open("input.txt")]
always_print = False
output = [item for item in lines if "def set" or print_always in item]

This obviously does not work, as I never set always_print = True when the desired string is found. How do I do that within the list comprehension?

Upvotes: 2

Views: 1403

Answers (3)

zehnpaard

Reputation: 6243

Edit: Correcting my previous, completely incorrect answer:

Not that I would use this in production code, but hacking away at it, you can do it in a list comprehension...

always_print = []
output = [item for item in lines 
          if always_print
          or (always_print.append(1) if ("def set" in item) else None) 
          or "def set" in item]

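To explain: the second of the three conditions always evaluates to None, which is falsy, so it never selects an item by itself. It is there only for its side effect of appending to always_print once "def set" is found.

Slipping further into the dark side, if you didn't want to evaluate "def set" in item twice: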

always_print = []
output = [item for item in lines 
          if always_print
          or (
              (always_print.append(1) or True) if ("def set" in item) else False
             )]
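As a sanity check, the single-evaluation version above can be run against a small in-memory list and compared with the result of the plain for loop from the question. The sample lines and the name expected are made up for illustration.

```python
# Hypothetical sample input standing in for the lines of input.txt.
lines = ["foo", "bar", "def set():", "    spam", "    ham"]

# Reference result: the plain loop from the question.
expected = []
found = False
for line in lines:
    if found or "def set" in line:
        expected.append(line)
        found = True

# The comprehension that relies on the append() side effect.
always_print = []
output = [item for item in lines
          if always_print
          or ((always_print.append(1) or True) if "def set" in item else False)]

print(output == expected)  # True
print(output)              # ['def set():', '    spam', '    ham']
```

Note that always_print ends up as [1]: once it is non-empty, the first condition short-circuits and append() is never called again.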

Edit2

If I describe what's going on here in more detail...

The code above (and Kevin's, mentioned in his comment) uses three separate tricks.

  1. In Python, most objects and values have an associated truth value, e.g. 0 is falsy and any other number is truthy. In the same vein, an empty list is falsy while a non-empty list is truthy.
  2. While variable assignment like a=1 is a statement that does not return any value and cannot be included as part of a list comprehension, a_list.append(x) is a method call and returns None, which is falsy. It also has the side effect of adding the new element x to the end of a_list.
  3. Logical operators like and and or are evaluated from left to right. and stops at the first falsy value and or stops at the first truthy value, which can be used to control whether the list appending gets executed or not based on certain conditions. The ternary operator 'x if y else z' also has an evaluation order: it evaluates 'y' first, then either 'x' or 'z', but never both.
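The three tricks can each be seen in isolation in a short sketch. The helper record() and the names flags and calls are hypothetical, introduced only to make the evaluation order visible.

```python
# Trick 1: truthiness of containers and numbers.
print(bool([]), bool([1]), bool(0), bool(5))  # False True False True

# Trick 2: list.append() mutates the list and returns None.
flags = []
result = flags.append(1)
print(result, flags)  # None [1]

# Trick 3: `or` short-circuits, so the right-hand side only runs
# when the left-hand side is falsy.
calls = []

def record(tag):
    """Hypothetical helper: remember that it was called, then return True."""
    calls.append(tag)
    return True

first = True or record("skipped")   # record() is never called here
second = False or record("ran")     # record() is called here
print(calls)  # ['ran']

# The conditional expression evaluates only one of its two branches.
value = record("yes") if flags else record("no")
print(calls)  # ['ran', 'yes']
```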

As you can see, a set of very roundabout logic tricks that may have their place in ultra-optimized C (or the IOCCC), but not Python. It's possible to replicate your control flow within a list comprehension, but in practical terms, use dropwhile every time.

Upvotes: 3

Lanting

Reputation: 3068

dropwhile is probably the best solution. But if you want something really fancy, have a look at the following:

With list comprehensions you can do simple checks and transformations, but you should think of a list comprehension more as a mathematical mapping. You could, however, give the function that tests whether an item should be included some kind of state, i.e. make your check remember whether something has already occurred. Make it a functor (or function object):

class Drop_before:
  def __init__(self, val):
    self.val = val
    self.always_print = False
  def __call__(self, current_val):
    if current_val == self.val:
      self.always_print = True
    return self.always_print

drop_before_6 = Drop_before(6)
print([x for x in range(10) if drop_before_6(x)])
# using filter (it returns an iterator in Python 3, hence list())
print(list(filter(Drop_before(4), range(10))))

which outputs:

[6, 7, 8, 9]
[4, 5, 6, 7, 8, 9]

Upvotes: 0

Martijn Pieters

Reputation: 1124788

Use itertools.dropwhile() to find the first line that contains your def set:

from itertools import dropwhile

output = list(dropwhile(lambda l: 'def set' not in l, lines))

dropwhile() will skip any entry in lines that doesn't match your test; as soon as it matches it stops testing and simply yields everything from there on out.

dropwhile() returns an iterator; I used list() here to convert it to a list of lines, but you could use it as the basis for another loop too, for example to strip newlines.

Demo:

>>> from itertools import dropwhile
>>> lines = '''\
... foo
... bar
... def set():
...     spam
...     ham
...     eggs
... '''.splitlines(True)
>>> list(dropwhile(lambda l: 'def set' not in l, lines))
['def set():\n', '    spam\n', '    ham\n', '    eggs\n']
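Using the iterator as the basis for another loop might look like the sketch below, which strips the trailing newline from every kept line. io.StringIO and the sample text are assumptions that stand in for open("input.txt") so the snippet runs without a file on disk.

```python
import io
from itertools import dropwhile

# io.StringIO stands in for open("input.txt"); the text is made-up sample data.
sample = io.StringIO("foo\nbar\ndef set():\n    spam\n    ham\n")

# Iterate the file-like object lazily; dropwhile() skips lines until the first match.
tail = dropwhile(lambda l: "def set" not in l, sample)

# Strip the trailing newline from each remaining line in a second pass.
output = [line.rstrip("\n") for line in tail]
print(output)  # ['def set():', '    spam', '    ham']
```

Because both the file object and dropwhile() are lazy, the lines are read only once and no intermediate list of the whole file is built.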

Upvotes: 8
