pistacchio

Reputation: 58863

Temporary variable within list comprehension

I quite often end up with a piece of code that looks like this:

raw_data  = [(s.split(',')[0], s.split(',')[1]) for s in all_lines if s.split(',')[1] != '"NaN"']

Is there a way to create a temporary variable, say splitted_s, inside the comprehension so that I don't have to repeat the same operation on the loop variable (in this case, splitting it three times)?

Upvotes: 23

Views: 12137

Answers (3)

Xavier Guihot

Reputation: 61656

Starting with Python 3.8, assignment expressions (PEP 572, the := operator) make it possible to bind a local variable inside a list comprehension and so avoid evaluating the same expression more than once.

In our case, we can bind the result of line.split(',') to the name parts inside the filter, which checks that parts[1] is not equal to NaN, and then reuse parts to produce the mapped value:

# lines = ['1,2,3,4', '5,NaN,7,8']
[(parts[0], parts[1]) for line in lines if (parts := line.split(','))[1] != 'NaN']
# [('1', '2')]
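Applied to the data format from the question, where the marker is the quoted string '"NaN"', the same idea can be sketched as follows. The sample all_lines below is made up for illustration, and the code requires Python 3.8 or later.

```python
# Hypothetical sample data in the question's format (field 2 may be the quoted string "NaN").
all_lines = ['a,0.25', 'b,"NaN"', 'c,0.75,extra']

# The filter clause runs first for each line, so `parts` is already bound
# when the output expression (parts[0], parts[1]) is evaluated.
raw_data = [
    (parts[0], parts[1])
    for s in all_lines
    if (parts := s.split(','))[1] != '"NaN"'
]

print(raw_data)  # [('a', '0.25'), ('c', '0.75')]
```

Note that a name bound with := inside a comprehension is assigned in the enclosing scope, so parts remains defined (holding the last split result) after the comprehension finishes.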

Upvotes: 20

myaut

Reputation: 11494

If you have two processing steps (split, then filter), you can nest a second list comprehension that does the split once per line:

raw_data  = [(lhs, rhs) 
            for lhs, rhs 
            in [s.split(',')[:2] for s in all_lines]
            if rhs != '"NaN"']

You can use a generator expression for the inner loop instead, which avoids building the intermediate list (and gives a small performance gain too):

            in (s.split(',')[:2] for s in all_lines)
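Put together, the generator variant might look like the sketch below. The sample all_lines is an assumption made up for illustration; it is not data from the question.

```python
# Hypothetical sample data: two comma-separated fields per line.
all_lines = ['a,0.25', 'b,"NaN"', 'c,0.75']

# The inner generator splits each line exactly once and yields its first
# two fields; the outer comprehension unpacks and filters them.
raw_data = [(lhs, rhs)
            for lhs, rhs
            in (s.split(',')[:2] for s in all_lines)
            if rhs != '"NaN"']

print(raw_data)  # [('a', '0.25'), ('c', '0.75')]
```

Because each item is unpacked into lhs and rhs, a line with no comma raises ValueError, so this form assumes every line has at least two fields.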

The nested version is also faster than your original implementation, which splits every line three times:

import timeit

setup = '''import random, string;
all_lines = [','.join((random.choice(string.ascii_letters),
                    str(random.random() if random.random() > 0.3 else '"NaN"')))
                    for i in range(10000)]'''
oneloop = '''[(s.split(',')[0], s.split(',')[1]) 
              for s in all_lines if s.split(',')[1] != '"NaN"']'''
twoloops = '''raw_data  = [(lhs, rhs) 
                for lhs, rhs 
                in [s.split(',') for s in all_lines]
                if rhs != '"NaN"']'''

timeit.timeit(oneloop, setup, number=1000)  # 7.77 secs
timeit.timeit(twoloops, setup, number=1000) # 4.68 secs

Upvotes: 15

Mariy

Reputation: 5914

You can't.

A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it.

(Quoted from the Python tutorial's section on list comprehensions.)

An assignment statement (=) in Python is not an expression, so it cannot appear inside a list comprehension.

As Padraic Cunningham comments, if you need to split the string multiple times, don't do it in a list comprehension.
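Outside a comprehension, an ordinary loop can hold the split result in a plain variable. A minimal sketch, assuming made-up sample data in the question's format:

```python
# Hypothetical sample data: two comma-separated fields per line.
all_lines = ['a,0.25', 'b,"NaN"', 'c,0.75']

raw_data = []
for s in all_lines:
    splitted_s = s.split(',')      # split once, reuse below
    if splitted_s[1] != '"NaN"':   # skip rows whose second field is the NaN marker
        raw_data.append((splitted_s[0], splitted_s[1]))

print(raw_data)  # [('a', '0.25'), ('c', '0.75')]
```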

Upvotes: 0
