Most efficient way to split a python 3 list of dicts into two by selecting entries with a particular value

Question

I want to select the entries matching a certain value out of a list of dicts in Python 3. This should result in two lists: a new list with the selected entries, and the original list modified so that it no longer contains them.

Scenario

Assume we have a list of dicts:

import random, sys, time

letters_1 = []
colors = ["red", "orange", "yellow", "green", "blue", "purple"]
for i in range(100000):
    letter = {"color": random.choice(colors), "content": random.randint(0, sys.maxsize)}
    letters_1.append(letter)
letters_2 = list(letters_1)

We want to select all dicts with a certain value for a certain key, collect them into a new list and leave only the others in the initial list. This corresponds to how one would select all red-colored letters out of an actual stack of letters.

Possibilities

This can be done with list comprehensions or with a for loop.

The problem with list comprehensions is that each one creates only a single list. To do what we want, we must therefore go through the list twice: first to copy the selected items into a new list, then to rebuild the original list without the selected items. To continue the script:

time_0 = time.time()
red_letters_1 = [letter for letter in letters_1 if letter["color"]=="red"]
letters_1 = [letter for letter in letters_1 if letter["color"]!="red"]
time_1 = time.time()

The problem with the for loop is that it leads to more convoluted code and that it (surprisingly) takes longer to execute:

time_2 = time.time()
red_letters_2 = []
other_letters_2 = []
for letter in letters_2:
    if letter["color"] == "red":
        red_letters_2.append(letter)
    else:
        other_letters_2.append(letter)
letters_2 = other_letters_2
time_3 = time.time()

print(time_1 - time_0)
print(time_3 - time_2)

Output (elapsed seconds: the two list comprehensions first, then the for loop):

0.011380434036254883
0.015761613845825195
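
A single pair of time.time() readings can be noisy. As a hedged sketch of a more repeatable comparison, the two approaches can be wrapped in functions and timed with the standard timeit module. The function names split_with_comprehensions and split_with_loop, and the number/repeat settings, are illustrative choices and not part of the original script:

```python
import random
import sys
import timeit

colors = ["red", "orange", "yellow", "green", "blue", "purple"]
letters = [
    {"color": random.choice(colors), "content": random.randint(0, sys.maxsize)}
    for _ in range(100000)
]


def split_with_comprehensions(items):
    # Two passes over the input, one new list per pass.
    selected = [item for item in items if item["color"] == "red"]
    remaining = [item for item in items if item["color"] != "red"]
    return selected, remaining


def split_with_loop(items):
    # One pass over the input, appending to one of two lists.
    selected, remaining = [], []
    for item in items:
        if item["color"] == "red":
            selected.append(item)
        else:
            remaining.append(item)
    return selected, remaining


# Neither function mutates its input, so the same list is reused for every run.
for func in (split_with_comprehensions, split_with_loop):
    # Take the best of 3 repeats, each running the function 5 times.
    best = min(timeit.repeat(lambda: func(letters), number=5, repeat=3))
    print(func.__name__, best / 5)
```

Taking the minimum over several repeats reduces the influence of unrelated system activity; the absolute numbers will still depend on the machine and the Python version.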

Note: You can avoid the second list other_letters_2 by iterating over the list backwards and using pop(), but this takes even longer (more than 10 times longer, actually).
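
For reference, a minimal sketch of the backwards pop() variant mentioned in the note, on a smaller list. The variable names here are illustrative, and the explanation for the slowdown is a likely one rather than something measured above: each pop(index) on a Python list has to shift every element after that index, so many removals from the middle add up.

```python
import random

colors = ["red", "orange", "yellow", "green", "blue", "purple"]
# A smaller list than in the question; "content" is just the position here.
letters = [{"color": random.choice(colors), "content": i} for i in range(1000)]
expected_red = sum(1 for letter in letters if letter["color"] == "red")

red_letters = []
# Walk the indices from the end so that removing an item does not shift
# the positions of the items still to be visited.
for index in range(len(letters) - 1, -1, -1):
    if letters[index]["color"] == "red":
        red_letters.append(letters.pop(index))

# The selected items were collected back to front; restore the original order.
red_letters.reverse()
```

After this runs, letters holds only the non-red entries and red_letters holds the red ones, both in their original relative order.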

Question

While the version with two list comprehensions is clearly the fastest of these options, it seems inefficient to traverse the list twice. Is it possible to fold this into one list comprehension (without making it slower)? Is there another, more efficient way? Or is there a reason why things cannot be sped up beyond the version with two list comprehensions?
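
To make "folding into one list comprehension" concrete, here is one possible shape of such a fold, shown on a tiny hand-written list. It is only a sketch: it relies on a side effect inside the comprehension's condition, which many would consider unidiomatic, and no timing is claimed for it here.

```python
letters = [
    {"color": "red", "content": 1},
    {"color": "blue", "content": 2},
    {"color": "red", "content": 3},
    {"color": "green", "content": 4},
]

other_letters = []
# list.append() returns None, which is falsy. For a non-red letter the first
# test fails, so the append runs as a side effect and the letter is then
# filtered out of the comprehension's result.
red_letters = [
    letter
    for letter in letters
    if letter["color"] == "red" or other_letters.append(letter)
]
letters = other_letters
```

The list is traversed only once, but each non-red item still costs a method call, so this is not obviously cheaper than two plain comprehensions.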

Note on related questions

This question has been suggested as a duplicate of this thread, which asks about selecting two subsets of a list using list comprehensions (or for loops). In that case, the only way is to test two different conditions, which can perhaps be shortened, at the expense of some readability, by applying a nested list comprehension as suggested in this answer.

Since (1) this solution to the suggested duplicate is not an option for the present question and (2) (as pointed out by Ev. Kounis) the present question potentially allows for different solutions by modifying the original list in the list comprehension, I submit that this is not a duplicate (not an exact one in any case). I have also clarified this at the beginning of the question.

Python version: 3.6.2
