matthew_obrien
matthew_obrien

Reputation: 77

Iterating Through List of Lists, Maintaining List Structure

Say I have the following list of list of names:

names = [['Matt', 'Matt', 'Paul'], ['Matt']]

I want to return only the "Matts" in the list, but I also want to maintain the list of list structure. So I want to return:

[['Matt', 'Matt'], ['Matt']]

I've something like this, but this will append everthting together in one big list:

matts = [name for namelist in names for name in namelist if name=="Matt"]

I know something like this is possible, but I want to avoid iterating through lists and appending. Is this possible?

names = [['Matt', 'Matt', 'Paul'], ['Matt']]
matts = []
for namelist in names:
    matts_namelist = []
    for name in namelist:
        if name=="Matt":
            matts_namelist.append(name)
        else:
            pass
    matts.append(matts_namelist)
        

Upvotes: 1

Views: 768

Answers (4)

Dani Mesejo
Dani Mesejo

Reputation: 61910

Use a nested list comprehension, as below:

names = [['Matt', 'Matt', 'Paul'], ['Matt']]
res = [[name for name in lst if name == "Matt"] for lst in names]
print(res)

Output

[['Matt', 'Matt'], ['Matt']]

The above nested list comprehension is equivalent to the following for-loop:

res = []
for lst in names:
    res.append([name for name in lst if name == "Matt"])
print(res)

A third alternative functional alternative using filter and partial, is to do:

from operator import eq
from functools import partial

names = [['Matt', 'Matt', 'Paul'], ['Matt']]

eq_matt = partial(eq, "Matt")
res = [[*filter(eq_matt, lst)] for lst in names]
print(res)

Micro-Benchmark

%timeit [[*filter(eq_matt, lst)] for lst in names]
56.3 µs ± 519 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit [[name for name in lst if "Matt" == name] for lst in names]
26.9 µs ± 355 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Setup (of micro-benchmarks)

import random
population = ["Matt", "James", "William", "Charles", "Paul", "John"]
names = [random.choices(population, k=10) for _ in range(50)]

Full Benchmark

Candidates

def nested_list_comprehension(names, needle="Matt"):
    return [[name for name in lst if needle == name] for lst in names]


def functional_approach(names, needle="Matt"):
    eq_matt = partial(eq, needle)
    return [[*filter(eq_matt, lst)] for lst in names]


def count_approach(names, needle="Matt"):
    return [[needle] * name.count(needle) for name in names]

Plot Plot of alternative solutions

The above results were obtained for a list that varies from 100 to 1000 elements where each element is a list of 10 strings chosen at random from a population of 14 strings (names). The code for reproducing the results can be found here. As it can be seen from the plot the most performant solution is the one from @rv.kvetch.

Upvotes: 8

Wizard.Ritvik
Wizard.Ritvik

Reputation: 11612

An alternate way using list.count:

>>> names = [['Matt', 'Matt', 'Paul'], [], ['Matt']]
>>> [name.count('Matt') * ['Matt'] for name in names]
[['Matt', 'Matt'], [], ['Matt']]

You could also try with itertools.repeat:

>>> import itertools
>>> [[*itertools.repeat('Matt', name.count('Matt'))] for name in names]
[['Matt', 'Matt'], [], ['Matt']]

Lastly, as suggested by @DaniMensejo, you could also use the range iterator within a nested list comprehension:

>>> [['Matt' for _ in range(name.count('Matt'))] for name in names]
[['Matt', 'Matt'], [], ['Matt']]

Upvotes: 2

Mortz
Mortz

Reputation: 4879

Use the filter function -

matts = [list(filter(lambda x: x=='Matt', namelist)) for namelist in names]

Upvotes: 0

I'mahdi
I'mahdi

Reputation: 24049

IIUC, you can do this with a nested list like below:

>>> names = [['Matt', 'Matt', 'Paul'], ['Matt']]
>>> [[name for name in lst_name if name=='Matt'] for lst_name in names]
[['Matt', 'Matt'], ['Matt']]

Upvotes: 1

Related Questions