dustin

Reputation: 373

Python: split a list of strings into a list of lists of strings by length with nested comprehensions

I've got a list of strings and I'm trying to make a list of lists of strings by string length.

For example,

['a', 'b', 'ab', 'abc'] 

becomes

[['a', 'b'], ['ab'], ['abc']]

I've accomplished this like so:

lst = ['a', 'b', 'ab', 'abc']
lsts = []
for num in set(len(i) for i in lst):
    lsts.append([w for w in lst if len(w) == num])

I'm fine with that code, but I'm trying to wrap my head around comprehensions. I want to use nested comprehensions to do the same thing, but I can't figure out how.

Upvotes: 5

Views: 903

Answers (5)

Jon Clements

Reputation: 142236

from itertools import groupby

mylist = ['a', 'b', 'ab', 'abc']
[list(vals) for key, vals in groupby(mylist, key=len)]

Note that groupby only groups adjacent elements, so you may need to sort mylist first (with key=len).

  • groupby returns an iterator of (key, vals) pairs, where key is the length and vals is another iterator over the items in that key group.
  • list(vals) then converts each group's iterator into a list.
  • the outer list is built from those inner lists.
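
To illustrate the adjacency caveat, here is a minimal sketch using a hypothetical unsorted list (the name words and its contents are my own, not from the question). Without the sort, groupby starts a new group every time the length changes; sorting by length first gives one group per length.

```python
from itertools import groupby

# Hypothetical unsorted input, chosen only to show the adjacency caveat.
words = ['ab', 'a', 'abc', 'b']

# Without sorting: a new group starts each time the length changes.
unsorted_groups = [list(vals) for key, vals in groupby(words, key=len)]

# Sorting by length first puts equal-length strings next to each other.
# sorted() is stable, so 'a' stays ahead of 'b'.
sorted_groups = [list(vals) for key, vals in groupby(sorted(words, key=len), key=len)]

print(unsorted_groups)  # [['ab'], ['a'], ['abc'], ['b']]
print(sorted_groups)    # [['a', 'b'], ['ab'], ['abc']]
```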

Upvotes: 0

Hugh Bothwell

Reputation: 56694

from itertools import groupby

lst = ['a', 'b', 'ab', 'abc']
lst.sort(key=len)  # a no-op for this data, but groupby needs all
                   # strings of a given length to be adjacent

lst = [list(grp) for _, grp in groupby(lst, key=len)]

results in

[['a', 'b'], ['ab'], ['abc']]

Upvotes: 1

Igor Chubin

Reputation: 64613

This builds a list for every length from 1 to the maximum (some of the lists will be empty if a contains no strings of that length):

>>> a = ['a', 'b', 'ab', 'abc']
>>> m = max(len(x) for x in a)
>>> print([[x for x in a if len(x) == i + 1] for i in range(m)])
[['a', 'b'], ['ab'], ['abc']]

But if you want lists only for the lengths that actually occur in a, use set(len(k) for k in a) instead of range:

>>> print([[x for x in a if len(x) == i] for i in set(len(k) for k in a)])
[['a', 'b'], ['ab'], ['abc']]

There is no difference for the list ['a', 'b', 'ab', 'abc']. But if you change it slightly, e.g. to ['a', 'b', 'ab', 'abcd'], you will see the difference:

>>> a = ['a', 'b', 'ab', 'abcd']
>>> print([[x for x in a if len(x) == i] for i in set(len(k) for k in a)])
[['a', 'b'], ['ab'], ['abcd']]

>>> print([[x for x in a if len(x) == i + 1] for i in range(max(len(x) for x in a))])
[['a', 'b'], ['ab'], [], ['abcd']]
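
One caveat with the set version: a set has no guaranteed iteration order, so the groups are not guaranteed to come out shortest first. A minimal sketch that does not rely on that order wraps the lengths in sorted(). The function name split_by_length is hypothetical, used here only for illustration.

```python
def split_by_length(strings):
    """Group strings by length, shortest first, skipping unused lengths."""
    # sorted() makes the order of the groups explicit instead of
    # depending on how the set happens to iterate.
    lengths = sorted(set(len(s) for s in strings))
    return [[s for s in strings if len(s) == n] for n in lengths]

print(split_by_length(['a', 'b', 'ab', 'abcd']))  # [['a', 'b'], ['ab'], ['abcd']]
```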

Upvotes: 0

MartinStettner

Reputation: 29174

L = ['a', 'b', 'ab', 'abc']
result = [[w for w in L if len(w) == n] for n in set(len(i) for i in L)]

Upvotes: 0

Ignacio Vazquez-Abrams

Reputation: 799450

>>> [[w for w in L if len(w) == num] for num in set(len(i) for i in L)]
[['a', 'b'], ['ab'], ['abc']]

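Also, itertools.groupby is likely to be a little more efficient, since it makes one pass over the sorted list instead of rescanning the whole list once per distinct length.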
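
As a point of comparison, here is a sketch of a single-pass grouping that avoids both the rescanning and the sort. Note that it swaps in collections.defaultdict rather than itertools, and it is a plain loop rather than a comprehension; the function name group_by_length is hypothetical.

```python
from collections import defaultdict

def group_by_length(strings):
    """Group strings by length in a single pass over the input."""
    buckets = defaultdict(list)
    for s in strings:
        # Each string is appended to the bucket for its own length.
        buckets[len(s)].append(s)
    # Emit the buckets shortest length first.
    return [buckets[n] for n in sorted(buckets)]

print(group_by_length(['a', 'b', 'ab', 'abc']))  # [['a', 'b'], ['ab'], ['abc']]
```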

Upvotes: 4
