jbplasma

Reputation: 385

Separate groups of array elements separated by Nones

I have a NumPy object array (np.array(<>, dtype=object)) of floats in which groups of values are separated by runs of None. For example:

Event = np.array([1., 2., None, None, None, 32., 43., None], dtype = object)

The groups have no regular size, so they can't be extracted with a fixed slice such as Event[:2:] or anything of that sort. I want to separate these groups of floats into their own arrays/lists, so that I have:

Event_group1 = [1., 2.]
Event_group2 = [32., 43.]

The only solution I could think of is to loop over the elements one by one: append floats to a list, stop when a None is reached, and start a new list once the next float appears. But this seems inefficient.
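For reference, the element-by-element approach described above can be sketched as a single pass over the data, which visits each element exactly once. The function name split_on_none is illustrative and not from the original post, and a plain list stands in for the object array:

```python
def split_on_none(values):
    """Collect runs of non-None items into separate lists in one pass."""
    groups = []
    current = []
    for value in values:
        if value is None:
            # A None closes the current run, if one is open.
            if current:
                groups.append(current)
                current = []
        else:
            current.append(value)
    # Flush a trailing run that is not followed by a None.
    if current:
        groups.append(current)
    return groups


event = [1., 2., None, None, None, 32., 43., None]
print(split_on_none(event))  # [[1.0, 2.0], [32.0, 43.0]]
```

The same function accepts the NumPy object array directly, since it only iterates over its argument.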

How would I do this in a Pythonic way?

I can't do much with the object array directly, so as a start I can turn it into a boolean array that marks the non-None entries:

Event_bool = np.array([e is not None for e in Event])
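One way the boolean-mask idea could be carried through, as a rough sketch: mark the non-None entries, find where the mask flips, and slice between those positions. The mask here is built with an explicit `is not None` test, and the names mask, edges, starts and stops are illustrative, not from the original post:

```python
import numpy as np

event = np.array([1., 2., None, None, None, 32., 43., None], dtype=object)

# True where the element is a real value, False where it is None.
mask = np.array([e is not None for e in event], dtype=bool)

# Pad with False on both sides so every run has a detectable start and end.
padded = np.concatenate(([False], mask, [False]))

# Positions where the mask flips; they alternate start, stop, start, stop, ...
edges = np.flatnonzero(padded[1:] != padded[:-1])
starts, stops = edges[::2], edges[1::2]

# Slice each run out of the original array (stop index is exclusive).
groups = [event[a:b].tolist() for a, b in zip(starts, stops)]
print(groups)  # [[1.0, 2.0], [32.0, 43.0]]
```

Building the mask still loops over the elements in Python, so this is unlikely to beat a plain loop on an object array; it mainly shows how run boundaries can be recovered from a mask.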

Upvotes: 0

Views: 42

Answers (1)

Kraigolas

Reputation: 5590

Vanilla Solution

Thanks to @hilberts_drinking_problem's comment, here is a standard-library solution using itertools.groupby:

from itertools import groupby
# Group consecutive elements by "is None" and keep only the non-None runs.
x = [list(gp) for is_none, gp in groupby(Event, key=lambda v: v is None) if not is_none]
# [[1.0, 2.0], [32.0, 43.0]]
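To see why the filter on the key is needed, it may help to look at what groupby yields before filtering. The sketch below uses a plain list in place of the object array, and the variable names are illustrative:

```python
from itertools import groupby

event = [1., 2., None, None, None, 32., 43., None]

# groupby yields one (key, group) pair per run of consecutive equal keys.
runs = [(is_none, list(group))
        for is_none, group in groupby(event, key=lambda v: v is None)]

for is_none, items in runs:
    print(is_none, items)
# False [1.0, 2.0]
# True [None, None, None]
# False [32.0, 43.0]
# True [None]
```

Each run of Nones arrives as a single group with key True, so dropping those groups leaves exactly the float runs, however many Nones separate them.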

Another Solution

Alternatively, use split_at from the third-party more_itertools package:

import numpy as np
from more_itertools import split_at
Event = np.array([1., 2., None, None, None, 32., 43., None], dtype=object)
x = [group for group in split_at(Event, lambda v: v is None) if group]
# [[1.0, 2.0], [32.0, 43.0]]

Here, split_at splits at every None, so consecutive Nones (and a trailing None) produce empty lists. The if group clause filters those empty lists out of the result.

Performance Comparison

Note that the vanilla solution is also slightly faster than the more_itertools version on this input: about 1.8 seconds versus 2.1 seconds for 1,000,000 runs.

import timeit
timeit.timeit("x = [group for group in split_at(Event, lambda x : x == None) if group]",
              setup="""
import numpy as np
from more_itertools import split_at
Event = np.array([1., 2., None, None, None, 32., 43., None], dtype = object)
""", number = 1000000)
# 2.097 seconds

timeit.timeit("x = [list(gp) for b, gp in groupby(Event, key=lambda x: x == None) if not b]",
              setup="""
import numpy as np
from itertools import groupby
Event = np.array([1., 2., None, None, None, 32., 43., None], dtype = object)
""", number = 1000000)
# 1.824 seconds

Upvotes: 2
