armandino

Reputation: 18528

Python - slice array until certain condition is met

I need to slice an array from a given index until a certain condition is met.

>>> a = numpy.zeros((10), dtype='|S1')
>>> a[2] = 'A'
>>> a[4] = 'X'
>>> a[8] = 'B'
>>> a
array(['', '', 'A', '', 'X', '', '', '', 'B', ''], dtype='|S1')

For instance, for the above array I want the subset that extends from a given index in both directions, stopping just before the first non-zero (non-empty) value on each side. For example, for index values 2, 4 and 8 the results would be:

['', '', 'A', '']      # 2
['', 'X', '', '', '']  # 4
['', '', '', 'B', '']  # 8

Any suggestions on the simplest way to do this using the numpy API? I'm learning Python and numpy and would appreciate any help. Thanks!

Upvotes: 5

Views: 5592

Answers (5)

dagoof

Reputation: 1139

Note that this could be cleanly done in pure python using itertools and functools.

import functools, itertools
arr = ['', '', 'A', '', 'X', '', '', '', 'B', '']

f = functools.partial(itertools.takewhile, lambda x: not x)
def g(a, i):
    return itertools.chain(f(reversed(a[:i])), [a[i]], f(a[i+1:]))

We define f as the sub-iterator that yields elements until one evaluates as true, and g as the combination of applying f to the reversed part of the list before the index and to the part of the list after the index.

g returns an iterator, which can be converted to a list that contains our result.

>>> list(g(arr, 2))
['', '', 'A', '']
>>> list(g(arr, 4))
['', 'X', '', '', '']
>>> list(g(arr, 8))
['', '', '', 'B', '']

Upvotes: 2

Andrea Zonca

Reputation: 8773

This is a job for masked arrays; numpy.ma has lots of functions for working with subsets.

import numpy as np
a = np.zeros((10), dtype=str)
a[2] = 'A'
a[4] = 'X'
a[8] = 'B'

Let's mask out the non-empty elements:

am=np.ma.masked_where(a!='', a)

np.ma.notmasked_contiguous goes through the array (very efficiently) and finds all the slices of contiguous elements where the array is not masked:

slices = np.ma.notmasked_contiguous(am)
[slice(0, 1, None), slice(3, 3, None), slice(5, 7, None), slice(9, 9, None)]

So, for example, the array is contiguously empty between elements 5 and 7. Now you only have to join the slices you are interested in. First, get the starting index of each slice:

slices_start = np.array([s.start for s in slices])

then you get the location of the index you are looking for:

slices_start.searchsorted(4) #4
Out: 2

So you want slices 1 and 2:

a[slices[1].start:slices[2].stop+1]
Out: array(['', 'X', '', '', ''], dtype='|S1')

Or let's try 8:

i = slices_start.searchsorted(8)
a[slices[i-1].start:slices[i].stop+1]
Out: array(['', '', '', 'B', ''], 
  dtype='|S1')

You should probably play a bit with this in ipython to understand it better.
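The steps above can be wrapped in a small helper. This is only a sketch: the name empty_run_around is made up for illustration, and it assumes a recent NumPy in which notmasked_contiguous returns half-open slices (stop is exclusive). The output shown earlier has inclusive stops, which presumably came from an older release, so no +1 is needed here.

```python
import numpy as np

def empty_run_around(a, k):
    """Return a[k] together with the empty runs that touch it on both sides.

    Hypothetical helper; assumes half-open slices from notmasked_contiguous.
    """
    # Mask the non-empty elements so that only empty ones count as "not masked".
    am = np.ma.masked_where(a != '', a)
    # List of slices covering each contiguous run of empty elements
    # (fall back to an empty list if nothing is returned).
    runs = np.ma.notmasked_contiguous(am) or []

    start, stop = k, k + 1
    for s in runs:
        if s.stop == k:        # run that ends right before index k
            start = s.start
        if s.start == k + 1:   # run that begins right after index k
            stop = s.stop
    return a[start:stop]

a = np.zeros(10, dtype=str)
a[2] = 'A'
a[4] = 'X'
a[8] = 'B'
print(empty_run_around(a, 4))
```

Matching on the run boundaries instead of using searchsorted avoids index arithmetic when there is no empty run on one side of k.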

Upvotes: 6

Paul

Reputation: 43620

If you set up your problem like this:

import numpy
a = numpy.zeros((10), dtype=str)
a[2] = 'A'
a[4] = 'X'
a[8] = 'B'

You can easily get the indices of non-empty strings like so:

i = numpy.where(a!='')[0]  # array([2, 4, 8])

Alternatively, numpy.argwhere(..) also works well.

Then you can slice away using this array:

out2 = a[:i[1]]        # 2   ['' '' 'A' '']
out4 = a[i[0]+1:i[2]]  # 4   ['' 'X' '' '' '']

etc.
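To avoid writing out each slice by hand, the same idea can be generalised to any index. This is a rough sketch, not the only way to do it: the function name slice_between_neighbours is invented here, and it uses numpy.flatnonzero in place of numpy.where(...)[0].

```python
import numpy as np

def slice_between_neighbours(a, k):
    """Return a[k] plus the empty entries around it, stopping just before
    the nearest non-empty neighbour on each side (hypothetical helper)."""
    idx = np.flatnonzero(a != '')   # positions of the non-empty strings
    left = idx[idx < k]             # non-empty positions before k
    right = idx[idx > k]            # non-empty positions after k
    start = left[-1] + 1 if left.size else 0
    stop = right[0] if right.size else len(a)
    return a[start:stop]

a = np.zeros(10, dtype=str)
a[2] = 'A'
a[4] = 'X'
a[8] = 'B'
for k in (2, 4, 8):
    print(k, slice_between_neighbours(a, k))
```

If there is no non-empty neighbour on one side, the slice simply runs to that end of the array.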

Upvotes: 7

pwdyson

Reputation: 1177

def getSlice(a, n):
    try:
        startindex = a[:n].nonzero()[0][-1] + 1  # start just after the previous non-empty element
    except IndexError:
        startindex = 0
    try:
        endindex = a[(n+1):].nonzero()[0][0] + n+1
    except IndexError:
        endindex = len(a)
    return a[startindex: endindex]

Upvotes: -2

Mike M. Lin

Reputation: 10072

Two loops are the first thing that comes to mind. Something like this would work:

def getNoneSlice(a, i):
    '''Given an array and an index...'''

    # get the first non-None index before i
    start = 0
    for j in range(i - 1, -1, -1):  # use xrange on Python 2
        if a[j] is not None: # or whatever condition
            start = j + 1
            break

    # get the first non-None index after i
    end = len(a) - 1
    for j in range(i + 1, len(a)):  # use xrange on Python 2
        if a[j] is not None: # or whatever condition
            end = j - 1
            break

    # return the slice
    return a[start:end + 1]
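For the array in the question the "whatever condition" would be a non-empty string instead of a non-None value. One way to make the condition pluggable is to pass it in as a function. This is just a sketch: slice_until and stop_here are names made up for this example.

```python
def slice_until(a, i, stop_here):
    """Grow a window around index i in both directions until stop_here(element)
    is true for the next element on that side (hypothetical helper)."""
    # walk left while the element before the window does not satisfy the condition
    start = i
    while start > 0 and not stop_here(a[start - 1]):
        start -= 1

    # walk right while the element after the window does not satisfy the condition
    end = i + 1
    while end < len(a) and not stop_here(a[end]):
        end += 1

    return a[start:end]

arr = ['', '', 'A', '', 'X', '', '', '', 'B', '']
print(slice_until(arr, 4, lambda x: x != ''))
```

It works on plain lists as well as numpy arrays, since it only relies on indexing, len() and slicing.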

Upvotes: 0
