Reputation: 29

Delete negative elements which are between positives only

a = [1, 3, 6, -2, 4, 5, 8, -3, 9,
     2, -5, -7, -9, 3, 6, -7, -6, 2]

I want to get:

a = [1, 3, 6, 4, 5, 8, 9, 2,
     -5, -7, -9, 3, 6, -7, -6, 2]

This deletes only the 4th and 8th elements, which are single negative elements between two positive elements.

import numpy as np

a = [1, 3, 6, -2, 4, 5, 8, -3, 9,
     2, -5, -7, -9, 3, 6, -7, -6, 2]

for i in range(len(a)):
    if a[i] < 0 and a[i - 1] > 0  and a[i + 1] > 0:
        np.delete(a[i])
print(a)

This did not work. What do I need to fix?

Upvotes: 1

Answers (4)

constantstranger

Reputation: 9379

Because you ask about numpy in the subject line and also attempt to use np.delete() in your code, I assume you intend for a to be a numpy array.

Here is a way to do what your question asks using vectorized operations in numpy:

import numpy as np
a = np.array([1,3,6,-2,4,5,8,-3,9,2,-5,-7,-9, 3, 6, -7, -6, 2])
b = np.concatenate([a[1:], [np.nan]])   # np.nan also works on NumPy 2.x, where the np.NaN alias was removed
c = np.concatenate([[np.nan], a[:-1]])
d = (a<0)&(b>0)&(c>0)
print(a[~d])

Output:

[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]

We shift a one position to the left with a NaN fill on the right (b), and one position to the right with a NaN fill on the left (c). We then build a boolean mask d with the vectorized operators <, > and &; it is True only at the single negative values sandwiched between two positives, which are the ones we want to delete. Because any comparison with NaN is False, the first and last elements are never flagged. Finally, the ~ operator inverts the mask, and indexing a with the inverted mask keeps everything else.
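To make the shifting concrete, here is a minimal sketch on a shorter array. The array `small` and the names `left`, `right` and `mask` are illustrative choices of mine, not part of the question; they play the roles of c, b and d.

```python
import numpy as np

# Hypothetical input, chosen so that only one negative is sandwiched between positives.
small = np.array([1, -2, 3, -4, -5, 6, -7])

right = np.concatenate([small[1:], [np.nan]])   # right[i] holds small[i + 1]
left = np.concatenate([[np.nan], small[:-1]])   # left[i] holds small[i - 1]

# True only where a negative value has a positive value on both sides.
# Comparisons against NaN are False, so the two end elements are never flagged.
mask = (small < 0) & (right > 0) & (left > 0)
result = small[~mask]

print(mask)     # only index 1 is True
print(result)   # -2 is removed; the runs of negatives and the trailing -7 stay
```

Note that concatenating with NaN produces float arrays for the shifted copies, but `result` keeps the integer dtype of `small` because only the mask is used for indexing.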

UPDATE: Based on benchmarking several possible strategies (see below), the following solution appears to be the most performant for this question (credit to @Kelly Bundy for suggesting it in a comment):

a = np.concatenate((a[:1], a[1:-1][(a[1:-1]>=0)|(a[2:]<=0)|(a[:-2]<=0)], a[-1:]))

UPDATE: Here are some timeit() comparisons of several variations on answers given for this question using NumPy 1.22.2.

The fastest of the 8 strategies is: a = np.concatenate([a[:1], a[1:-1][(a[1:-1]>=0)|(a[2:]<=0)|(a[:-2]<=0)], a[-1:]])

A close second is: a = a[np.concatenate([[True], ~((a[1:-1]<0)&(a[2:]>0)&(a[:-2]>0)), [True]])]

The strategies using np.r_[], either with np.delete() or with a boolean mask and [] syntax, are about twice as slow as the fastest.

The strategy using numpy.roll() is about 3 times as slow as the fastest. Note: As highlighted in a comment by @Kelly Bundy, the roll() strategy in the benchmark does not give a correct answer to this question in all cases (though for the particular input example it happens to). I have nevertheless included it in the benchmark because the performance of roll() relative to concatenate() and r_[] may be of general interest beyond the narrow context of this question.
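As an illustration of where the roll() strategy can go wrong, here is a small sketch with a hypothetical input (`edge` is my own example, not from the question). np.roll wraps around, so the last element is treated as the left neighbour of the first one.

```python
import numpy as np

# Hypothetical input: the leading -1 has no left neighbour, so it should be kept.
edge = np.array([-1, 2, 3])

# np.roll(edge, 1) is [3, -1, 2]: the last element wraps around to position 0.
rolled_mask = (edge < 0) & (np.roll(edge, 1) >= 0) & (np.roll(edge, -1) >= 0)
rolled_result = edge[~rolled_mask]

# The slicing strategy only ever tests interior elements, so both ends survive.
interior = edge[1:-1]
keep = (interior >= 0) | (edge[2:] <= 0) | (edge[:-2] <= 0)
sliced_result = np.concatenate([edge[:1], interior[keep], edge[-1:]])

print(rolled_result)   # the leading -1 has been dropped
print(sliced_result)   # the input is returned unchanged
```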

Results:

foo_1 output:
[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
foo_2 output:
[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
foo_3 output:
[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
foo_4 output:
[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
foo_5 output:
[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
foo_6 output:
[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
foo_7 output:
[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
foo_8 output:
[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
Timeit results:
foo_1 ran in 1.2354546000715346e-05 seconds using 100000 iterations
foo_2 ran in 1.0962473000399769e-05 seconds using 100000 iterations
foo_3 ran in 7.733614000026136e-06 seconds using 100000 iterations
foo_4 ran in 7.751871000509709e-06 seconds using 100000 iterations
foo_5 ran in 5.856722998432815e-06 seconds using 100000 iterations
foo_6 ran in 7.5727709988132115e-06 seconds using 100000 iterations
foo_7 ran in 1.7790602000895887e-05 seconds using 100000 iterations
foo_8 ran in 5.435103999916464e-06 seconds using 100000 iterations

Code that generated the results:

import numpy as np

a = np.array([1,3,6,-2,4,5,8,-3,9,2,-5,-7,-9, 3, 6, -7, -6, 2])
from timeit import timeit
def foo_1(a):
    a = a if a.shape[0] < 2 else np.delete(a, np.r_[False, (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0), False])
    return a
def foo_2(a):
    a = a if a.shape[0] < 2 else a[np.r_[True, ~((a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0)), True]]
    return a
def foo_3(a):
    b = np.concatenate([a[1:], [np.nan]])
    c = np.concatenate([[np.nan], a[:-1]])
    d = (a<0)&(b>0)&(c>0)
    a = a[~d]
    return a
def foo_4(a):
    a = a[~((a<0)&(np.concatenate([a[1:], [np.nan]])>0)&(np.concatenate([[np.nan], a[:-1]])>0))]
    return a
def foo_5(a):
    a = a if a.shape[0] < 2 else a[np.concatenate([[True], ~((a[1:-1]<0)&(a[2:]>0)&(a[:-2]>0)), [True]])]
    return a
def foo_6(a):
    a = a if a.shape[0] < 2 else np.delete(a, np.concatenate([[False], (a[1:-1]<0)&(a[2:]>0)&(a[:-2]>0), [False]]))
    return a
def foo_7(a):
    mask_bad = (
       (a < 0) &  # the value is < 0 AND
       (np.roll(a,1) >= 0) & # the value to the left is >= 0 (wraps around at index 0) AND
       (np.roll(a,-1) >= 0) # the value to the right is >= 0 (wraps around at the last index)
    )
    mask_good = ~mask_bad
    a = a[mask_good]
    return a
def foo_8(a):
    a = np.concatenate([a[:1], a[1:-1][(a[1:-1]>=0)|(a[2:]<=0)|(a[:-2]<=0)], a[-1:]])
    return a

foo_count = 8
for foo in ['foo_' + str(i + 1) for i in range(foo_count)]:
    print(f'{foo} output:')
    print(eval(f"{foo}(a)"))

n = 100000
print(f'Timeit results:')
for foo in ['foo_' + str(i + 1) for i in range(foo_count)]:
    t = timeit(f"{foo}(a)", setup=f"from __main__ import a, {foo}", number=n) / n
    print(f'{foo} ran in {t} seconds using {n} iterations')

Upvotes: 2

OTheDev

Reputation: 2967

A one-liner is

a[1:-1] = [a[i] for i in range(1, len(a) - 1) if not (a[i] < 0 and a[i-1] > 0 and a[i+1] > 0)]

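The above keeps every interior element of the list a except those that are negative and both preceded and followed by a positive number, and assigns the result back to the slice a[1:-1], so the first and last elements are left untouched.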

Output (from printing a)

[1, 3, 6, 4, 5, 8, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2]
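For readers unfamiliar with slice assignment, here is a minimal sketch of the same idea on a hypothetical shorter list (`nums` and `alias` are names I made up for the illustration). The right-hand side is built completely before the assignment happens, so the comprehension always reads the original values, and the list is modified in place even though its length changes.

```python
# Hypothetical input: only the -1 at index 1 sits between two positives.
nums = [5, -1, 7, -2, -3, 8, -4]
alias = nums  # second reference, to show that the list object itself is modified

# Rebuild the interior (indices 1 .. len-2) and assign it back to the slice.
nums[1:-1] = [
    nums[i]
    for i in range(1, len(nums) - 1)
    if not (nums[i] < 0 and nums[i - 1] > 0 and nums[i + 1] > 0)
]

print(nums)            # the -1 is gone; the first and last elements are untouched
print(alias is nums)   # True: same list object, now one element shorter
```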

Timings

The below timings compare my approach above to the fastest approach in @constantstranger's answer:

a = np.concatenate([a[:1], a[1:-1][(a[1:-1]>=0)|(a[2:]<=0)|(a[:-2]<=0)], a[-1:]])

My suggested approach is obviously optimized for the case where you want both the input and output to be a list. However, even in suboptimal (for my approach) input/output configurations, for this input, my approach appears to be faster than the numpy approach.

Input/Output Configuration 1

Input is a list (as in your question).
Output is a numpy array.

In [3]: %%timeit
   ...: a = [1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2]
   ...: a[1:-1] = [a[i] for i in range(1, len(a) - 1) if not (a[i] < 0 and a[i-1] > 0 and a[i+1] > 0)]
   ...: a = np.array(a)
   ...: 
   ...: 
3.08 µs ± 17.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [5]: %%timeit
   ...: a = np.array([1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2])
   ...: a = np.concatenate([a[:1], a[1:-1][(a[1:-1]>=0)|(a[2:]<=0)|(a[:-2]<=0)], a[-1:]])
   ...: 
   ...: 
6.66 µs ± 16.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Input/Output Configuration 2

Input and output are both numpy arrays (as assumed in other answers).

The input

a = np.array([1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2])

Timings:

In [3]: %%timeit
   ...: b = a.tolist()
   ...: b[1:-1] = [b[i] for i in range(1, len(b) - 1) if not (b[i] < 0 and b[i-1] > 0 and b[i+1] > 0)]
   ...: b = np.array(b)
   ...: 
   ...: 
3.1 µs ± 10.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [5]: %%timeit
   ...: b = np.concatenate([a[:1], a[1:-1][(a[1:-1]>=0)|(a[2:]<=0)|(a[:-2]<=0)], a[-1:]])
   ...: 
   ...: 
4.8 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Remarks

The above holds for this specific input. A larger input size may have different results (particularly due to the conversion between types). I would be happy to provide timings that vary the input size (presented graphically). However, it would be useful to know whether you want the input or output to be a list or a numpy array.
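If you want to check the effect of input size yourself, a rough sketch along the following lines might do. The function names, the random test data and the chosen sizes are my own assumptions for illustration; I make no claim about what the numbers will look like on your machine.

```python
import numpy as np
from timeit import timeit

def list_version(a):
    # Convert to a list, filter the interior with a comprehension, convert back.
    b = a.tolist()
    b[1:-1] = [b[i] for i in range(1, len(b) - 1)
               if not (b[i] < 0 and b[i - 1] > 0 and b[i + 1] > 0)]
    return np.array(b)

def numpy_version(a):
    # Vectorized filter on the interior; the end elements are always kept.
    keep = (a[1:-1] >= 0) | (a[2:] <= 0) | (a[:-2] <= 0)
    return np.concatenate([a[:1], a[1:-1][keep], a[-1:]])

rng = np.random.default_rng(0)   # fixed seed so runs are repeatable
reps = 10
for size in (10, 1_000, 10_000):  # arbitrary sizes, adjust as needed
    data = rng.integers(-9, 10, size=size)
    # Sanity check: both approaches must agree before timing them.
    assert np.array_equal(list_version(data), numpy_version(data))
    t_list = timeit(lambda: list_version(data), number=reps) / reps
    t_numpy = timeit(lambda: numpy_version(data), number=reps) / reps
    print(f"size={size}: list {t_list:.2e} s, numpy {t_numpy:.2e} s")
```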

Upvotes: 0

Mad Physicist

Reputation: 114390

A solution that handles edges correctly and doesn't create an unholy number of temporary arrays:

a = np.delete(a, np.r_[False, (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0), False])

Alternatively, you can create the positive rather than the negative mask

a = a[np.r_[True, (a[1:-1] >= 0) | (a[:-2] <= 0) | (a[2:] <= 0), True]]

Since np.concatenate is faster than np.r_, you could rephrase the masks as

np.concatenate(([False], (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0), [False]))

and

np.concatenate(([True], (a[1:-1] >= 0) | (a[:-2] <= 0) | (a[2:] <= 0), [True]))

In some cases, you might get extra mileage out of applying np.where(...)[0] or np.flatnonzero to the mask. This sometimes helps because it avoids having to count the masked elements twice.
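A minimal sketch of the np.flatnonzero variant, using the array from the question. The names `interior` and `drop_idx` are illustrative, and whether this is actually faster will depend on the input, so treat it as something to measure rather than a guaranteed win.

```python
import numpy as np

a = np.array([1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2])

# Mask over the interior elements a[1:-1] only; the ends are never candidates.
interior = (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0)

# Convert the mask to integer positions; +1 maps interior offsets back to indices of a.
drop_idx = np.flatnonzero(interior) + 1

result = np.delete(a, drop_idx)
print(drop_idx)
print(result)
```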

Upvotes: 1

fabda01

Reputation: 3763

Your conditional logic

if a[i] < 0 and a[i - 1] > 0 and a[i + 1] > 0

seems sound and readable to me. But it would have issues with the boundary cases:

[1, 2, -3] -> IndexError: list index out of range
[-1, 2, 3] -> [2, 3]
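Both failures come from how Python resolves list indices. A short sketch, using the two boundary inputs above (the variable names are only for illustration):

```python
first_negative = [-1, 2, 3]
# At i = 0, a[i - 1] is a[-1], which Python resolves to the LAST element (3),
# so the leading -1 looks as if it were sitting between two positives.
wraps_to_last = first_negative[0 - 1] == first_negative[len(first_negative) - 1]

last_negative = [1, 2, -3]
# At the last index, a[i + 1] is one past the end and raises IndexError.
try:
    last_negative[len(last_negative)]
    raised = False
except IndexError:
    raised = True

print(wraps_to_last, raised)
```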

Handling it properly could be as simple as skipping the first and last elements of your list with

for i in range(1, len(a) - 1)

Test

import numpy as np


def del_neg_between_pos(a):
    delete_idx = []
    for i in range(1, len(a) - 1):
        if a[i] < 0 and a[i - 1] > 0 and a[i + 1] > 0:
            delete_idx.append(i)

    return np.delete(a, delete_idx)


if __name__ == "__main__":
    a1 = [1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2]
    a2 = [1, 2, -3]
    a3 = [-1, 2, 3]
    for a in [a1, a2, a3]:
        print(del_neg_between_pos(a))

Output

[ 1  3  6  4  5  8  9  2 -5 -7 -9  3  6 -7 -6  2]
[ 1  2 -3]
[-1  2  3]

Upvotes: 0
