How to replace repeated consecutive elements in a 2d numpy array with single element

Question

I have a numpy array of shape (1080, 960):

[[0 0 255 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 255 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 255 255 ... 0 0 0]]

I want to output a numpy array in which every run of consecutive repeated values (0 or 255) is collapsed into a single 0 or a single 255.

The numpy array is a representation of a binary image that has pixels in the form BBBWWWWWWWBBBBWWW where B is black and W is white. I want to convert it into BWBW.

Example:

input:

[[0,0,0,255,255,255,0,0,0,0],
 [255,255,255,0,0,0,255,255,255],
 [0,0,255,0,0,255,0,0,255]]

output:

[[0,255,0],
 [255,0,255],
 [0,255,0,255,0,255]]

Brenlla · Accepted Answer

You cannot output a 2D numpy array because the output rows may have different lengths. I would settle for a list of numpy arrays. First, let's generate some data:

import numpy as np
img = np.random.choice([0, 255], size=(1080, 960))

Then iterate over each row:

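out = []
for row in img:
    # Indices where the value changes; to_begin=1 makes index 0 count as a change
    idx = np.ediff1d(row, to_begin=1).nonzero()[0]
    # Keep only the first element of each run
    out.append(row[idx])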

By taking the difference between neighbouring elements we detect where changes take place, and then use those indices idx to select the first element of each consecutive streak. This solution is a bit simpler and faster than the one by @DavidWinder (30 ms vs. 150 ms).
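As a quick check, here is a minimal sketch that wraps the loop in a function and runs it on a small array modelled on the question's example. The function name collapse_runs and the sample array are illustrative assumptions, not part of the original answer:

```python
import numpy as np

def collapse_runs(img):
    """Collapse each run of equal consecutive values in every row.

    Returns a list of 1D arrays, because rows can end up with different lengths.
    (collapse_runs is a hypothetical helper name used for illustration.)
    """
    out = []
    for row in img:
        # Non-zero differences mark positions where the value changes;
        # to_begin=1 makes position 0 always count as the start of a run.
        idx = np.ediff1d(row, to_begin=1).nonzero()[0]
        out.append(row[idx])
    return out

# Small rectangular sample (default integer dtype), based on the question's example.
sample = np.array([
    [0, 0, 0, 255, 255, 255, 0, 0, 0],
    [255, 255, 255, 0, 0, 0, 255, 255, 255],
    [0, 0, 255, 0, 0, 255, 0, 0, 255],
])

result = collapse_runs(sample)
for r in result:
    print(r.tolist())
# [0, 255, 0]
# [255, 0, 255]
# [0, 255, 0, 255, 0, 255]
```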

A fully vectorized solution can be a bit faster, but the code would be more complex. It would involve flattening arrays, raveling and unraveling indices, and applying np.split at the end, which is not a very fast operation because it involves creating a list. So I think this answer is a good enough compromise between speed and code simplicity.
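For reference, one possible shape of such a vectorized version is sketched below. It is an assumption about how this could be done, not necessarily what the answer had in mind: it uses a boolean mask and element comparison instead of index raveling, and the name collapse_runs_vectorized is hypothetical. It still ends with np.split, so it still builds a list:

```python
import numpy as np

def collapse_runs_vectorized(img):
    """Collapse runs in all rows at once; returns a list of 1D arrays.

    (Hypothetical helper, shown only as a sketch of the vectorized idea.)
    """
    # True wherever a run starts: column 0 always starts one, and any later
    # column starts one when it differs from its left neighbour.
    starts = np.ones(img.shape, dtype=bool)
    starts[:, 1:] = img[:, 1:] != img[:, :-1]

    # Boolean indexing returns the selected values in row-major order,
    # so the values of row 0 come first, then row 1, and so on.
    values = img[starts]

    # The number of runs per row gives the cut points between rows.
    counts = starts.sum(axis=1)
    return np.split(values, np.cumsum(counts)[:-1])

sample = np.array([
    [0, 0, 0, 255, 255, 255, 0, 0, 0],
    [255, 255, 255, 0, 0, 0, 255, 255, 255],
    [0, 0, 255, 0, 0, 255, 0, 0, 255],
])

rows = collapse_runs_vectorized(sample)
for r in rows:
    print(r.tolist())
# [0, 255, 0]
# [255, 0, 255]
# [0, 255, 0, 255, 0, 255]
```

Comparing neighbours with != rather than subtracting them means the result does not depend on how the array's dtype handles differences.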

Edit #1

If the preferred output is an array padded with 0s at the end, it is better to create a zeros array and fill it with the values from the out list. First find out which row has the most elements, and create the array:

max_elms = np.max([len(x) for x in out])
arr = np.zeros((1080, max_elms), dtype=np.int32)

Then iterate over out and arr together, filling each row of arr with the corresponding values from out:

for row, data in zip(arr, out):
    row[:len(data)] = data
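Put together, the padding step might look like the following self-contained sketch. The helper name pad_rows, its fill parameter, and the returned lengths array are illustrative assumptions added here, not part of the original answer:

```python
import numpy as np

def pad_rows(rows, fill=0):
    """Stack variable-length 1D arrays into one 2D array, padded on the right.

    (pad_rows and fill are hypothetical names used for illustration.)
    Also returns the original length of each row.
    """
    lengths = np.array([len(r) for r in rows])
    # Allocate the padded array once, sized by the longest row.
    arr = np.full((len(rows), lengths.max()), fill, dtype=np.int32)
    # Each dest is a view into arr, so slice assignment writes in place.
    for dest, data in zip(arr, rows):
        dest[:len(data)] = data
    return arr, lengths

# Stand-in for the list produced by the row loop.
collapsed = [
    np.array([0, 255, 0]),
    np.array([255, 0, 255]),
    np.array([0, 255, 0, 255, 0, 255]),
]

padded, lengths = pad_rows(collapsed)
print(padded.tolist())
# [[0, 255, 0, 0, 0, 0], [255, 0, 255, 0, 0, 0], [0, 255, 0, 255, 0, 255]]
print(lengths.tolist())
# [3, 3, 6]
```

Because 0 is also a valid pixel value, keeping the per-row lengths alongside the padded array is one way to tell padding apart from real data.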
