Reputation: 45
I have a 2D numpy array as follows:
start = np.array([
[1,1,0,1],
[1,0,0,1],
[0,1,0,0]
])
I need a matrix of the same shape, but with each value replaced by the number of cells I can reach by moving one step at a time in any direction, walking only along 1s.
As a result, I should get the following:
finish = np.array([
[4,4,0,2],
[4,0,0,2],
[0,4,0,0]
])
It seems to me that this is a well-known problem, but I have not even figured out how to formulate it in search, since everything that I was looking for is a bit different. What's the best way to do this?
Upvotes: 3
Views: 569
Reputation: 763
You can use scipy.ndimage.label to label connected regions; it also returns the number of regions, as @Mr.T points out. This can then be used to create a boolean mask for indexing and counting.
Credit should go to @Mr.T, who came up with a similar solution first. I am still posting this answer because the second part is different: I find it more readable, and it's 40% faster on my machine.
import numpy as np
from scipy.ndimage import label
a = np.array([[1,1,0,1],
              [1,0,0,1],
              [0,1,0,0]])
# Label connected regions, the second arg defines the connection structure
labeled, n_labels = label(a, np.ones((3,3)))
# Replace label value with the size of the connected region
b = np.zeros_like(labeled)
for i in range(1, n_labels+1):
    target = (labeled==i)
    b[target] = np.count_nonzero(target)
print(b)
output:
[[4 4 0 2]
[4 0 0 2]
[0 4 0 0]]
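To see why the second argument matters here, the following is a small sketch (not part of the original answer) comparing the default structure with a full 3x3 one. It assumes scipy.ndimage.generate_binary_structure(2, 2) as an equivalent way of writing np.ones((3,3)); the names n_edge and n_diag are only illustrative.

```python
import numpy as np
from scipy.ndimage import label, generate_binary_structure

a = np.array([[1, 1, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 0, 0]])

# Default structure: only cells sharing an edge are treated as neighbors
_, n_edge = label(a)

# Full 3x3 structure: diagonal cells are treated as neighbors as well
_, n_diag = label(a, structure=generate_binary_structure(2, 2))

print(n_edge, n_diag)  # 3 2
```

With the default structure, the 1 in the bottom row touches the top-left group only diagonally, so it ends up as its own region. The full structure merges it into that group, which is what the expected result in the question requires.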
Upvotes: 0
Reputation: 12410
You can use the scipy.ndimage labeling function with a customized structure array s:
import numpy as np
from scipy.ndimage import label
start = np.asarray([ [1,1,0,1],
[1,0,0,1],
[0,1,0,0] ])
#structure array: defines which cells count as "neighbors"
s = [[1,1,1],
[1,1,1],
[1,1,1]]
#label blobs in array
labeledarr,_ = label(start, structure=s)
#retrieve blob labels and the number of elements within each blob
blobnr, blobval = np.unique(labeledarr.ravel(), return_counts=True)
#substitute blob label with the number of elements
finish = np.zeros_like(labeledarr)
for k, v in zip(blobnr[1:], blobval[1:]):
    finish[labeledarr==k] = v
print(finish)
Output:
[[4 4 0 2]
[4 0 0 2]
[0 4 0 0]]
I am sure the final step of substituting the label number with the value of its occurrence can be optimized in terms of speed.
And @mad-physicist rightly mentioned that the initially used labeledarr.flat should be substituted by labeledarr.ravel(). The reasons for this are explained here.
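As for speeding up the final substitution step, one possible approach (a sketch only, not benchmarked here) is to drop the Python loop and use the label array itself as an index into a table of per-label counts. This assumes np.bincount, which works because the labels are small non-negative integers; the name counts is just illustrative.

```python
import numpy as np
from scipy.ndimage import label

start = np.asarray([[1, 1, 0, 1],
                    [1, 0, 0, 1],
                    [0, 1, 0, 0]])

labeledarr, _ = label(start, structure=np.ones((3, 3)))

# counts[k] is the number of cells carrying label k
counts = np.bincount(labeledarr.ravel())
# label 0 is the background, which should stay 0 in the result
counts[0] = 0

# look up the count for every cell's label in a single indexing operation
finish = counts[labeledarr]
print(finish)
```

This should print the same array as above. Whether it is actually faster will depend on the array size and the number of blobs.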
Upvotes: 2