Reputation: 1423
Say I have a scene parsing map for an image, each pixel in this scene parsing map indicates which object this pixel belongs to. Now I want to get bounding box of each object, how can I implement this in python? For a detail example, say I have a scene parsing map like this:
0 0 0 0 0 0 0
0 1 1 0 0 0 0
1 1 1 1 0 0 0
0 0 1 1 1 0 0
0 0 1 1 1 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
So the bounding box is:
0 0 0 0 0 0 0
1 1 1 1 1 0 0
1 0 0 0 1 0 0
1 0 0 0 1 0 0
1 1 1 1 1 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
Actually, in my task, just know the width and height of this object is enough.
A basic idea is to search four edges in the scene parsing map, from top, bottom, left and right direction. But there might be a lot of small objects in the image, this way is not time efficient.
A second way is to calculate the coordinates of all non-zero elements and find the max/min x/y. Then calculate weight and height using these x and y.
Is there any other more efficient way to do this? Thx.
Upvotes: 3
Views: 9517
Reputation: 15071
using numpy:
import numpy as np
ind = np.nonzero(arr.any(axis=0))[0] # indices of non empty columns
width = ind[-1] - ind[0] + 1
ind = np.nonzero(arr.any(axis=1))[0] # indices of non empty rows
height = ind[-1] - ind[0] + 1
a bit more explanation:
arr.any(axis=0)
gives a boolean array telling you if the columns are empty (False
) or not (True
). np.nonzero(arr.any(axis=0))[0]
then extract the non zero (i.e. True
) indices from that array. ind[0]
is the first element of that array, hence the left most column non empty column and ind[-1]
is the last element, hence the right most non empty column. The difference then gives the width, give or take 1 depending on whether you include the borders or not.
Similar stuff for the height but on the other axis.
Upvotes: 3
Reputation: 735
If you are processing images, you can use scipy's ndimage library.
If there is only one object in the image, you can get the measurements with scipy.ndimage.measurements.find_objects (http://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.ndimage.measurements.find_objects.html):
import numpy as np
from scipy import ndimage
a = np.array([[0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0],
[0, 0, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
# Find the location of all objects
objs = ndimage.find_objects(a)
# Get the height and width
height = int(objs[0][0].stop - objs[0][0].start)
width = int(objs[0][1].stop - objs[0][1].start)
If there are many objects in the image, you first have to label each object and then get the measurements:
import numpy as np
from scipy import ndimage
a = np.array([[0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0],
[0, 0, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0]]) # Second object here
# Label objects
labeled_image, num_features = ndimage.label(a)
# Find the location of all objects
objs = ndimage.find_objects(labeled_image)
# Get the height and width
measurements = []
for ob in objs:
measurements.append((int(ob[0].stop - ob[0].start), int(ob[1].stop - ob[1].start)))
If you check ndimage.measurements, you can get more measurements: center of mass, area...
Upvotes: 10