Reputation: 6764
I have two numpy arrays that are OpenCV convex hulls and I want to check for intersection without creating for loops or creating images and performing numpy.bitwise_and
on them, both of which are quite slow in Python. The arrays look like this:
[[[x1 y1]]
[[x2 y2]]
[[x3 y3]]
...
[[xn yn]]]
Considering [[x1 y1]] as one single element, I want to perform intersection between two numpy ndarrays. How can I do that? I have found a few questions of similar nature, but I could not figure out the solution to this from there.
Upvotes: 6
Views: 18601
Reputation: 19628
inspired by jiterrace's answer
I came across this post while working with Udacity deep learning class(trying to find the overlap between training and test data).
I am not familiar with "view" and found the syntax a bit hard to understand, probably the same when I try to communicate to my friends who think in "table". My approach is basically to flatten/reshape the ndarray of shape (N, X, Y) into shape (N, X*Y, 1).
print(train_dataset.shape)
print(test_dataset.shape)
#(200000L, 28L, 28L)
#(10000L, 28L, 28L)
1). INNER JOIN (easier to understand, slow)
import pandas as pd
%%timeit -n 1 -r 1
def multidim_intersect_df(arr1, arr2):
p1 = pd.DataFrame([r.flatten() for r in arr1]).drop_duplicates()
p2 = pd.DataFrame([r.flatten() for r in arr2]).drop_duplicates()
res = p1.merge(p2)
return res
inters_df = multidim_intersect_df(train_dataset, test_dataset)
print(inters_df.shape)
#(1153, 784)
#1 loop, best of 1: 2min 56s per loop
2). SET INTERSECTION (fast)
%%timeit -n 1 -r 1
def multidim_intersect(arr1, arr2):
arr1_new = arr1.reshape((-1, arr1.shape[1]*arr1.shape[2])) # -1 means row counts are inferred from other dimensions
arr2_new = arr2.reshape((-1, arr2.shape[1]*arr2.shape[2]))
intersected = set(map(tuple, arr1_new)).intersection(set(map(tuple, arr2_new))) # list is not hashable, go tuple
return list(intersected) # in shape of (N, 28*28)
inters = multidim_intersect(train_dataset, test_dataset)
print(len(inters))
# 1153
#1 loop, best of 1: 34.6 s per loop
Upvotes: -1
Reputation: 97271
you can use http://pypi.python.org/pypi/Polygon/2.0.4, here is an example:
>>> import Polygon
>>> a = Polygon.Polygon([(0,0),(1,0),(0,1)])
>>> b = Polygon.Polygon([(0.3,0.3), (0.3, 0.6), (0.6, 0.3)])
>>> a & b
Polygon:
<0:Contour: [0:0.60, 0.30] [1:0.30, 0.30] [2:0.30, 0.60]>
To convert the result of cv2.findContours to Polygon point format, you can:
points1 = contours[0].reshape(-1,2)
This will convert the shape from (N, 1, 2) to (N, 2)
Following is a full example:
import Polygon
import cv2
import numpy as np
from scipy.misc import bytescale
y, x = np.ogrid[-2:2:100j, -2:2:100j]
f1 = bytescale(np.exp(-x**2 - y**2), low=0, high=255)
f2 = bytescale(np.exp(-(x+1)**2 - y**2), low=0, high=255)
c1, hierarchy = cv2.findContours((f1>120).astype(np.uint8),
cv2.cv.CV_RETR_EXTERNAL,
cv2.CHAIN_APPROX_SIMPLE)
c2, hierarchy = cv2.findContours((f2>120).astype(np.uint8),
cv2.cv.CV_RETR_EXTERNAL,
cv2.CHAIN_APPROX_SIMPLE)
points1 = c1[0].reshape(-1,2) # convert shape (n, 1, 2) to (n, 2)
points2 = c2[0].reshape(-1,2)
import pylab as pl
poly1 = pl.Polygon(points1, color="blue", alpha=0.5)
poly2 = pl.Polygon(points2, color="red", alpha=0.5)
pl.figure(figsize=(8,3))
ax = pl.subplot(121)
ax.add_artist(poly1)
ax.add_artist(poly2)
pl.xlim(0, 100)
pl.ylim(0, 100)
a = Polygon.Polygon(points1)
b = Polygon.Polygon(points2)
intersect = a&b # calculate the intersect polygon
poly3 = pl.Polygon(intersect[0], color="green") # intersect[0] are the points of the polygon
ax = pl.subplot(122)
ax.add_artist(poly3)
pl.xlim(0, 100)
pl.ylim(0, 100)
pl.show()
Output:
Upvotes: 4
Reputation: 6764
So this is what I did to get the job done:
import Polygon, numpy
# Here I extracted and combined some contours and created a convex hull from it.
# Now I wanna check whether a contour acquired differently intersects with this hull or not.
for contour in contours: # The result of cv2.findContours is a list of contours
contour1 = contour.flatten()
contour1 = numpy.reshape(contour1, (int(contour1.shape[0]/2),-1))
poly1 = Polygon.Polygon(contour1)
hull = hull.flatten() # This is the hull is previously constructued
hull = numpy.reshape(hull, (int(hull.shape[0]/2),-1))
poly2 = Polygon.Polygon(hull)
if (poly1 & poly2).area()<= some_max_val:
some_operations
I had to use for loop, and this altogether looks a bit tedious, although it gives me expected results. Any better methods would be greatly appreciated!
Upvotes: 1
Reputation: 67063
You can use a view of the array as a single dimension to the intersect1d function like this:
def multidim_intersect(arr1, arr2):
arr1_view = arr1.view([('',arr1.dtype)]*arr1.shape[1])
arr2_view = arr2.view([('',arr2.dtype)]*arr2.shape[1])
intersected = numpy.intersect1d(arr1_view, arr2_view)
return intersected.view(arr1.dtype).reshape(-1, arr1.shape[1])
This creates a view of each array, changing each row to a tuple of values. It then performs the intersection, and changes the result back to the original format. Here's an example of using it:
test_arr1 = numpy.array([[0, 2],
[1, 3],
[4, 5],
[0, 2]])
test_arr2 = numpy.array([[1, 2],
[0, 2],
[3, 1],
[1, 3]])
print multidim_intersect(test_arr1, test_arr2)
This prints:
[[0 2]
[1 3]]
Upvotes: 17