Reputation: 13721
I have two arrays and I am trying to return a new array that equals the intersection of my original two arrays. The two original arrays should be of the same length. For example, if I have:
arr1 = np.array([(255, 255, 255), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
I should get:
intersectedArr = ([(255, 255, 255), (255, 255, 255)])
However, if I have:
arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
I should get
([(255, 255, 255)])
So far I've tried:
intersectedArr = np.intersect1d(arr1, arr2)
but this returns [255]
instead of the expected ([(255, 255, 255)])
Can someone help? Thanks in advance!
Upvotes: 2
Views: 25496
Reputation: 11
In your case you want to compare rows rather than individual elements, so this is a 2D problem. I would recommend a variant of intersect1d that works on the rows of 2D NumPy arrays. I found a good solution in "Intersection of 2D numpy ndarrays":
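import numpy as np

def multidim_intersect(arr1, arr2):
    # View each row as a single structured element so whole rows are compared.
    arr1_view = arr1.view([('', arr1.dtype)] * arr1.shape[1])
    arr2_view = arr2.view([('', arr2.dtype)] * arr2.shape[1])
    intersected = np.intersect1d(arr1_view, arr2_view)
    # Convert the structured result back to a plain 2D array.
    return intersected.view(arr1.dtype).reshape(-1, arr1.shape[1])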
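The code above views each row of the original arrays as a single structured element, intersects the two views, and then converts the result back to a 2D array. Because intersect1d returns sorted, unique values, duplicate rows are dropped.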
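To make the view step less mysterious, here is a minimal sketch of what the structured view does to one of the arrays from the question. The names row_dtype, as_rows and back are only for illustration, and it assumes the array is C-contiguous (as a freshly created array is):

```python
import numpy as np

arr = np.array([(100, 100, 100), (255, 255, 255)])

# One unnamed field per column; NumPy fills in the names f0, f1, f2.
row_dtype = [('', arr.dtype)] * arr.shape[1]

# Each row of three integers becomes one structured element.
as_rows = arr.view(row_dtype)
print(as_rows.shape)        # one element per row
print(as_rows.dtype.names)  # the auto-generated field names

# Viewing with the original dtype and reshaping restores the 2D array.
back = as_rows.view(arr.dtype).reshape(-1, arr.shape[1])
print(back)
```

Since each row is now a single element, a 1D set operation such as intersect1d compares whole rows instead of individual numbers.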
Upvotes: 0
Reputation: 81
If you want to keep duplicates, like in your examples, you can use a list comprehension:
def intersection(list_a, list_b):
    return [e for e in list_a if e in list_b]
which produces:
in:
[(255, 255, 255), (255, 255, 255)]
[(255, 255, 255), (255, 255, 255)]
out:
[(255, 255, 255), (255, 255, 255)]
in:
[(100, 100, 100), (255, 255, 255)]
[(255, 255, 255), (255, 255, 255)]
out:
[(255, 255, 255)]
If you want unique combinations between the lists (sets), though:
def intersection(a, b):
    return list(set(a).intersection(b))
which produces:
in:
[(255, 255, 255), (255, 255, 255)]
[(255, 255, 255), (255, 255, 255)]
out:
[(255, 255, 255)]
in:
[(100, 100, 100), (255, 255, 255)]
[(255, 255, 255), (255, 255, 255)]
out:
[(255, 255, 255)]
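The functions above are shown with lists of tuples. If the inputs are NumPy arrays as in the question, the rows are not hashable, so one possible adaptation is to convert each row to a tuple first. This is only a sketch; intersect_rows and keep_duplicates are names made up for this example:

```python
import numpy as np

def intersect_rows(arr_a, arr_b, keep_duplicates=True):
    # tolist() yields plain Python ints, so the tuples are hashable.
    rows_a = [tuple(row) for row in arr_a.tolist()]
    rows_b = {tuple(row) for row in arr_b.tolist()}
    if keep_duplicates:
        # Keep every row of arr_a that also occurs somewhere in arr_b.
        return [row for row in rows_a if row in rows_b]
    # Set semantics: each common row appears once.
    return list(set(rows_a) & rows_b)

arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
print(intersect_rows(arr1, arr2))
print(intersect_rows(arr2, arr2))
print(intersect_rows(arr2, arr2, keep_duplicates=False))
```

Using a set for the second array keeps the membership test cheap, which should matter once the arrays grow.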
Cheers!
Upvotes: 7
Reputation: 375415
For larger arrays it might help to use pandas' groupby and cumcount:
In [11]: df1 = pd.DataFrame(arr1)
In [12]: df1["cumcount"] = df1.groupby([0, 1, 2]).cumcount()
In [13]: df1
Out[13]:
     0    1    2  cumcount
0  100  100  100         0
1  255  255  255         0
In [14]: df2 = pd.DataFrame(arr2)
In [15]: df2["cumcount"] = df2.groupby([0, 1, 2]).cumcount()
In [16]: df2
Out[16]:
     0    1    2  cumcount
0  255  255  255         0
1  255  255  255         1
Now a merge gets you the array you desire:
In [21]: df1.merge(df1).iloc[:, :3].values
Out[21]:
array([[100, 100, 100],
       [255, 255, 255]])
In [22]: df1.merge(df2).iloc[:, :3].values
Out[22]: array([[255, 255, 255]])
In [23]: df2.merge(df2).iloc[:, :3].values
Out[23]:
array([[255, 255, 255],
       [255, 255, 255]])
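The session above can be collected into one function. This is a rough sketch rather than a tuned implementation: merge_intersect is a made-up name, and the merge keys are passed explicitly through on= instead of relying on the default of merging on all shared columns.

```python
import numpy as np
import pandas as pd

def merge_intersect(a, b):
    frames = []
    for arr in (a, b):
        df = pd.DataFrame(arr)
        value_cols = list(df.columns)
        # Number repeated rows 0, 1, 2, ... so duplicates pair up one-to-one.
        df["cumcount"] = df.groupby(value_cols).cumcount()
        frames.append(df)
    # Inner merge on the value columns plus the duplicate counter.
    merged = frames[0].merge(frames[1], on=list(frames[0].columns))
    # Drop the helper column and return a plain array.
    return merged.iloc[:, :a.shape[1]].values

arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
print(merge_intersect(arr1, arr2))
print(merge_intersect(arr2, arr2))
```

The cumcount column is what preserves duplicates: the n-th occurrence of a row in one array only matches the n-th occurrence of the same row in the other.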
Upvotes: 0
Reputation: 3706
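How about a NumPy answer? Note that this compares the two arrays element by element at matching positions and returns a flat array of the matching values, not whole rows.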
import numpy as np
arr1 = np.array([(255, 255, 255), (255, 255, 25)]) # changed some to 25
arr2 = np.array([(255, 25, 255), (255, 255, 25)])
arr1[np.where(arr1==arr2)]
array([255, 255, 255, 255, 25])
Second example:
arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
arr1[np.where(arr1==arr2)]
array([255, 255, 255])
Upvotes: 3
Reputation: 15009
NOTE: This assumes [a, b, c] and [b, c, a] gives [a, b, c], that is, the order of elements is ignored.
OK, I've done a little experimenting and this might be what you are after. Given:
arr1a = np.array([(255, 255, 255), (255, 255, 255)])
arr1b = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
Then we can find an intersection with:
np.array([item in arr2 for item in arr1a])
i.e., for each element in arr1a, check to see whether it also appears in arr2. This gives a result of:
>>> array([ True, True], dtype=bool)
Similarly:
np.array([item in arr2 for item in arr1b])
>>> array([False, True], dtype=bool)
Now, we can use this result to pick the common values from the original lists:
mask = np.array([item in arr2 for item in arr1a])
arr1a[mask]
>>> array([[255, 255, 255],
           [255, 255, 255]])
And:
mask = np.array([item in arr2 for item in arr1b])
arr1b[mask]
>>> array([[255, 255, 255]])
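One caveat worth checking before relying on this: for NumPy arrays, item in arr2 is evaluated as (arr2 == item).any(), so it is True as soon as a single element matches, even when no complete row does. If whole-row matches are needed, a broadcasting-based mask is one option. The sketch below uses made-up names (probe, partial, row_matches):

```python
import numpy as np

arr2 = np.array([(255, 255, 255), (255, 255, 255)])

# Only the first element matches, yet the membership test succeeds.
probe = np.array([255, 0, 0])
print(probe in arr2)

partial = np.array([(255, 0, 0), (255, 255, 255)])

# Shape (n, 1, 3) against (1, m, 3) broadcasts to (n, m, 3);
# all(axis=2) then says whether row i of partial equals row j of arr2.
row_matches = (partial[:, None, :] == arr2[None, :, :]).all(axis=2)
mask = row_matches.any(axis=1)
print(mask)
print(partial[mask])
```

The intermediate array has n * m * 3 elements, so this is best kept to arrays of moderate size.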
Upvotes: 0
Reputation: 64
Not sure how big your arrays will get, but if they remain fairly small, this could work:
import numpy as np
arr1 = np.array([(255, 255, 255), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
intersectedArr = []
for a1, a2 in zip(arr1, arr2):
    if np.array_equal(a1, a2):
        intersectedArr.append(a1)
print(np.array(intersectedArr))
arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])
intersectedArr = []
for a1, a2 in zip(arr1, arr2):
    if np.array_equal(a1, a2):
        intersectedArr.append(a1)
print(np.array(intersectedArr))
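The loop above compares rows that sit at the same index in both arrays. If that positional comparison is what you want, the same result can probably be had without a Python loop by using a boolean mask. A small sketch, where same_row is just an illustrative name:

```python
import numpy as np

arr1 = np.array([(100, 100, 100), (255, 255, 255)])
arr2 = np.array([(255, 255, 255), (255, 255, 255)])

# True where every element of row i is equal in both arrays.
same_row = (arr1 == arr2).all(axis=1)
intersectedArr = arr1[same_row]
print(intersectedArr)
```

This relies on both arrays having the same shape, which matches the assumption in the question.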
Upvotes: 3