Reputation: 286
Consider two sorted numpy arrays:
import numpy as np
a = np.array([1,2,4,4,6,8,10,10,21])
b = np.array([3,3,4,6,10,18,22])
How do I (1) find the elements that appear in both arrays, and (2) remove only one instance of each such element from each array?
That is, the output should be:
a = [1,2,4,8,10,21]
b = [3,3,18,22]
So even if there are duplicates, only one instance is removed. However, if the arrays are
c = np.array([1,2,4,4,6,8,10,10,10,21])
d = np.array([3,3,4,6,10,10,18,22])
I expect to obtain the new outputs:
c = [1,2,4,8,10,21]
d = [3,3,18,22]
which is the same as above. The only difference is the number of 10's in each array: each of the two 10's in d takes away one 10 from c, leaving the same result.
This post was the closest match to my question, but it removed all instances of repeats from both lists.
Upvotes: 2
Views: 8143
Reputation: 53029
Here is a numpy approach:
import numpy as np
a = np.array([1,2,4,4,6,8,10,10,21])
b = np.array([3,3,4,6,10,18,22])
# join and sort (with Tim sort this should be O(n))
ab = np.concatenate([a,b])
i = ab.argsort(kind="stable")
abo = ab[i]
# mark 1st of each group of equal values
d = np.flatnonzero(np.diff(abo,prepend=abo[0]-1,append=abo[-1]+1))
# mark sorted total by origin (a -> False, b -> True)
ig = i>=len(a)
# compare origins of first and last of each group of equal values
# if they are different mark for deletion
dupl = ig[d[:-1]] ^ ig[d[1:]-1]
# finally, delete
ar = np.delete(a,i[d[:-1][dupl]])
br = np.delete(b,i[d[1:][dupl]-1]-len(a))
# inspect
ar
array([ 1, 2, 4, 8, 10, 21])
br
array([ 3, 3, 18, 22])
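For inputs like the question's second example, where a value repeats in both arrays, the first/last comparison above removes at most one instance per value. A minimal count-based sketch, assuming the goal is for each occurrence in one array to cancel one occurrence in the other (the function name `remove_common` is hypothetical and not part of NumPy):

```python
import numpy as np

def remove_common(a, b):
    """Cancel matching elements pairwise; return the sorted leftovers of a and b."""
    # Unique values of both arrays, plus each element's index into `values`
    values, inverse = np.unique(np.concatenate([a, b]), return_inverse=True)
    # Per-value occurrence counts in each input
    count_a = np.bincount(inverse[:len(a)], minlength=len(values))
    count_b = np.bincount(inverse[len(a):], minlength=len(values))
    # Each occurrence in one array cancels one occurrence in the other
    left_a = np.maximum(count_a - count_b, 0)
    left_b = np.maximum(count_b - count_a, 0)
    return np.repeat(values, left_a), np.repeat(values, left_b)

c = np.array([1, 2, 4, 4, 6, 8, 10, 10, 10, 21])
d = np.array([3, 3, 4, 6, 10, 10, 18, 22])
cr, dr = remove_common(c, d)
print(cr)
print(dr)
```

Because the result is rebuilt from counts, the outputs come back sorted, which matches the sorted inputs assumed in the question.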
Upvotes: 0
Reputation: 16
Using for loops:
import numpy as np
a = np.array([1,2,4,4,6,8,10,10,21])
b = np.array([3,3,4,6,10,18,22])
for i, val in enumerate(a):
    if val in b:
        a = np.delete(a, np.where(a == val)[0][0])
        b = np.delete(b, np.where(b == val)[0][0])
for i, val in enumerate(b):
    if val in a:
        a = np.delete(a, np.where(a == val)[0][0])
        b = np.delete(b, np.where(b == val)[0][0])
print(a)
print(b)
Output:
[ 1  2  4  8 10 21]
[ 3  3 18 22]
Upvotes: 0
Reputation: 2492
I'm not 100% sure what you're looking for based on the question, but I was able to reproduce the expected output for the first example as follows.
import collections
import numpy as np
a = np.array([1,2,4,4,6,8,10,10,21])
b = np.array([3,3,4,6,10,18,22])
# Elements of b that do not appear in a
newb = [x for x in b.tolist() if x not in a]
print(newb)
# Remove one instance of each value in a that also appears in b
counter = collections.Counter(a.tolist())
print(counter)
newa = a.tolist()
for k in counter:
    if k in b:
        newa.remove(k)
print(newa)
Upvotes: 1
Reputation: 107287
You can find the indices of the first occurrences of the intersecting items using np.searchsorted, as follows, and then remove them using the np.delete() function:
In [58]: intersect = a[np.in1d(a, b)]
In [59]: mask1 = np.searchsorted(a, intersect)
In [60]: mask2 = np.searchsorted(b, intersect)
In [61]: np.delete(a, mask1)
Out[61]: array([ 1, 2, 4, 8, 10, 21])
In [62]: np.delete(b, mask2)
Out[62]: array([ 3, 3, 18, 22])
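The session above can be written as a self-contained script. This is a sketch, not a definitive version: it uses np.isin in place of np.in1d, on the assumption that a recent NumPy release is in use where np.in1d is deprecated, and the names idx_a, idx_b, ar and br are illustrative.

```python
import numpy as np

a = np.array([1, 2, 4, 4, 6, 8, 10, 10, 21])
b = np.array([3, 3, 4, 6, 10, 18, 22])

# Values of a that also occur in b (duplicates in a are kept)
intersect = a[np.isin(a, b)]
# Index of the first occurrence of each such value in the sorted arrays
idx_a = np.searchsorted(a, intersect)
idx_b = np.searchsorted(b, intersect)
# Repeated indices are deleted only once, so each common value loses one instance
ar = np.delete(a, idx_a)
br = np.delete(b, idx_b)
print(ar)
print(br)
```

Since np.searchsorted always returns the first matching position, this removes exactly one instance per common value, regardless of how often that value repeats in the other array.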
Upvotes: 2
Reputation: 61910
You can use collections.Counter:
from collections import Counter
import numpy as np
a = np.array([1, 2, 4, 4, 6, 8, 10, 10, 21])
b = np.array([3, 3, 4, 6, 10, 18, 22])
ca = Counter(a)
cb = Counter(b)
result_a = sorted((ca - cb).elements())
result_b = sorted((cb - ca).elements())
print(result_a)
print(result_b)
Output
[1, 2, 4, 8, 10, 21]
[3, 3, 18, 22]
As expected, it returns the same result for:
a = np.array([1, 2, 4, 4, 6, 8, 10, 10, 10, 21])
b = np.array([3, 3, 4, 6, 10, 10, 18, 22])
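The second case can be checked with a small helper. This is only a sketch: the name `counter_difference` is hypothetical, and the arrays are converted with tolist() on the assumption that plain Python ints are wanted in the result rather than NumPy scalars.

```python
from collections import Counter
import numpy as np

def counter_difference(x, y):
    """Sorted multiset difference x - y as a list of plain Python ints."""
    # Counter subtraction keeps only positive counts, so every occurrence
    # in y cancels at most one occurrence in x
    remaining = Counter(x.tolist()) - Counter(y.tolist())
    return sorted(remaining.elements())

c = np.array([1, 2, 4, 4, 6, 8, 10, 10, 10, 21])
d = np.array([3, 3, 4, 6, 10, 10, 18, 22])
result_c = counter_difference(c, d)
result_d = counter_difference(d, c)
print(result_c)
print(result_d)
```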
Upvotes: 3
Reputation: 7509
If you don't mind the verbosity:
import numpy as np
a = np.array([1,2,4,4,6,8,10,10,21])
b = np.array([3,3,4,6,10,18,22])
common_values = set(a) & set(b)
a = a.tolist()
b = b.tolist()
for value in common_values:
    a.remove(value)
    b.remove(value)
a = np.array(a)
b = np.array(b)
Upvotes: 0