Reputation: 3233
I have a NumPy array of source and destination IP addresses:
consarray
array([['10.125.255.133', '104.244.42.130'],
['104.244.42.130', '10.125.255.133']], dtype=object)
The actual array is much larger than this.
I want to create a set of unique connection pairs from the array.
In the example above, both rows belong to the same connection: the source and destination are simply swapped, so one row is the outgoing direction and the other is the incoming direction.
I tried creating a set of unique tuples like this:
conset = set(map(tuple,consarray))
conset
{('10.125.255.133', '104.244.42.130'), ('104.244.42.130', '10.125.255.133')}
What I actually want is for ('10.125.255.133', '104.244.42.130') and ('104.244.42.130', '10.125.255.133') to be considered the same, so that only one of them ends up in the set.
Can anyone tell me how to go about doing this?
EDIT:
There have been some good answers, but I actually have one more requirement:
the first occurrence should always be the one retained, irrespective of the IP addresses.
In the example above, ('10.125.255.133', '104.244.42.130') appears first, so it is the outgoing connection, and I want to retain it.
If the example were changed to:
consarray
array([['104.244.42.130', '10.125.255.133'],
       ['10.125.255.133', '104.244.42.130']], dtype=object)
I would want ('104.244.42.130', '10.125.255.133') to be retained.
Upvotes: 3
Views: 1023
Reputation: 78556
You could either apply sorting before making the tuples:
conset = set(map(lambda x: tuple(sorted(x)), consarray))
Or use frozensets instead of tuples:
conset = set(map(frozenset, consarray))
To guarantee that the first item will be retained and the second not inserted, you could use a regular for loop:
conset = set()
for x in consarray:
    x = frozenset(x)
    if x in conset:
        continue
    conset.add(x)
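Note that a frozenset has no order, so a set of frozensets cannot record which address was the source in the first occurrence. A minimal sketch of one way to keep that orientation, assuming it is acceptable to return a list instead of a set: track direction-agnostic keys in a set and append the original tuple only the first time its key is seen. The function name `first_seen_pairs` and the third sample row are hypothetical, added only for illustration.

```python
import numpy as np

# Sample data modelled on the question; the third row is an illustrative extra.
consarray = np.array(
    [
        ["10.125.255.133", "104.244.42.130"],
        ["104.244.42.130", "10.125.255.133"],
        ["10.0.0.1", "10.0.0.2"],
    ],
    dtype=object,
)


def first_seen_pairs(rows):
    """Return (src, dst) tuples, keeping only the first occurrence of each
    connection in the orientation in which it first appeared."""
    seen = set()   # direction-agnostic keys already encountered
    result = []    # tuples in first-seen order and orientation
    for src, dst in rows:
        key = frozenset((src, dst))
        if key not in seen:
            seen.add(key)
            result.append((src, dst))
    return result


pairs = first_seen_pairs(consarray)
print(pairs)
```

If the rows arrive in the opposite order, the reversed tuple is the one that is kept, which is what the edit to the question asks for.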
Upvotes: 3
Reputation: 142156
Since you're using numpy, you can use numpy.unique, e.g.:
a = np.array([('10.125.255.133', '104.244.42.130'), ('104.244.42.130', '10.125.255.133')])
Then np.unique(a) flattens the array and gives you the unique addresses (not the unique pairs):
array(['10.125.255.133', '104.244.42.130'], dtype='<U14')
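To get unique pairs rather than unique addresses while staying in NumPy, one possible approach is to sort each row so that direction no longer matters, then call np.unique with axis=0 and return_index=True to find the first occurrence of each pair. This is a sketch under two assumptions: the array holds plain strings, and it is converted from object dtype first because np.unique does not support the axis argument for object arrays.

```python
import numpy as np

a = np.array(
    [
        ["10.125.255.133", "104.244.42.130"],
        ["104.244.42.130", "10.125.255.133"],
    ],
    dtype=object,
)

# np.unique(..., axis=0) rejects object dtype, so use fixed-width strings.
s = a.astype(str)

# Sort within each row so (src, dst) and (dst, src) become identical.
canon = np.sort(s, axis=1)

# return_index gives the index of the first occurrence of each unique row.
_, first_idx = np.unique(canon, axis=0, return_index=True)

# Index back into the original array to keep the first-seen orientation,
# and sort the indices to preserve the original row order.
unique_pairs = a[np.sort(first_idx)]
print(unique_pairs)
```

Indexing back into the original array, rather than using the sorted rows, is what preserves the source/destination order of the first occurrence.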
Upvotes: 1
Reputation: 2575
You can sort them first:
conset = set(map(tuple, map(sorted, consarray)))
print(conset)
gives:
{('10.125.255.133', '104.244.42.130')}
Upvotes: 1