Reputation: 4327
I've got a Python dictionary of the form {'ip1:port1' : <value>, 'ip1:port2' : <value>, 'ip2:port1' : <value>, ...}. The dictionary keys are strings consisting of ip:port pairs. The values are not important for this task.
I need a list of ip:port combinations with unique IP addresses; the port can be any of those that appear with that IP among the original keys. For the example above, two variants are acceptable: ['ip1:port1', 'ip2:port1'] and ['ip1:port2', 'ip2:port1'].
What is the most Pythonic way of doing it?
Currently my solution is:
def get_uniq_worker_ips(workers):
    wip = set(w.split(':')[0] for w in workers.iterkeys())
    return [[worker for worker in workers.iterkeys() if worker.startswith(w)][0] for w in wip]
I don't like it, because it creates additional lists and then discards them.
Upvotes: 4
Views: 1534
Reputation: 55509
One way to do this is to transform your keys into a custom class that only looks at the IP part of the string when doing an equality test. It also needs to supply an appropriate __hash__ method.
The logic here is that the set constructor will "see" keys with the same IP as identical, ignoring the port part in the comparison, so it will avoid adding a key to the set if a key with that IP is already present in the set.
Here's some code that runs on Python 2 or Python 3.
class IPKey(object):
    def __init__(self, s):
        self.key = s
        self.ip, self.port = s.split(':', 1)

    def __eq__(self, other):
        return self.ip == other.ip

    def __hash__(self):
        return hash(self.ip)

    def __repr__(self):
        return 'IPKey({}:{})'.format(self.ip, self.port)

def get_uniq_worker_ips(workers):
    return [k.key for k in set(IPKey(k) for k in workers)]

# Test
workers = {
    'ip1:port1' : "val",
    'ip1:port2' : "val",
    'ip2:port1' : "val",
    'ip2:port2' : "val",
}
print(get_uniq_worker_ips(workers))
Output (the order, and which port is kept for each IP, may vary):
['ip2:port1', 'ip1:port1']
If you are running Python 2.7 or later, the function can use a set comprehension instead of that generator expression inside the set() constructor call.
def get_uniq_worker_ips(workers):
    return [k.key for k in {IPKey(k) for k in workers}]
The IPKey.__repr__ method isn't strictly necessary, but I like to give all my classes a __repr__ since it can be handy during development.
Here's a much more succinct solution which is very efficient, courtesy of Jon Clements. It builds the desired list via a dictionary comprehension.
def get_uniq_worker_ips(workers):
    return list({k.partition(':')[0]:k for k in workers}.values())
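The dictionary comprehension works because keys with the same IP map to the same dictionary key, so a later entry overwrites an earlier one and only one ip:port string survives per IP. The following is a minimal sketch of that behaviour; `last_key_per_ip` and `first_key_per_ip` are hypothetical names used only for illustration, and the second function is a variant (not part of the suggestion above) that keeps the first key seen for each IP by using dict.setdefault.

```python
def last_key_per_ip(workers):
    # Same idea as the comprehension above: a later key with the same IP
    # overwrites the earlier one, so the last key seen per IP wins.
    return list({k.partition(':')[0]: k for k in workers}.values())


def first_key_per_ip(workers):
    # Hypothetical variant: setdefault only stores a key when the IP has
    # not been stored yet, so the first key seen per IP wins.
    chosen = {}
    for k in workers:
        chosen.setdefault(k.partition(':')[0], k)
    return list(chosen.values())


workers = {'ip1:port1': 'val', 'ip1:port2': 'val', 'ip2:port1': 'val'}
print(sorted(last_key_per_ip(workers)))
print(sorted(first_key_per_ip(workers)))
```

Which port ends up in the result depends on the iteration order of the dictionary, which is arbitrary on older Python versions; either choice satisfies the question.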
Upvotes: 4
Reputation: 82949
You can use itertools.groupby to group by same IP addresses:
import itertools
data = {'ip1:port1' : "value1", 'ip1:port2' : "value2", 'ip2:port1' : "value3", 'ip2:port2': "value4"}
by_ip = {k: list(g) for k, g in itertools.groupby(sorted(data), key=lambda s: s.split(":")[0])}
by_ip
# {'ip1': ['ip1:port1', 'ip1:port2'], 'ip2': ['ip2:port1', 'ip2:port2']}
Then just pick any one from the different groups of IPs.
{v[0]: data[v[0]] for v in by_ip.values()}
# {'ip1:port1': 'value1', 'ip2:port1': 'value3'}
Or shorter, making a generator expression for just the first key from the groups:
one_by_ip = (next(g) for k, g in itertools.groupby(sorted(data), key=lambda s: s.split(":")[0]))
{key: data[key] for key in one_by_ip}
# {'ip1:port1': 'value1', 'ip2:port1': 'value3'}
However, note that groupby requires the input data to be sorted. So if you want to avoid sorting all the keys in the dict, you should instead just use a set of already seen keys.
seen = set()
not_seen = lambda x: not(x in seen or seen.add(x))
{key: data[key] for key in data if not_seen(key.split(":")[0])}
# {'ip1:port1': 'value1', 'ip2:port1': 'value3'}
This is similar to your solution, but instead of looping over the unique IPs and finding a matching key in the dict for each, you loop over the keys once and check whether you've already seen the IP.
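Since the question asks for a list of keys rather than a dict, the same "already seen" idea can also be written as a plain loop. This is only a sketch under that assumption; it reuses the function name from the question, and the variable names `seen` and `result` are illustrative.

```python
def get_uniq_worker_ips(workers):
    seen = set()   # IPs that already have a key in the result
    result = []
    for key in workers:
        ip = key.partition(':')[0]
        if ip not in seen:
            seen.add(ip)
            result.append(key)
    return result


data = {'ip1:port1': 'value1', 'ip1:port2': 'value2',
        'ip2:port1': 'value3', 'ip2:port2': 'value4'}
print(get_uniq_worker_ips(data))
```

This makes a single pass over the keys and needs no sorting; the explicit loop trades the one-liner for not relying on the side effect of seen.add() inside a lambda.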
Upvotes: 7
Reputation: 4327
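I've changed a few characters in my solution and am now satisfied with it: next() on a generator expression stops at the first match instead of building a list and discarding it.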
def get_uniq_worker_ips(workers):
    wip = set(w.split(':')[0] for w in workers.iterkeys())
    return [next(worker for worker in workers.iterkeys() if worker.startswith(w)) for w in wip]
Thanks to @Ignacio Vazquez-Abrams and @M.T. for explanations.
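Two caveats for anyone reusing this: startswith(w) is a plain prefix test, so an IP such as ip1 would also match a key like ip10:port1, and dict.iterkeys() exists only in Python 2. A minimal sketch of the same approach that avoids both, assuming Python 3 and that every key contains a ':' separator:

```python
def get_uniq_worker_ips(workers):
    # Assumption: Python 3, so iterate the dict directly instead of iterkeys().
    wip = {w.partition(':')[0] for w in workers}
    # Match on "ip:" so that 'ip1' cannot match a key such as 'ip10:port1'.
    return [next(k for k in workers if k.startswith(ip + ':')) for ip in wip]


workers = {'ip1:port1': 'val', 'ip10:port1': 'val', 'ip1:port2': 'val'}
print(sorted(get_uniq_worker_ips(workers)))
```

This still rescans the keys once per unique IP, so for large dictionaries a single-pass approach may be preferable.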
Upvotes: 0