Reputation: 91
I have two lists:
l1 = ['k1','k2','k3','k3','k4', 'k5']
l2 = ["1.2.3","abc-2.3.4","xyz-def-5.6.8", "xyz-def-5.6.7","ghjb-5.6.7","7.8.9"]
I need to get these items as key:value pairs in a dictionary, keeping the highest value for duplicate keys. Since a dictionary holds unique keys, one of the duplicate elements gets overwritten:
print(dict(zip(l1, l2)))
{'k1': '1.2.3', 'k2': 'abc-2.3.4', 'k3': 'xyz-def-5.6.7', 'k4': 'ghjb-5.6.7', 'k5': '7.8.9'}
But in the output above, I need the highest value, xyz-def-5.6.8, instead of xyz-def-5.6.7.
I tried print(list(zip(l1, l2))), which gives the output below:
[('k1', '1.2.3'), ('k2', 'abc-2.3.4'), ('k3', 'xyz-def-5.6.8'), ('k3', 'xyz-def-5.6.7'), ('k4', 'ghjb-5.6.7'), ('k5', '7.8.9')]
How do I achieve this?
Is it possible to process this list of tuples, or is there another way to get the desired result?
l1 = ['k1','k2','k3','k3','k4', 'k5', 'k6', 'k7', 'k6']
l2 = ["1.2.3","abc-2.3.4","xyz-def-5.6.8", "xyz-def-5.6.7","ghjb-5.6.7","7.8.9", "1:2.3.4-3ubuntu0.1", "1.2.3-1.2build3", "1:2.3.4-3ubuntu0.2"]
The format is not necessarily the same across all keys, but it is the same among the duplicates of a given key. Say, k6 has one format and k3 has another.
Upvotes: 1
Views: 709
Reputation: 91
Since the format stays the same among the duplicates of a key and only differs across keys, I used the method below to collect the indices of the duplicated keys, sort their values, and keep the highest one.
l1 = ['k1','k2','k3','k3','k4', 'k5', 'k6', 'k7', 'k6']
l2 = ["1.2.3","abc-2.3.4","xyz-def-5.6.8", "xyz-def-5.6.7","ghjb-5.6.7","7.8.9", "1:2.3.4-3ubuntu0.1", "1.2.3-1.2build3", "1:2.3.4-3ubuntu0.2"]
l3 = {}

def get_duplicates_details(list_of_elems):
    # Map each element to the list of indices where it occurs
    test = {}
    for index, value in enumerate(list_of_elems):
        if value in test:
            test[value].append(index)
        else:
            test[value] = [index]
    # Keep only the elements that occur more than once
    dictOfElems = {key: value for key, value in test.items() if len(value) > 1}
    return dictOfElems

dictOfElems = get_duplicates_details(l1)
print(dictOfElems)
for index2, value2 in enumerate(l1):
    if value2 in dictOfElems:
        # Collect all values of the duplicated key and keep the largest
        # (plain string sort, so the comparison is lexicographic)
        tmp = [l2[j] for j in dictOfElems[value2]]
        tmp.sort()
        l3[value2] = tmp[-1]
    else:
        l3[value2] = l2[index2]
print(l3)
Output:
{'k3': [2, 3], 'k6': [6, 8]}
{'k1': '1.2.3', 'k2': 'abc-2.3.4', 'k3': 'xyz-def-5.6.8', 'k4': 'ghjb-5.6.7', 'k5': '7.8.9', 'k6': '1:2.3.4-3ubuntu0.2', 'k7': '1.2.3-1.2build3'}
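One caveat with the approach above: tmp.sort() compares strings character by character, so a value ending in 5.6.10 would sort below one ending in 5.6.9. Below is a minimal sketch of a numeric-aware sort key, assuming that comparing the text and digit chunks in order is good enough for the formats shown. The helper name natural_key and the sample values are hypothetical and not from the post, and this does not implement full Debian version ordering.

```python
import re

def natural_key(version):
    """Hypothetical helper: split a string into text and integer chunks."""
    # re.split with a capturing group keeps the digit runs in the result,
    # so the chunks always alternate text, digits, text, ...
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", version)]

# Illustrative values only (assumed, same format as the k3 entries)
versions = ["xyz-def-5.6.9", "xyz-def-5.6.10"]

plain_max = max(versions)                     # lexicographic comparison
natural_max = max(versions, key=natural_key)  # digit runs compared as ints

print(plain_max)    # xyz-def-5.6.9
print(natural_max)  # xyz-def-5.6.10
```

In the loop above, tmp.sort(key=natural_key) would be the corresponding change.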
Upvotes: 0
Reputation: 51643
You need some way to "tell" your dict which value to choose if the key already exists - and it has to know how to decide between two values.
i need highest value xyz-def-5.6.8 instead of xyz-def-5.6.7
The prioritize function below implements that.
You could, for example, do this:
l1 = ['k1','k2','k3','k3','k4', 'k5']
l2 = ["1.2.3","abc-2.3.4","xyz-def-5.6.8", "xyz-def-5.6.7","ghjb-5.6.7","7.8.9"]
def prioritize(a, b):
    """Split the data by -, take the last part, split it by . and convert it
    to an int tuple for comparison. Return a or b, whichever is bigger."""
    def extract(what):
        """Split into int tuples"""
        return tuple(map(int, (what.split("-")[-1]).split(".")))
    # 'xyz-def-5.6.8' => (5, 6, 8)
    a_num = extract(a)
    # 'xyz-def-5.6.7' => (5, 6, 7)
    b_num = extract(b)
    # int tuple comparison "just works"
    return a if a_num > b_num else b

d = {}
for (k, v) in zip(l1, l2):
    # maybe keep old value, else use new value
    d[k] = prioritize(d.get(k, v), v)
print(d)
Output:
{'k1': '1.2.3',
'k2': 'abc-2.3.4',
'k3': 'xyz-def-5.6.8',
'k4': 'ghjb-5.6.7',
'k5': '7.8.9'}
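The extended lists from the question contain values such as 1:2.3.4-3ubuntu0.1, where the part after the last - is not purely numeric, so extract as written would raise a ValueError on them. A possible variation is sketched below, assuming it is acceptable to compare every digit run of a value in order. The names digits_key and highest_per_key are hypothetical, and this is not a full Debian version comparison.

```python
import re

def digits_key(value):
    """Hypothetical helper: all digit runs of the string as an int tuple."""
    return tuple(int(n) for n in re.findall(r"\d+", value))

def highest_per_key(keys, values, key=digits_key):
    """For each key, keep the value that compares highest under `key`."""
    result = {}
    for k, v in zip(keys, values):
        # First occurrence is stored; later ones replace it only if larger
        if k not in result or key(v) > key(result[k]):
            result[k] = v
    return result

l1 = ['k1', 'k2', 'k3', 'k3', 'k4', 'k5', 'k6', 'k7', 'k6']
l2 = ["1.2.3", "abc-2.3.4", "xyz-def-5.6.8", "xyz-def-5.6.7", "ghjb-5.6.7",
      "7.8.9", "1:2.3.4-3ubuntu0.1", "1.2.3-1.2build3", "1:2.3.4-3ubuntu0.2"]

d = highest_per_key(l1, l2)
print(d)
```

Passing a different key function lets each data set decide what "highest" means without changing the loop.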
Upvotes: 1