MHS

Reputation: 2350

Python - Merging 2 lists of tuples by checking their values

I have lists like this:

a = [('JoN', 12668, 0.0036), ('JeSsIcA', 1268, 0.0536), ('JoN', 1668, 0.00305), ('King', 16810, 0.005)]
b = [('JoN', 12668, 0.0036), ('JON', 16680, 0.00305), ('MeSSi', 115, 0.369)]

I want the resultant list to be like:

result = [(('JoN', 12668, 0.0036), ('JoN', 12668, 0.0036)), (('JoN', 1668, 0.00305), ('JON', 16680, 0.00305)), (('King', 16810, 0.005), None), (None, ('MeSSi', 115, 0.369))]

I have tried nested loops, sets, map, and zip, but could not produce this output. Kindly help me out.

Upvotes: 0

Views: 61

Answers (2)

ely

Reputation: 77414

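from itertools import groupby
from operator import itemgetter

def compose(f, g):
    # Returns h such that h(x) == f(*g(x)); g must return a tuple.
    def h(*args, **kwargs):
        return f(*g(*args, **kwargs))
    return h

def lower_first(*args):
    # Lowercase the first field so that names compare case-insensitively.
    # str.lower() is used because string.lower does not exist in Python 3.
    return (args[0].lower(),) + args[1:]

# Sort by (lowercased name, third field, second field).
sorting_key = compose(lower_first, itemgetter(0, 2, 1))

# Group by (lowercased name, third field).
grouping_key = compose(lower_first, itemgetter(0, 2))

output = [tuple(v) for k, v in groupby(sorted(a + b, key=sorting_key),
                                       key=grouping_key)]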

This gives output as:

[(('JeSsIcA', 1268, 0.0536),),
 (('JoN', 1668, 0.00305), ('JON', 16680, 0.00305)),
 (('JoN', 12668, 0.0036), ('JoN', 12668, 0.0036)),
 (('King', 16810, 0.005),),
 (('MeSSi', 115, 0.369),)]

Then adding the None values is easy:

final_output = [ elem if len(elem) >= 2 
    else ((None,)+ elem) if elem[0] not in a else elem + (None,) 
    for elem in output
]

which gives:

[(('JeSsIcA', 1268, 0.0536), None),
 (('JoN', 1668, 0.00305), ('JON', 16680, 0.00305)),
 (('JoN', 12668, 0.0036), ('JoN', 12668, 0.0036)),
 (('King', 16810, 0.005), None),
 (None, ('MeSSi', 115, 0.369))]

Be careful, though: stating a problem like this in terms of plain lists often glosses over relational-join issues that a system with proper indexing would handle for you. A pandas.DataFrame, with its native join and merge capabilities, is more likely the kind of data structure you want.
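As a rough illustration of the DataFrame route, here is a minimal sketch of an outer merge. The column names (name, value, score) and the derived key column are assumptions made for this example, since the question does not name the fields. Note also that joining on a float column relies on exact equality of the values.

```python
import pandas as pd

a = [('JoN', 12668, 0.0036), ('JeSsIcA', 1268, 0.0536),
     ('JoN', 1668, 0.00305), ('King', 16810, 0.005)]
b = [('JoN', 12668, 0.0036), ('JON', 16680, 0.00305), ('MeSSi', 115, 0.369)]

cols = ["name", "value", "score"]  # hypothetical column names
df_a = pd.DataFrame(a, columns=cols)
df_b = pd.DataFrame(b, columns=cols)

# Case-insensitive join key derived from the name column.
df_a["key"] = df_a["name"].str.lower()
df_b["key"] = df_b["name"].str.lower()

# An outer join keeps rows that appear in only one frame; the columns of
# the missing side are filled with NaN. indicator=True adds a "_merge"
# column saying whether a row came from the left, the right, or both.
merged = pd.merge(df_a, df_b, on=["key", "score"], how="outer",
                  suffixes=("_a", "_b"), indicator=True)
print(merged)
```

Rows marked both in the _merge column correspond to the matched pairs, while left_only and right_only rows correspond to the entries padded with None in the list version.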

Upvotes: 0

Ashwini Chaudhary

Reputation: 250931

Convert a and b to dictionaries first, using the first item (lowercased with str.lower()) and the third item as the key. Then loop over the union of the keys in a list comprehension to get the desired output:

>>> from pprint import pprint
>>> dct_a = {(x[0].lower(), x[2]): x for x in a}
>>> dct_b = {(x[0].lower(), x[2]): x for x in b}
>>> out = [(dct_a.get(k), dct_b.get(k)) for k in set(dct_a).union(dct_b)]
>>> pprint(out)
[(('JoN', 12668, 0.0036), ('JoN', 12668, 0.0036)),
 (('JoN', 1668, 0.00305), ('JON', 16680, 0.00305)),
 (('King', 16810, 0.005), None),
 (('JeSsIcA', 1268, 0.0536), None),
 (None, ('MeSSi', 115, 0.369))]
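Two caveats apply to the approach above: a set has no defined order, so the pairs may come out in a different order than shown, and if one list contains two tuples with the same key, the dictionary keeps only the last of them. If a stable order matters, one possible variation is sketched below. The helper name key_of is hypothetical, and the sketch assumes Python 3.7+, where dicts preserve insertion order.

```python
a = [('JoN', 12668, 0.0036), ('JeSsIcA', 1268, 0.0536),
     ('JoN', 1668, 0.00305), ('King', 16810, 0.005)]
b = [('JoN', 12668, 0.0036), ('JON', 16680, 0.00305), ('MeSSi', 115, 0.369)]

def key_of(row):
    # Hypothetical helper: match on the lowercased name and the third field.
    return (row[0].lower(), row[2])

dct_a = {key_of(x): x for x in a}
dct_b = {key_of(x): x for x in b}

# dict.fromkeys removes duplicate keys while keeping first-seen order:
# all keys from a first, then keys that occur only in b.
ordered_keys = dict.fromkeys(list(dct_a) + list(dct_b))

out = [(dct_a.get(k), dct_b.get(k)) for k in ordered_keys]
```

With this ordering, the pairs follow the order of a, and entries found only in b are appended at the end.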

Upvotes: 2
