WindCheck

Reputation: 426

Pair elements from two different lists

I have two lists:

listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']

and I want to pair items that share the same trailing number, regardless of the letter prefix, like so:

listC = [('a1', None),('a2', 'b2'),('a3', None),('a4', 'b4')]

I've tried itertools.zip_longest, but I couldn't get what I need:

>>> list(itertools.zip_longest(listA, listB))
[('a1', 'b2'), ('a2', 'b4'), ('a3', None), ('a4', None)]

Any suggestions how to get listC?

Upvotes: 1

Views: 1235

Answers (4)

pylang

Reputation: 44465

Given

import itertools as it


list_a = ["a1", "a2", "a3", "a4"]
list_b = ["b2", "b4"]

Code

pred = lambda x: x[1:]
res = [tuple(g) for k, g in it.groupby(sorted(list_a + list_b, key=pred), pred)]
res
# [('a1',), ('a2', 'b2'), ('a3',), ('a4', 'b4')]

list(zip(*it.zip_longest(*res)))
# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]

Details

The two lists are concatenated, sorted, and grouped by the numeric part of each string (everything after the first character), as extracted by the predicate. Note that the predicate assumes each string starts with a single letter; under that assumption it handles numbers of any length ("a1", "b23", "c132", etc.), although the sort is lexicographic, so "10" orders before "2". For a more robust key, you might also consider a trailing-number regex as seen in @Ajax1234's answer.

As you discovered, itertools.zip_longest pads shorter sub-groups with None by default.
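
As a rough sketch of the same group-and-pad idea with a numeric key, the snippet below sorts by the integer value of the trailing digits, so multi-digit numbers stay in natural order. The helper name trailing_number and the sample data are assumptions made for illustration; they are not part of the original answer.

```python
import itertools as it
import re

# Sample data chosen to include a multi-digit number (an assumption).
list_a = ["a1", "a2", "a10"]
list_b = ["b2", "b10"]


def trailing_number(s):
    # Hypothetical helper: integer value of the digits at the end of s.
    # Raises AttributeError if s has no trailing digits.
    return int(re.search(r"\d+$", s).group())


# sorted() is stable, so within each number the list_a item stays first.
merged = sorted(list_a + list_b, key=trailing_number)
groups = [tuple(g) for _, g in it.groupby(merged, key=trailing_number)]

# Pad one-element groups with None, as in the answer above.
result = list(zip(*it.zip_longest(*groups)))
print(result)
```

One caveat with this sketch: a number that appears only in list_b ends up in the first slot of its tuple, so it fits best when every number in list_b also occurs in list_a.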

See Also

  • this post for more ideas on padding iterables
  • this post on how to use itertools.groupby
  • this post on natural sorting for a more robust predicate

Upvotes: 0

Aaditya Ura

Reputation: 12669

You can try a dict approach:

import itertools

listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']

final_list = {}

for i in itertools.product(listA, listB):
    # Compare the character after the prefix (assumes single-digit numbers).
    if i[0][1] == i[1][1]:
        final_list[i[0]] = i
    elif i[0] not in final_list:
        # No match seen yet for this listA item: record a placeholder.
        final_list[i[0]] = (i[0], None)

print(list(final_list.values()))

Output (on Python 3.7+, where dicts preserve insertion order):

[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]

Upvotes: 0

jpp

Reputation: 164623

You can use a list comprehension with a conditional expression for this:

listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']

listB_set = set(listB)
listC = [(i, 'b'+i[1:] if 'b'+i[1:] in listB_set else None) for i in listA]

# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]

However, for clarity and performance, I would consider separating numeric and string data.
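
One way to read that suggestion, as a minimal sketch: keep only the numeric ids in each collection and attach the letter prefixes when building the output. The names ids_a, ids_b, and pair_ids are hypothetical, and the layout is an assumption about what "separating" could look like here.

```python
# Assumed layout: each collection holds only the numeric ids; the
# "a"/"b" prefixes are attached when the output pairs are built.
ids_a = [1, 2, 3, 4]
ids_b = {2, 4}  # a set gives constant-time membership tests


def pair_ids(ids_a, ids_b, prefix_a="a", prefix_b="b"):
    # Hypothetical helper: pair each id in ids_a with its counterpart
    # in ids_b, or with None when there is no counterpart.
    return [
        (f"{prefix_a}{n}", f"{prefix_b}{n}" if n in ids_b else None)
        for n in ids_a
    ]


list_c = pair_ids(ids_a, ids_b)
print(list_c)
```

With integers as keys there is no string slicing or regex involved, and multi-digit ids need no special handling.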

Upvotes: 0

Ajax1234

Reputation: 71451

You can use iter with next, assuming the matches fall on every second element of listA:

listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
l = iter(listB)
listC = [(a, next(l) if i%2 != 0 else None) for i, a in enumerate(listA)] 

Output:

[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]

Edit: pairing by trailing number:

import re
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
# Map each trailing number in listB to its item (raw strings avoid invalid-escape warnings).
d = {re.findall(r'\d+$', b)[0]: b for b in listB}
listC = [(i, d.get(re.findall(r'\d+$', i)[0])) for i in listA]

Output:

[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]

Upvotes: 3
