sudonym

Reputation: 4018

One-liner for conditional Cartesian Product of list of strings with list of tuples in python

I have a list of strings and a list of tuples.

Input:

string_list = ['www.cars.com/BMW/' ,'www.cars.com/VW/']
tuple_list = [('BMW','green'), ('BMW','blue'), 
               ('VW','black'), ('VW','red'), ('VW','yellow')]

First step: for each URL in string_list, I need to select the tuples in tuple_list whose first element (the key) appears in that URL. For example, for the first URL:

string_list = ['www.cars.com/BMW/']
tuple_list = [('BMW','green'), ('BMW','blue')]

Second step: in one final output list, I need to append the value of every matching tuple to its URL, i.e. form the Cartesian product of string_list and tuple_list restricted to matching keys:

Output:

results_list = ['www.cars.com/BMW/green', 'www.cars.com/BMW/blue',
                'www.cars.com/VW/black', 'www.cars.com/VW/red', 'www.cars.com/VW/yellow']

My current approach uses a series of nested for-loops, which comes at the cost of being slow, ugly and too long.

How can I efficiently form a conditional Cartesian product between a list of strings and a list of tuples in Python?

Upvotes: 0

Views: 130

Answers (3)

pylang

Reputation: 44545

If you pre-build a dictionary that maps each key to its values, each URL needs only one lookup instead of a scan over all tuples, which improves performance a bit more:

Given

import collections as ct


colors =  ct.defaultdict(list)
for k, v in tuple_list:
    colors[k].append(v)

colors
# defaultdict(list, {'BMW': ['green', 'blue'], 'VW': ['black', 'red', 'yellow']})

Code

# s[13:-1] drops the 13-character 'www.cars.com/' prefix and the trailing '/'
[s + c for s in string_list for c in colors[s[13:-1]]]

Output

['www.cars.com/BMW/green',
 'www.cars.com/BMW/blue',
 'www.cars.com/VW/black',
 'www.cars.com/VW/red',
 'www.cars.com/VW/yellow']

Performance

%timeit -n 100000 [s + b for s in string_list for a, b in tuple_list if a in s]  # @iBug
%timeit -n 100000 [s + c for s in string_list for c in colors[s[13:-1]]]         # proposed    
# 100000 loops, best of 3: 3.54 µs per loop
# 100000 loops, best of 3: 2.83 µs per loop
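
Note that the slice s[13:-1] only works while every URL starts with the same 13-character prefix and ends with a slash. A minimal sketch that takes the key from the last path segment instead is shown here, assuming URLs shaped like those in the question. The helper key_of is a hypothetical name, not part of the code above, and the timings above do not apply to it.

```python
from collections import defaultdict

string_list = ['www.cars.com/BMW/', 'www.cars.com/VW/']
tuple_list = [('BMW', 'green'), ('BMW', 'blue'),
              ('VW', 'black'), ('VW', 'red'), ('VW', 'yellow')]

# Group the values by key once, so each URL needs a single lookup.
colors = defaultdict(list)
for make, color in tuple_list:
    colors[make].append(color)


def key_of(url):
    """Hypothetical helper: return the last non-empty path segment of url."""
    return url.rstrip('/').rsplit('/', 1)[-1]


# .get() avoids adding empty entries to the defaultdict for unknown keys.
results_list = [s + c for s in string_list for c in colors.get(key_of(s), [])]
print(results_list)
```

Using colors.get(...) rather than colors[...] is a small design choice: indexing a defaultdict with a missing key silently stores an empty list for it.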

Upvotes: 1

Aaditya Ura

Reputation: 12679

You can try:

string_list = ['www.cars.com/BMW/' ,'www.cars.com/VW/']
tuple_list = [('BMW','green'), ('BMW','blue'),
               ('VW','black'), ('VW','red'), ('VW','yellow')]


print([url + pair[1] for pair in tuple_list for url in string_list if pair[0] in url])

Output:

['www.cars.com/BMW/green', 'www.cars.com/BMW/blue', 'www.cars.com/VW/black', 'www.cars.com/VW/red', 'www.cars.com/VW/yellow']
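
Since the question asks for a Cartesian product, the same filter can also be written with itertools.product from the standard library. This is only an alternative sketch, not part of the answer above; it still visits every (URL, tuple) pair and keeps the ones whose key occurs in the URL.

```python
from itertools import product

string_list = ['www.cars.com/BMW/', 'www.cars.com/VW/']
tuple_list = [('BMW', 'green'), ('BMW', 'blue'),
              ('VW', 'black'), ('VW', 'red'), ('VW', 'yellow')]

# product() yields every (url, (make, color)) pair, with string_list as the
# outer loop; the condition keeps only pairs whose key occurs in the URL.
results_list = [url + color
                for url, (make, color) in product(string_list, tuple_list)
                if make in url]
print(results_list)
```

Here the results follow the order of string_list first, then tuple_list.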

Upvotes: 1

iBug

Reputation: 37287

One liner:

result = [s + b for s in string_list for a, b in tuple_list if a in s]

This is still two nested for loops, just written as a single list comprehension.

>>> print(result)
['www.cars.com/BMW/green', 'www.cars.com/BMW/blue', 'www.cars.com/VW/black', 'www.cars.com/VW/red', 'www.cars.com/VW/yellow']
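
One caveat: a in s is a substring test, so a short key can match URLs it does not belong to. The sketch below compares it with a stricter check against whole path segments. The ('W', 'white') entry is a hypothetical addition to the question's data, included only to show the difference.

```python
string_list = ['www.cars.com/BMW/', 'www.cars.com/VW/']
# ('W', 'white') is a hypothetical extra entry, not from the question.
tuple_list = [('BMW', 'green'), ('VW', 'black'), ('W', 'white')]

# Substring test: 'W' occurs inside both 'BMW' and 'VW', so it matches twice.
loose = [s + b for s in string_list for a, b in tuple_list if a in s]

# Segment test: the key must equal a whole '/'-separated part of the URL.
strict = [s + b for s in string_list for a, b in tuple_list
          if a in s.split('/')]

print(loose)
print(strict)
```

With the question's original data both variants give the same result, so the stricter check only matters when keys can overlap.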

Upvotes: 4
