Reputation: 23
I have a DataFrame whose columns contain lists, and I am trying to find every combination (the Cartesian product) of the two lists in each row. The key requirement is that each ID stays mapped to its own combinations:
import pandas as pd
df = pd.DataFrame([[1, ['a','b','c'], ['l','m']], [2, ['d','e','f'], ['n','o']]], columns=['id','col1','col2'])
The result should be:
id col1 col2
----------------
0 1 a l
1 1 a m
2 1 b l
3 1 b m
4 1 c l
5 1 c m
6 2 d n
7 2 d o
8 2 e n
9 2 e o
10 2 f n
11 2 f o
I am new to Python and have tried exploring the itertools library and its product function, but I couldn't work out how to get exactly this output format.
Upvotes: 2
Views: 98
Reputation: 22503
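Use itertools.product with a list comprehension to construct the combinations:
from itertools import product

# For each row, pair the id with every (col1, col2) combination.
print(pd.DataFrame([(a, *x) for a, b, c in df.to_numpy()
                    for x in product(b, c)],
                   columns=df.columns))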
id col1 col2
0 1 a l
1 1 a m
2 1 b l
3 1 b m
4 1 c l
5 1 c m
6 2 d n
7 2 d o
8 2 e n
9 2 e o
10 2 f n
11 2 f o
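To make the mechanism concrete, here is a minimal self-contained sketch that rebuilds the question's DataFrame, shows what product yields for a single row, and then repeats the list comprehension above. The names first_row_pairs, rows, and result are introduced here for illustration only.

```python
from itertools import product

import pandas as pd

df = pd.DataFrame(
    [[1, ['a', 'b', 'c'], ['l', 'm']], [2, ['d', 'e', 'f'], ['n', 'o']]],
    columns=['id', 'col1', 'col2'],
)

# product(b, c) pairs every element of b with every element of c,
# with the second list varying fastest.
first_row_pairs = list(product(df.loc[0, 'col1'], df.loc[0, 'col2']))
print(first_row_pairs)
# [('a', 'l'), ('a', 'm'), ('b', 'l'), ('b', 'm'), ('c', 'l'), ('c', 'm')]

# Prepending the row's id to each pair keeps the id attached to its own combinations.
rows = [(a, *x) for a, b, c in df.to_numpy() for x in product(b, c)]
result = pd.DataFrame(rows, columns=df.columns)
print(result)
```

Because the id is taken from the same row as the two lists, combinations from different rows are never mixed.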
Alternatively, you can use unpacking if you do not want to explicitly name a variable (say, a, b, c, d) for each column:
from itertools import product, chain

pd.DataFrame(chain.from_iterable(product([a], *rest)
                                 for a, *rest in df.to_numpy()),
             columns=df.columns)
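As a rough illustration of why the unpacking form can be convenient, the sketch below applies the same expression to a DataFrame with three list columns without changing the code. The frame df3, its col3 column, and the name expanded are hypothetical and not part of the question; the only assumption is that the first column is the id and every other column holds a list.

```python
from itertools import chain, product

import pandas as pd

# df3 and col3 are hypothetical, added only to show the pattern generalizes.
df3 = pd.DataFrame(
    [[1, ['a', 'b'], ['l', 'm'], ['x', 'y']],
     [2, ['d'], ['n', 'o'], ['z']]],
    columns=['id', 'col1', 'col2', 'col3'],
)

# "a, *rest" captures the id in a and all remaining list columns in rest,
# so product([a], *rest) works for any number of list columns.
# list() materializes the chained iterator before handing it to pandas.
expanded = pd.DataFrame(
    list(chain.from_iterable(product([a], *rest) for a, *rest in df3.to_numpy())),
    columns=df3.columns,
)
print(expanded)
```

The first row contributes 2 x 2 x 2 = 8 combinations and the second 1 x 2 x 1 = 2, so expanded has 10 rows.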
Upvotes: 2