Samarth Ds

Reputation: 23

Python: create combinations of two columns containing lists as their values in a pandas DataFrame

I have a DataFrame whose columns contain lists, and I am trying to generate every combination of the elements of the two lists in each row. The key requirement is that each combination stays mapped to the ID of the row it came from:

import pandas as pd
df = pd.DataFrame([[1,['a','b','c'],['l','m']],[2,['d','e','f'],['n','o']]],columns = ['id','col1','col2'])

The result should be:

   id col1  col2
----------------
0   1   a   l
1   1   a   m
2   1   b   l
3   1   b   m
4   1   c   l
5   1   c   m
6   2   d   n
7   2   d   o
8   2   e   n
9   2   e   o
10  2   f   n
11  2   f   o

I am new to Python and have tried exploring the itertools library and its product function, but I couldn't work out how to get output in exactly this format.

Upvotes: 2

Views: 98

Answers (1)

Henry Yik

Reputation: 22503

Use itertools.product with a list comprehension to construct the combinations for each row, keeping the id alongside each pair:

from itertools import product

# For each row, pair the id with every (col1, col2) combination
print(pd.DataFrame([(a, *x) for a, b, c in df.to_numpy()
                    for x in product(b, c)],
                   columns=df.columns))

    id col1 col2
0    1    a    l
1    1    a    m
2    1    b    l
3    1    b    m
4    1    c    l
5    1    c    m
6    2    d    n
7    2    d    o
8    2    e    n
9    2    e    o
10   2    f    n
11   2    f    o
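
To see why this works, it may help to look at a single row in isolation. The following is a minimal sketch using the first row of the question's data; the variable names (row_id, pairs, rows) are only illustrative:

```python
from itertools import product

# One row from the question's frame: an id plus two lists.
row_id, col1, col2 = 1, ['a', 'b', 'c'], ['l', 'm']

# product(col1, col2) yields every (col1, col2) pair,
# with the rightmost list varying fastest.
pairs = list(product(col1, col2))

# Prepending the id to each pair gives the rows of the final frame.
rows = [(row_id, *pair) for pair in pairs]
print(rows)
```

The list comprehension in the answer repeats this step for every row and collects all the resulting tuples into one list.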

Alternatively, you can use unpacking if you do not want to name each column explicitly (a, b, c, d, and so on):

from itertools import product, chain

pd.DataFrame(chain.from_iterable(product([a], *rest) 
                                 for a, *rest in df.to_numpy()),
             columns=df.columns
             )
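
As a rough illustration of how the unpacking version generalizes, here is a sketch with three list columns instead of two. The helper name expand_rows and the sample data are hypothetical, and the sketch assumes the first field of every row is a scalar key while all remaining fields are lists:

```python
from itertools import chain, product

def expand_rows(rows):
    """Yield one tuple per combination of the list-valued fields in each row.

    Hypothetical helper: assumes the first field is a scalar key and
    every remaining field is a list.
    """
    return chain.from_iterable(
        product([key], *lists) for key, *lists in rows
    )

# Hypothetical data with three list columns.
rows = [
    [1, ['a', 'b'], ['l', 'm'], ['x']],
    [2, ['d'], ['n'], ['y', 'z']],
]

expanded = list(expand_rows(rows))
print(expanded)
```

With a DataFrame shaped this way, passing df.to_numpy() as rows and wrapping the result in pd.DataFrame(..., columns=df.columns) should give the same kind of output as above, without any change when more list columns are added.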

Upvotes: 2
