Python: create combinations of two columns containing lists as their value in a dataframe

Question

I have a dataframe with lists in its columns, and I am trying to figure out the most efficient way to find the combinations of the two lists in each row:

import pandas as pd

df = pd.DataFrame([[['a','b','c'],['l','m']],[['d','e','f'],['n','o']]],columns = ['col1','col2'])

The expected output in this case would be:

     col1   col2
0   [a, l]  [a, m]
1   [b, l]  [b, m]
2   [c, l]  [c, m]
3   [d, n]  [d, o]
4   [e, n]  [e, o]
5   [f, n]  [f, o]

I tried iterating through each row and applying itertools.combinations, but it crashes my system when the dataframe has a large number of rows. Can you suggest a more efficient way to do this? Thanks in advance.

Henry Yik · Accepted Answer

You can also use itertools.product with numpy.reshape:

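from itertools import product

import numpy as np
import pandas as pd

# Pair every col1 item with every col2 item per row, then regroup so each
# output row holds the pairs sharing a col1 item (assumes len(col2) == 2).
print(pd.DataFrame(np.reshape([list(product(a,b))
                               for a,b in df.to_numpy()],
                              (-1,2,2)).tolist()))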

        0       1
0  [a, l]  [a, m]
1  [b, l]  [b, m]
2  [c, l]  [c, m]
3  [d, n]  [d, o]
4  [e, n]  [e, o]
5  [f, n]  [f, o]
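
The (-1,2,2) shape above hardcodes two items per col2 list, and the reshape needs every row to yield the same number of pairs. If the col1 lists differ in length between rows, a plain-Python variant may be easier to adapt. The following is only a minimal sketch under that assumption, not a benchmarked replacement: pair_rows is a hypothetical helper name, and the sample frame with a shorter second row is made up for illustration.

```python
from itertools import product

import pandas as pd


def pair_rows(frame, left="col1", right="col2"):
    """Build one output row per `left` item, pairing it with every `right` item."""
    rows = []
    for left_items, right_items in zip(frame[left], frame[right]):
        width = len(right_items)
        if width == 0:
            continue  # nothing to pair with in this row
        pairs = [list(p) for p in product(left_items, right_items)]
        # product varies the right-hand item fastest, so each consecutive
        # chunk of `width` pairs shares the same left-hand item.
        rows.extend(pairs[i:i + width] for i in range(0, len(pairs), width))
    return rows


# Illustrative data: the second row has only two items in col1.
df = pd.DataFrame(
    [[["a", "b", "c"], ["l", "m"]], [["d", "e"], ["n", "o"]]],
    columns=["col1", "col2"],
)
rows = pair_rows(df)
result = pd.DataFrame(rows)
print(result)
```

This still loops over the rows in Python, so it is unlikely to be faster than the reshape version; the point is only that it does not depend on a fixed array shape.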
