Reputation: 123
Probably really straightforward but I'm having no luck with Google. I have a 2 column dataframe of tuples, and I'm looking to unpack each tuple then pair up the contents from the same position in each column. For example:
Col1 Col2
(a,b,c) (d,e,f)
My desired output is:
a d
b e
c f
I have a solution using loops but I would like to know a better way to do it - firstly because I am trying to eradicate loops from my life and secondly because it's potentially not as flexible as I may need it to be.
import pandas as pd
l1=[('a','b'),('c','d'),('e','f','g'),('h','i')]
l2=[('j','k'),('l','m'),('n','o','p'),('q','r')]
df = pd.DataFrame(list(zip(l1,l2)),columns=['Col1','Col2'])
df
Out[547]:
Col1 Col2
0 (a, b) (j, k)
1 (c, d) (l, m)
2 (e, f, g) (n, o, p)
3 (h, i) (q, r)
# Walk each row, then each position in the Col2 tuple, printing the paired items
for i in range(len(df)):
    for j in range(len(df.iloc[i, 1])):
        print(df.iloc[i, 0][j], df.iloc[i, 1][j])
a j
b k
c l
d m
e n
f o
g p
h q
i r
All pythonic suggestions and guidance hugely appreciated. Many thanks.
Addition: an example including a row with differing length tuples, per Ch3steR's request below - my loop would not work in this instance ('d2' would not be included, where I would want it to be outputted paired with a null).
l1=[('a','b'),('c','d','d2'),('e','f','g'),('h','i')]
l2=[('j','k'),('l','m'),('n','o','p'),('q','r')]
df = pd.DataFrame(list(zip(l1,l2)),columns=['Col1','Col2'])
Upvotes: 1
Views: 1294
Reputation: 59529
Send each Series to a list with tolist, reconstruct a DataFrame from it, and stack. Then concat the results back together. This leaves you with a MultiIndex whose first level is the original DataFrame index and whose second level is the position within the tuple.
This works for older versions of pandas (pd.__version__ < '1.3.0') and for cases where the tuples have an unequal number of elements (where explode will fail).
import pandas as pd
df1 = pd.concat([pd.DataFrame(df[col].tolist()).stack().rename(col)
                 for col in df.columns], axis=1)
    Col1 Col2
0 0    a    j
  1    b    k
1 0    c    l
  1    d    m
2 0    e    n
  1    f    o
  2    g    p
3 0    h    q
  1    i    r
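For the unequal-length case from the question's addition, here is a minimal sketch of the same tolist / stack / concat idea. The helper name unpack and the explicit dropna() are my own additions for illustration, not part of the answer above; dropna() is there on the assumption that some pandas versions keep missing values when stacking.

```python
import pandas as pd

l1 = [('a', 'b'), ('c', 'd', 'd2'), ('e', 'f', 'g'), ('h', 'i')]
l2 = [('j', 'k'), ('l', 'm'), ('n', 'o', 'p'), ('q', 'r')]
df = pd.DataFrame(list(zip(l1, l2)), columns=['Col1', 'Col2'])

def unpack(series):
    # Hypothetical helper: one row per tuple, one column per tuple position.
    # Shorter tuples are padded with missing values by the DataFrame constructor.
    wide = pd.DataFrame(series.tolist(), index=series.index)
    # stack() moves the positions into a second index level;
    # dropna() removes the padding regardless of how stack() treats it.
    return wide.stack().dropna().rename(series.name)

# concat aligns on the (row, position) MultiIndex, so an item with no
# partner in the other column ends up paired with a missing value.
paired = pd.concat([unpack(df[col]) for col in df.columns], axis=1).sort_index()
print(paired)
```

With this data, 'd2' sits at index (1, 2) in Col1 and has no counterpart in Col2, so it should come out paired with a null, which is what the question asks for.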
Upvotes: 4
Reputation: 558
If the tuple lengths always match and you don't have a newer version of pandas (1.3.0 or later) that lets you pass a list of columns to explode, do something like this:
import pandas as pd
pd.concat([df.Col1.explode(), df.Col2.explode()], axis=1).reset_index(drop=True)
Col1 Col2
0 a j
1 b k
2 c l
3 d m
4 e n
5 f o
6 g p
7 h q
8 i r
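For comparison, if you do have pandas 1.3.0 or later, the list-of-columns form of explode mentioned above would look roughly like this. This is a sketch that assumes the tuples in each row have matching lengths; with mismatched lengths, explode raises a ValueError.

```python
import pandas as pd

l1 = [('a', 'b'), ('c', 'd'), ('e', 'f', 'g'), ('h', 'i')]
l2 = [('j', 'k'), ('l', 'm'), ('n', 'o', 'p'), ('q', 'r')]
df = pd.DataFrame(list(zip(l1, l2)), columns=['Col1', 'Col2'])

# Explode both columns in lockstep (pandas >= 1.3.0);
# ignore_index=True gives a fresh 0..n-1 index, like reset_index(drop=True).
exploded = df.explode(['Col1', 'Col2'], ignore_index=True)
print(exploded)
```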
Upvotes: 3