Reputation: 529
I have a DataFrame similar to the following:
   A         B       C    D        E       F
0  1  (10, 11)  (a, b)  abc       ()      ()
1  2  (10, 11)  (a, b)  def  (2, 19)  (j, k)
2  3        ()      ()  abc    (73,)    (u,)
where some columns contain tuples. How could I create a new row for each item in the tuples such that the result looks something like this?
   A    D   B  C   E  F
0  1  abc  10  a
1          11  b
2  2  def  10  a   2  j
3          11  b  19  k
4  3  abc         73  u
I know that columns B & C will always have the same number of elements, as will columns E and F.
Upvotes: 1
Views: 74
Reputation: 28253
Use zip_longest from itertools. Each scalar value is wrapped in a one-element list so that it can be zipped with the tuples; zip_longest then pads the shorter inputs with None.
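As a quick illustration of the padding behaviour, here is a minimal sketch on a single row. The variable names a, b, c, d are only placeholders for the values in the first row of the question's frame (columns A to D).

```python
from itertools import zip_longest

# One row of the question's frame: scalars A and D, tuples B and C.
a, b, c, d = 1, (10, 11), ("a", "b"), "abc"

# Wrapping the scalars in one-element lists lets zip_longest pad them with None
# once the lists are exhausted, while the tuples keep supplying values.
rows = list(zip_longest([a], b, c, [d]))
print(rows)  # [(1, 10, 'a', 'abc'), (None, 11, 'b', None)]
```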
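from itertools import zip_longest
import pandas as pd

# Build one small DataFrame per input row; zip_longest pads short columns with None.
expanded = df.apply(
    lambda x: pd.DataFrame.from_records(
        zip_longest([x.A], x.B, x.C, [x.D], x.E, x.F),
        columns=list('ABCDEF')),
    axis=1
).values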
This creates an array of data frames, which then should be concatenated to get the desired result. Finally, the index should be reset to match the expected output.
df_expanded = pd.concat(expanded).reset_index(drop=True)
# df_expanded outputs:
     A     B     C     D     E     F
0  1.0    10     a   abc  None  None
1  NaN    11     b  None  None  None
2  2.0    10     a   def     2     j
3  NaN    11     b  None    19     k
4  3.0  None  None   abc    73     u
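If you want something closer to the layout in the question (blank cells and the A, D, B, C, E, F column order), the sketch below is one possible variant, not the code above. It assumes the sample frame can be rebuilt as shown, collects the zipped rows into a single list instead of one DataFrame per row, and uses dtype=object so the integers in A are not turned into floats. The names records and df_blank are made up for this example.

```python
from itertools import zip_longest

import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [(10, 11), (10, 11), ()],
    "C": [("a", "b"), ("a", "b"), ()],
    "D": ["abc", "def", "abc"],
    "E": [(), (2, 19), (73,)],
    "F": [(), ("j", "k"), ("u",)],
})

# Expand every source row into one output row per tuple position.
records = [
    out_row
    for r in df.itertuples(index=False)
    for out_row in zip_longest([r.A], r.B, r.C, [r.D], r.E, r.F)
]

# dtype=object keeps the integers from being upcast to float next to None.
df_expanded = pd.DataFrame(records, columns=list("ABCDEF"), dtype=object)

# Blank out the padding and reorder the columns to match the question's layout.
df_blank = df_expanded.fillna("")[list("ADBCEF")]
print(df_blank)
```

Note that after fillna("") the columns hold a mix of numbers and empty strings, so this form is mainly useful for display rather than further numeric work.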
Upvotes: 3