bwrabbit

Reputation: 529

Converting DataFrame columns that contain tuples into rows

I have a DataFrame similar to the following:

   A         B       C    D          E        F
0  1  (10, 11)  (a, b)  abc         ()       ()
1  2  (10, 11)  (a, b)  def    (2, 19)   (j, k)
2  3        ()      ()  abc     (73,)      (u,)

where some columns contain tuples. How could I create a new row for each item in the tuples such that the result looks something like this?

   A         D      B       C       E       F
0  1        abc     10      a       
1                   11      b
2  2        def     10      a       2       j
3                   11      b       19      k
4  3        abc                     73      u

I know that columns B & C will always have the same number of elements, as will columns E and F.

Upvotes: 1

Views: 74

Answers (1)

Haleemur Ali

Reputation: 28253

You can do this with zip_longest from itertools. Each single value is wrapped in a list so that it can be zipped with the tuples; zip_longest then pads the shorter inputs with None.

from itertools import zip_longest

import pandas as pd

expanded = df.apply(
    lambda x: pd.DataFrame.from_records(zip_longest([x.A], x.B, x.C, [x.D], x.E, x.F),
                                        columns=list('ABCDEF')),
    axis=1
).values

This creates an array of data frames, which then should be concatenated to get the desired result. Finally, the index should be reset to match the expected output.

df_expanded = pd.concat(expanded).reset_index(drop=True)
# df_expanded outputs:
     A     B     C     D     E     F
0  1.0    10     a   abc  None  None
1  NaN    11     b  None  None  None
2  2.0    10     a   def     2     j
3  NaN    11     b  None    19     k
4  3.0  None  None   abc    73     u
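
For reference, here is a self-contained sketch of the same idea that can be pasted and run as-is. It rebuilds the sample frame from the question and, as a variation on the code above rather than the exact same approach, collects all padded records in one list and builds a single DataFrame instead of concatenating one small frame per row. The helper name expand_row is only illustrative.

```python
from itertools import zip_longest

import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [(10, 11), (10, 11), ()],
    "C": [("a", "b"), ("a", "b"), ()],
    "D": ["abc", "def", "abc"],
    "E": [(), (2, 19), (73,)],
    "F": [(), ("j", "k"), ("u",)],
})


def expand_row(row):
    # Hypothetical helper: scalars are wrapped in one-element lists so that
    # zip_longest pads them with None next to the longer tuples.
    return zip_longest([row.A], row.B, row.C, [row.D], row.E, row.F)


# Flatten the padded records of every input row into one list
records = [rec for row in df.itertuples(index=False) for rec in expand_row(row)]

df_expanded = pd.DataFrame.from_records(records, columns=list("ABCDEF"))
print(df_expanded)
```

If the blank cells from the question are preferred over None/NaN, something like df_expanded.fillna('') should get close, at the cost of turning the numeric columns into object dtype.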

Upvotes: 3
