How to split tuple of tuples into columns

Question

I have a pandas dataframe where one column holds a tuple that contains a nested tuple. The nested tuple holds two existing ids. I want to split every element of the whole tuple, including the nested ones, into new columns appended to the dataframe. Here's my df so far:

df
  id1  id2  tuple_of_tuple
0 a    e    ('cat',100,('a','f'))
1 b    f    ('dog',100,('b','g'))
2 c    g    ('cow',100,('d','h'))
3 d    h    ('tree',100,('c','e'))

I tried the code below on a small subset of the data, and it seemed to work: new columns were appended, with each extracted element where it needed to be.

df[['Link_1', 'Link_2','Link_3','Link_4']] = df['tuple_of_tuple'].apply(pd.Series)

But when I apply it to the entire dataset, I get the error "ValueError: Columns must be same length as key". (I should mention that there are a few NaNs scattered around: for some rows, the entire tuple_of_tuple entry is just NaN.) How can I fix this?

cs95 · Accepted Answer

Here's an elegant way to do it using Python's * unpacking operator (the [*i, *j] form needs Python 3.5 or later):

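import pandas as pd

# pop() removes the column from df and returns it as a Series.
# For each entry, *i collects the leading items and j is the nested tuple.
df2 = pd.DataFrame(
    data=[[*i, *j] for *i, j in df.pop('tuple_of_tuple')],
    columns=['link_1', 'link_2', 'link_3', 'link_4']
)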
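
To see what the comprehension does for each row, here is a small sketch on a single tuple taken from the question's data. The names row, head, tail and flat are illustrative only and are not part of the answer's code.

```python
row = ('cat', 100, ('a', 'f'))

# Extended unpacking: the starred name collects every item except the last
# into a list, and the last item (the nested tuple) goes to `tail`.
*head, tail = row

# Unpacking both inside a list display flattens them into one flat row.
flat = [*head, *tail]
print(flat)  # ['cat', 100, 'a', 'f']
```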

You can then join df2 to df column-wise using pd.concat:

pd.concat([df, df2], axis=1)

  id1 id2 link_1  link_2 link_3 link_4
0   a   e    cat     100      a      f
1   b   f    dog     100      b      g
2   c   g    cow     100      d      h
3   d   h   tree     100      c      e
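
The question mentions rows where the whole tuple_of_tuple entry is NaN. The comprehension above assumes every entry is a tuple, so unpacking a NaN (a float) would raise a TypeError. One possible way to handle this, as a minimal sketch and not part of the original answer, is to flatten through a small helper that returns a row of NaN for anything that is not a tuple. The helper name flatten and its width parameter are hypothetical, and the sample frame below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample in the shape of the question's data, with one NaN entry.
df = pd.DataFrame({
    'id1': ['a', 'b', 'c'],
    'id2': ['e', 'f', 'g'],
    'tuple_of_tuple': [('cat', 100, ('a', 'f')), np.nan, ('cow', 100, ('d', 'h'))],
})

def flatten(value, width=4):
    """Hypothetical helper: flatten (x, y, (a, b)); otherwise return NaN padding."""
    if isinstance(value, tuple):
        *head, tail = value
        return [*head, *tail]
    # Not a tuple (e.g. NaN): keep the row, but fill every new column with NaN.
    return [np.nan] * width

cols = ['link_1', 'link_2', 'link_3', 'link_4']
df2 = pd.DataFrame(
    [flatten(v) for v in df.pop('tuple_of_tuple')],
    columns=cols,
    index=df.index,  # reuse df's index so concat lines the rows up
)
result = pd.concat([df, df2], axis=1)
print(result)
```

Passing index=df.index is a precaution: pd.concat with axis=1 aligns rows on the index, so if df had been filtered or re-indexed, a df2 built with a default index might not line up with it.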
