Reputation: 148
Example DataFrame:
>>> idx = pd.MultiIndex.from_arrays([['foo', 'foo', 'bar', 'bar'], ['one', 'two', 'one', 'two']])
>>> df = pd.DataFrame({'Col1': [('a', 'b'), 'c', 'd', 'e'], 'Col2': [('A', 'B'), 'C', 'D', 'E']}, index=idx)
>>> print(df)
           Col1    Col2
foo one  (a, b)  (A, B)
    two       c       C
bar one       d       D
    two       e       E
I want to transform the DataFrame by unpacking the row of tuples while keeping everything under its original index, resulting in something like this:
          Col1 Col2
foo one 0    a    A
        1    b    B
    two 0    c    C
bar one 0    d    D
    two 0    e    E
I can unpack the tuples just fine, but I'm having trouble figuring out how to re-insert the new rows into the DataFrame. This is an example of something I've already tried:
>>> unpacked = pd.DataFrame(df.loc['foo', 'one'].tolist(), index=df.columns).T
>>> print(unpacked)
  Col1 Col2
0    a    A
1    b    B
>>> df.loc['foo', 'one'] = unpacked
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\indexing.py", line 190, in __setitem__
    self._setitem_with_indexer(indexer, value)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\indexing.py", line 645, in _setitem_with_indexer
    value = self._align_frame(indexer, value)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\indexing.py", line 860, in _align_frame
    raise ValueError('Incompatible indexer with DataFrame')
ValueError: Incompatible indexer with DataFrame
It's obvious why this fails, but I'm not sure where to go from here. Is there a way to create a new MultiIndex level during this process that can handle an arbitrary number of unpacked rows?
Upvotes: 1
Views: 211
Reputation: 862581
Use Series.explode in a list comprehension with concat, and then add the new level with GroupBy.cumcount:
df = pd.concat([df[x].explode() for x in df.columns], axis=1)
df = df.set_index(df.groupby(df.index).cumcount(), append=True)
print(df)
          Col1 Col2
foo one 0    a    A
        1    b    B
    two 0    c    C
bar one 0    d    D
    two 0    e    E
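For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. It rebuilds the example frame from the question and keeps each step in its own variable. The names `exploded`, `position` and `result` are only illustrative, and grouping with `level=[0, 1]` is a swapped-in alternative to `groupby(df.index)` that assumes the frame has exactly two index levels, as in the example.

```python
import pandas as pd

# Rebuild the example frame from the question.
idx = pd.MultiIndex.from_arrays(
    [['foo', 'foo', 'bar', 'bar'], ['one', 'two', 'one', 'two']]
)
df = pd.DataFrame(
    {'Col1': [('a', 'b'), 'c', 'd', 'e'], 'Col2': [('A', 'B'), 'C', 'D', 'E']},
    index=idx,
)

# Explode each column on its own: tuples become one row per element,
# plain strings pass through unchanged, and every new row keeps the
# index labels of the row it came from.
exploded = pd.concat([df[col].explode() for col in df.columns], axis=1)

# Count rows within each original (level 0, level 1) pair: 0, 1, 2, ...
# Assumption: two index levels, so level=[0, 1] covers the whole index.
position = exploded.groupby(level=[0, 1]).cumcount()

# Append the counter as a new innermost index level.
result = exploded.set_index(position, append=True)
print(result)
```

Because the counter is computed per group, this should cope with any number of unpacked rows per original row. It does assume that, within a row, the tuples in the different columns have the same length, since otherwise the exploded columns would not share an identical index for `concat` to line up. On pandas 1.3 or later, `df.explode(list(df.columns))` may be able to replace the `concat` step under that same assumption.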
Upvotes: 1