oshi2016

Reputation: 955

How to pivot a dataframe so each column becomes a list in a cell?

I'm trying to pivot this dataframe:

import pandas as pd

df = pd.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['a', 'b'])

to this one:

pd.DataFrame([['a', [1, 2, 3]], ['b', [4, 5, 6]]], columns=['key', 'list'])

Ignoring the column renaming, is there a way to do this without iterating over the rows, converting them to lists, and building a new column?

Upvotes: 0

Views: 81

Answers (1)

jezrael

Reputation: 863166

Don't do this. Pandas was never designed to hold lists in series / columns. You can concoct expensive workarounds, but these are not recommended.

The main reason holding lists in a series is not recommended is that you lose the vectorised functionality that comes with NumPy arrays held in contiguous memory blocks. Your series will have object dtype, which represents a sequence of pointers, much like a list. You lose benefits in memory and performance, as well as access to optimized Pandas methods.
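To make the dtype point concrete, here is a small sketch. The name `nested` is hypothetical and only stands in for a series that holds lists; it is not part of the question.

```python
import pandas as pd

# The DataFrame from the question: plain numeric columns.
df = pd.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['a', 'b'])

# Hypothetical series holding one list per cell.
nested = pd.Series([[1, 2, 3], [4, 5, 6]], index=['a', 'b'])

print(df['a'].dtype)   # an integer dtype backed by a NumPy array
print(nested.dtype)    # object: each cell is a pointer to a Python list

# Numeric columns can use the vectorised reduction directly.
print(df.sum().tolist())

# With lists in cells, Python's sum has to be called once per cell.
print(nested.apply(sum).tolist())
```

Both calls give the same totals here, but the second one runs a Python-level function for every cell, which is the cost described above.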

See also "What are the advantages of NumPy over regular Python lists?". The arguments in favour of Pandas are the same as for NumPy.

But if you really need it:

# df is the DataFrame from the question
df1 = pd.DataFrame({'key': df.columns, 'list': [df[x].tolist() for x in df.columns]})
print(df1)
  key       list
0   a  [1, 2, 3]
1   b  [4, 5, 6]
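If you would rather avoid the explicit comprehension, one possible alternative is to go through `DataFrame.to_dict('list')`. This is only a sketch of another route to the same shape, not a faster or recommended one; the names `out`, `key` and `list` are just the labels used in this thread.

```python
import pandas as pd

df = pd.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['a', 'b'])

# to_dict('list') maps each column name to a plain Python list of its values.
as_lists = df.to_dict('list')

# Wrapping the dict in a Series gives one list per cell, indexed by column name;
# rename_axis names that index, and reset_index turns it into the 'key' column.
out = pd.Series(as_lists).rename_axis('key').reset_index(name='list')

print(out)
```

The result still has an object-dtype `list` column, so the caveats above apply to it in the same way.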

Upvotes: 1
