How to pivot a dataframe so each column becomes a list in a cell?

Question

I'm trying to pivot this dataframe:

pd.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['a', 'b'])

to this one:

pd.DataFrame([['a', [1, 2, 3]], ['b', [4, 5, 6]]], columns=['key', 'list'])

Ignoring the column renaming, is there a way to do it without iterating over the rows and converting them to a list and then a new column?

jezrael · Accepted Answer

Don't do this. Pandas was never designed to hold lists in series / columns. You can concoct expensive workarounds, but these are not recommended.

The main reason holding lists in series is not recommended is that you lose the vectorised functionality which comes with NumPy arrays held in contiguous memory blocks. Your series will be of object dtype, which represents a sequence of pointers, much like a list. You will lose benefits in terms of memory and performance, as well as access to optimized Pandas methods.

See also "What are the advantages of NumPy over regular Python lists?" The arguments in favour of Pandas are the same as for NumPy.
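To make the dtype point concrete, here is a minimal sketch. The two series are hypothetical data chosen only for illustration; they are not part of the question.

```python
import pandas as pd

# Hypothetical data, chosen only to contrast the two dtypes.
numeric = pd.Series([1, 2, 3])
nested = pd.Series([[1, 2, 3], [4, 5, 6]])

print(numeric.dtype)  # an integer dtype backed by a NumPy array
print(nested.dtype)   # object: each element is a pointer to a Python list

# The numeric series supports vectorised reductions directly.
total = numeric.sum()

# The list-holding series needs a Python-level call per element instead.
per_row_totals = nested.apply(lambda values: sum(values))
```

The second series still works, but every operation on its contents goes through Python one element at a time, which is where the memory and performance benefits are lost.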

But if you really need it:

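import pandas as pd

df = pd.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['a', 'b'])

# One row per original column; each cell holds that column's values as a Python list.
df1 = pd.DataFrame({'key': df.columns, 'list': [df[x].tolist() for x in df.columns]})
print(df1)
  key       list
0   a  [1, 2, 3]
1   b  [4, 5, 6]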
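If you would rather avoid the explicit comprehension, one possible alternative is to go through `DataFrame.to_dict`. This is only a sketch of a different route to the same frame, not a faster or recommended one; the names `as_lists` and `restored` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['a', 'b'])

# orient='list' maps each column name to a list of its values.
as_lists = df.to_dict(orient='list')

# Wrap the mapping in a Series (column names become the index),
# then turn that index into a regular 'key' column.
df1 = pd.Series(as_lists).rename_axis('key').reset_index(name='list')

# Going back to the wide layout is a plain DataFrame construction.
restored = pd.DataFrame(dict(zip(df1['key'], df1['list'])))
```

The last line shows that the wide layout can be rebuilt from the list cells, so nothing is gained by storing the data in the list form unless a downstream consumer specifically requires it.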
