Reputation: 914
I have a pivoted Pandas DataFrame with the following columns:
month | day | hour | a | b | c | d | e | f | g ... z
1 1 1 3 9 0 9 0 3 3 9
What is the most efficient way to collapse the values in columns a through z of each row into a single list, and replace those columns with the new list column, in place? The resulting columns would be:
month | day | hour | list
1 1 1 [3,9,0,9,0,3,3,9 ...]
I could iterate the rows and manually combine a through z into many lists, and then delete the unnecessary columns afterward, but there may be a more straightforward way.
Upvotes: 1
Views: 556
Reputation: 394279
Actually this is very simple: the .values attribute returns a NumPy array of the df values, and that array has a tolist() method whose result you can assign directly to your new column:
In [258]:
import pandas as pd
import io
t="""month day hour a b c d e f g z
1 1 1 3 9 0 9 0 3 3 9"""
df = pd.read_csv(io.StringIO(t), sep=r'\s+')
df = pd.concat([df]*2, ignore_index=True)
df
Out[258]:
month day hour a b c d e f g z
0 1 1 1 3 9 0 9 0 3 3 9
1 1 1 1 3 9 0 9 0 3 3 9
In [264]:
df['list'] = df[df.columns[3:]].values.tolist()
df
Out[264]:
month day hour a b c d e f g z list
0 1 1 1 3 9 0 9 0 3 3 9 [3, 9, 0, 9, 0, 3, 3, 9]
1 1 1 1 3 9 0 9 0 3 3 9 [3, 9, 0, 9, 0, 3, 3, 9]
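Output from .values (evaluated after the list column was added, so df.columns[3:] now includes it and the array has dtype=object):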
In [265]:
df[df.columns[3:]].values
Out[265]:
array([[3, 9, 0, 9, 0, 3, 3, 9, [3, 9, 0, 9, 0, 3, 3, 9]],
[3, 9, 0, 9, 0, 3, 3, 9, [3, 9, 0, 9, 0, 3, 3, 9]]], dtype=object)
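The session above keeps a through z next to the new column. To end up with only month | day | hour | list, as the question asks, the source columns still need to be dropped. A minimal sketch, assuming the columns to collapse are everything after the first three (the name value_cols is introduced here for illustration, and to_numpy() is used as the newer spelling of .values):

```python
import io

import pandas as pd

t = """month day hour a b c d e f g z
1 1 1 3 9 0 9 0 3 3 9"""
df = pd.read_csv(io.StringIO(t), sep=r"\s+")
df = pd.concat([df] * 2, ignore_index=True)

# Assumption: every column after month/day/hour belongs in the list.
# Capture the labels before adding 'list' so the new column is not included.
value_cols = df.columns[3:]

# One Python list per row, built from the underlying NumPy array.
df["list"] = df[value_cols].to_numpy().tolist()

# Remove the original value columns, leaving month, day, hour, list.
df = df.drop(columns=value_cols)
print(df)
```

Capturing value_cols before the assignment also avoids the dtype=object array shown above, since the slice no longer picks up the list column. df.drop(columns=value_cols, inplace=True) would work as well if rebinding df is undesirable.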
Upvotes: 1