ypriverol

Reputation: 605

Pandas transformation really slow

I have been trying to convert a pandas DataFrame from one representation to another, but it is really slow.

My current dataframe looks like this:

Column0, Column 1, Column 2, Column 3, Column 10... 
   p1        1           2         8          3
   p2        2           4         9          6

I want to reshape it into this:

Column0,   sample,       Exp
   p1       Column 1      1 
   p1       Column 2      2
   p1       Column 3      8
   p1       Column 10     3
   p2       Column 1      2 
   .           .          .
   .           .          . 

I use iterrows and insert each value into a new dataframe, but it is really slow. The set of columns (Column 1, Column 2, ...) is not fixed, but I have a collection with all the names.

Upvotes: 0

Views: 574

Answers (1)

Umar.H

Reputation: 23099

IIUC, you can use melt.

Looping is generally discouraged in pandas unless there is no other option or you have a justifiable use case.

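import pandas as pd
from io import StringIO

d = """Column0, Column 1, Column 2, Column 3, Column 10
   p1,        1,           2,         8,          3
   p2,        2,           4,         9,          6"""

# skipinitialspace=True drops the alignment spaces at the start of each
# field, so the names come out as 'Column 1' rather than ' Column 1'
df = pd.read_csv(StringIO(d), sep=',', skipinitialspace=True)

# unpivot every column except Column0 into (Sample, Exp) rows
df2 = pd.melt(df, id_vars=['Column0'], var_name='Sample', value_name='Exp')

print(df2)

  Column0     Sample  Exp
0      p1   Column 1    1
1      p2   Column 1    2
2      p1   Column 2    2
3      p2   Column 2    4
4      p1   Column 3    8
5      p2   Column 3    9
6      p1  Column 10    3
7      p2  Column 10    6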
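
The question mentions that the sample columns are not fixed but are available as a collection. A minimal sketch, assuming that collection is a plain list (called sample_cols here, a hypothetical name) and that the frame may hold other columns you do not want unpivoted (the "Other" column below is made up for illustration): pass the list through value_vars so only those columns are melted.

```python
import pandas as pd

# Small frame mirroring the question's layout. "Other" is a made-up
# extra column that should NOT be unpivoted.
df = pd.DataFrame({
    "Column0": ["p1", "p2"],
    "Column 1": [1, 2],
    "Column 2": [2, 4],
    "Column 3": [8, 9],
    "Column 10": [3, 6],
    "Other": ["x", "y"],
})

# Hypothetical collection holding the names of the sample columns.
sample_cols = ["Column 1", "Column 2", "Column 3", "Column 10"]

long_df = pd.melt(
    df,
    id_vars=["Column0"],
    value_vars=sample_cols,  # only these columns become rows
    var_name="sample",
    value_name="Exp",
)
print(long_df)
```

Columns that appear in neither id_vars nor value_vars are simply left out of the result.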

Chaining operations, e.g. renaming columns in the same expression:

pd.melt(df,id_vars=['Column0'],var_name='Sample',value_name='Exp').rename(
             columns = {'Column0' : 'NewCol',...})
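
The same chaining style can also fix the row order. melt returns all rows for the first melted column, then the second, and so on, whereas the question lists every row for p1 before p2. If that order matters, one possible chain (a sketch on illustrative data, not the only way to do it) adds a stable sort on Column0:

```python
import pandas as pd

# Illustrative data matching the question's example.
df = pd.DataFrame({
    "Column0": ["p1", "p2"],
    "Column 1": [1, 2],
    "Column 2": [2, 4],
    "Column 3": [8, 9],
    "Column 10": [3, 6],
})

ordered = (
    pd.melt(df, id_vars=["Column0"], var_name="sample", value_name="Exp")
    # mergesort is a stable sort, so within each Column0 group the rows
    # keep the original left-to-right column order.
    .sort_values("Column0", kind="mergesort")
    .reset_index(drop=True)
)
print(ordered)
```

Sorting on the sample column as well would order the names as strings (Column 10 before Column 2), which is why the sketch relies on a stable sort instead.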

Upvotes: 4
