Reputation: 605
I have been trying to transform a pandas DataFrame from one representation to another, but it is really slow.
My current dataframe looks like this:
Column0, Column 1, Column 2, Column 3, Column 10...
p1 1 2 8 3
p2 2 4 9 6
I would like to convert it to:
Column0, sample, Exp
p1 Column 1 1
p1 Column 2 2
p1 Column 3 8
p1 Column 10 3
p2 Column 1 2
. . .
. . .
I use iterrows, inserting each value into a new dataframe, but it is really slow. The set of columns (Column 1, Column 2, ...) is not fixed, but I have a collection with all the names.
Upvotes: 0
Views: 574
Reputation: 23099
IIUC, you can use melt. Generally, looping is discouraged in pandas unless there is no other option or you have a justifiable use case.
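import pandas as pd
from io import StringIO

d = """Column0, Column 1, Column 2, Column 3, Column 10
p1, 1, 2, 8, 3
p2, 2, 4, 9, 6"""
# skipinitialspace=True drops the blank after each comma, so headers parse as 'Column 1' rather than ' Column 1'
df = pd.read_csv(StringIO(d), sep=',', skipinitialspace=True)
# Column0 stays as the identifier; every other column becomes one (Sample, Exp) row per value
df2 = pd.melt(df, id_vars=['Column0'], var_name='Sample', value_name='Exp')
print(df2)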
Column0 Sample Exp
0 p1 Column 1 1
1 p2 Column 1 2
2 p1 Column 2 2
3 p2 Column 2 4
4 p1 Column 3 8
5 p2 Column 3 9
6 p1 Column 10 3
7 p2 Column 10 6
pd.melt(df,id_vars=['Column0'],var_name='Sample',value_name='Exp').rename(
columns = {'Column0' : 'NewCol',...})
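The question mentions having a collection with all the column names and shows the rows grouped by Column0 (all p1 rows, then all p2 rows), whereas melt returns them grouped by source column. Below is a minimal sketch of one way to cover both points. It assumes the collection is a plain list, here called sample_cols (a hypothetical name), and the sample data is rebuilt by hand to mirror the question.

```python
import pandas as pd

# Hand-built data mirroring the layout in the question (illustrative only)
df = pd.DataFrame({
    "Column0": ["p1", "p2"],
    "Column 1": [1, 2],
    "Column 2": [2, 4],
    "Column 3": [8, 9],
    "Column 10": [3, 6],
})

# Hypothetical stand-in for the asker's collection of column names
sample_cols = ["Column 1", "Column 2", "Column 3", "Column 10"]

long_df = (
    pd.melt(
        df,
        id_vars=["Column0"],
        value_vars=sample_cols,  # only melt the columns named in the collection
        var_name="sample",
        value_name="Exp",
    )
    # mergesort is stable, so the column order is preserved within each Column0 group
    .sort_values("Column0", kind="mergesort")
    .reset_index(drop=True)
)
print(long_df)
```

Passing value_vars explicitly means any extra columns in the frame are left out of the result instead of being melted by accident. The sort is optional and only matters if the row order from the question is required.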
Upvotes: 4