Reputation: 2080
I have a DataFrame where I need to split the header into multiple rows as headers for the same Dataframe.
The dataframe looks like this,
My data Frame looks like follows,
gene ALL_ID_1 AML_ID_1 AML_ID_2 AML_ID_3 AML_ID_4 AML_ID_5 Stroma_ID_1 Stroma_ID_2 Stroma_ID_3 Stroma_ID_4 Stroma_ID_5 Stroma_CR_Pat_4 Stroma_CR_Pat_5 Stroma_CR_Pat_6 Stroma_CR_Pat_7 Stroma_CR_Pat_8
ENSG 8 1 11 5 10 0 628 542 767 578 462 680 513 968 415 623
ENSG 0 0 1 0 0 0 0 28 1 3 0 1 4 0 0 0
ENSG 661 1418 2580 6817 14727 5968 9 3 5 9 2 9 3 3 5 1
ENSG 20 315 212 8 790 471 1283 2042 1175 2839 1110 857 1880 1526 2262 2624
ENSG 11 26 24 9 11 2 649 532 953 463 468 878 587 245 722 484
And I want the the above header to be spitted as follows,
network ID ID REL
node B_ALL AML Stroma
hemi 1 1 2 3 4 5 1 2 3 4 5 6 7 8 9 10
ENSG 8 1 11 5 10 0 628 542 767 578 462 680 513 968 415 623
ENSG 0 0 1 0 0 0 0 28 1 3 0 1 4 0 0 0
ENSG 661 1418 2580 6817 14727 5968 9 3 5 9 2 9 3 3 5 1
ENSG 20 315 212 8 790 471 1283 2042 1175 2839 1110 857 1880 1526 2262 2624
ENSG 11 26 24 9 11 2 649 532 953 463 468 878 587 245 722 484
Any help is greatly appreciated ..
Upvotes: 0
Views: 1077
Reputation: 54340
Probably not the best minimal example you put here, very few people has the subject knowledge to understand what is network
, node
and hemi
in your context.
You just need to create your MultiIndex
and replace your column index with the one you created:
There are 3 rules in your example:
REL
, otherwise belongs to ID
.node
is the first field of the initial column nameshemi
is the last field of the initial column namesThen, just code away:
In [110]:
df.columns = pd.MultiIndex.from_tuples(zip(np.where(df.columns.str.find('Stroma')!=-1, 'REL', 'ID'),
df.columns.map(lambda x: x.split('_')[0]),
df.columns.map(lambda x: x.split('_')[-1])),
names=['network', 'node', 'hemi'])
print df
network ID REL \
node ALL AML Stroma
hemi 1 1 2 3 4 5 1 2 3 4 5
gene
ENSG 8 1 11 5 10 0 628 542 767 578 462
ENSG 0 0 1 0 0 0 0 28 1 3 0
ENSG 661 1418 2580 6817 14727 5968 9 3 5 9 2
ENSG 20 315 212 8 790 471 1283 2042 1175 2839 1110
ENSG 11 26 24 9 11 2 649 532 953 463 468
network
node
hemi 4 5 6 7 8
gene
ENSG 680 513 968 415 623
ENSG 1 4 0 0 0
ENSG 9 3 3 5 1
ENSG 857 1880 1526 2262 2624
ENSG 878 587 245 722 484
Upvotes: 2