dinn_
dinn_

Reputation: 93

Pandas add column header to row value and transform dataframe

I have an excel workbook with multiple sheets laid out like the following:

[ID]        [Date]              [TEST_A]       [TEST_B]       [TEST_C]
ID_1234     13/06/2017 11:00    1194.256258    1287.016744    1343.434
ID_1234     13/06/2017 12:00    1194.266828    1287.16688     1463.2352
...

Each sheet has a different value for the [ID] column. Some sheets go up to TEST_E and some only have TEST_A, they are not always in the same order either.

I am trying to transform the dataframe by creating a new ID_TEST column that joins the ID and each TEST columns, and transforming the rest of the columns to align with ID_TEST. Every sheet will then have four columns as follows, where the new Result column will have the values from each individual TEST column.

[ID]       [ID_TEST]           [Date]               [Result]
ID_1234    ID_1234-TEST_A      13/06/2017 11:00     1194.256258
ID_1234    ID_1234-TEST_A      13/06/2017 12:00     1194.266828
ID_1234    ID_1234-TEST_B      13/06/2017 11:00     1287.016744
ID_1234    ID_1234-TEST_B      13/06/2017 12:00     1287.1688
ID_1234    ID_1234-TEST_C      13/06/2017 11:00     1343.434
ID_1234    ID_1234-TEST_C      13/06/2017 12:00     1463.2352

I have tried starting with creating the [ID_TEST] column using this

df.apply(lambda col: col.name +" "+ col.astype(str) )

But this applies to every column and not the ones I want specifically. I think this will also need a pivot table or something to restructure the df but I am not sure how to implement it. Thanks.

Upvotes: 0

Views: 825

Answers (1)

BENY
BENY

Reputation: 323266

Try with melt

out = df.melt(['ID','Date'],var_name = 'ID_TEST',value_name='Result')
out['ID_TEST'] = out['ID'] + '-' + out['ID_TEST']

Upvotes: 2

Related Questions