Reputation: 91
I have a dataframe that is structured as follows:
Item FY20 FY21 FY22 ...
Case High Low Base
Multiple 1.2 2.3 3.4
Cash 1.1 1.4 1.2
I need the data to look like this:
Item Date Case Value
Cash FY20 High 1.1
Cash FY21 Low 1.4
Cash FY22 Base 1.2
So I essentially want to transform the data from wide format to long format, keyed on "Case", the "FY" columns and the item.
I've already tried using MultiIndexes and experimented a bit with pd.pivot, but I'm honestly stumped here.
Upvotes: 0
Views: 66
Reputation: 153500
IIUC, you can use this bit of code to reshape your dataframe:
(df.set_index('Item')        # move Item into dataframe index
   .T                        # transpose dataframe
   .rename_axis('Date')      # rename index to Date
   .reset_index()            # move index into dataframe as column
   .melt(['Date', 'Case']))  # melt dataframe to get to long format
Output:
Date Case Item value
0 FY20 High Multiple 1.2
1 FY21 Low Multiple 2.3
2 FY22 Base Multiple 3.4
3 FY20 High Cash 1.1
4 FY21 Low Cash 1.4
5 FY22 Base Cash 1.2
Where df is:
Item FY20 FY21 FY22
0 Case High Low Base
1 Multiple 1.2 2.3 3.4
2 Cash 1.1 1.4 1.2
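For anyone who wants to reproduce this, here is a minimal sketch of how that df could be built. The constructor call is my assumption; the question does not show how the data was actually loaded.

```python
import pandas as pd

# Assumed reconstruction of the question's frame: the 'Case' labels sit in the
# first data row, so every FY column mixes strings and floats (object dtype).
df = pd.DataFrame({
    "Item": ["Case", "Multiple", "Cash"],
    "FY20": ["High", 1.2, 1.1],
    "FY21": ["Low", 2.3, 1.4],
    "FY22": ["Base", 3.4, 1.2],
})
print(df)
```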
df.set_index('Item').T
This is almost there:
Item Case Multiple Cash
FY20 High 1.2 1.1
FY21 Low 2.3 1.4
FY22 Base 3.4 1.2
df.set_index('Item').T.rename_axis('Date').reset_index()
Adding rename_axis and reset_index prepares the dataframe for melt:
Item Date Case Multiple Cash
0 FY20 High 1.2 1.1
1 FY21 Low 2.3 1.4
2 FY22 Base 3.4 1.2
Lastly, melt the dataframe:
df.set_index('Item').T.rename_axis('Date').reset_index().melt(['Date', 'Case'])
Output:
Date Case Item value
0 FY20 High Multiple 1.2
1 FY21 Low Multiple 2.3
2 FY22 Base Multiple 3.4
3 FY20 High Cash 1.1
4 FY21 Low Cash 1.4
5 FY22 Base Cash 1.2
And if you only want the "Cash" records, use this:
df_out = df.set_index('Item').T.rename_axis('Date').reset_index().melt(['Date', 'Case'])
df_out.query('Item == "Cash"')
Output:
Date Case Item value
3 FY20 High Cash 1.1
4 FY21 Low Cash 1.4
5 FY22 Base Cash 1.2
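If you want the columns named and ordered exactly as in the question (Item, Date, Case, Value), something along these lines should work. This is only a sketch: the frame construction, the names long_df and cash, and the numeric conversion are my own additions, not part of the question.

```python
import pandas as pd

# Assumed reconstruction of the question's dataframe
df = pd.DataFrame({
    "Item": ["Case", "Multiple", "Cash"],
    "FY20": ["High", 1.2, 1.1],
    "FY21": ["Low", 2.3, 1.4],
    "FY22": ["Base", 3.4, 1.2],
})

long_df = (
    df.set_index("Item")
      .T
      .rename_axis("Date")
      .reset_index()
      # var_name / value_name set the names of the two columns melt creates
      .melt(id_vars=["Date", "Case"], var_name="Item", value_name="Value")
)

# The FY columns mixed strings and floats, so the melted values are object
# dtype; convert them to real numbers.
long_df["Value"] = pd.to_numeric(long_df["Value"])

# Reorder the columns to match the layout asked for in the question
long_df = long_df[["Item", "Date", "Case", "Value"]]

# Keep only the Cash rows (hypothetical name 'cash')
cash = long_df[long_df["Item"] == "Cash"].reset_index(drop=True)
print(cash)
```

Converting with pd.to_numeric matters if you plan to aggregate or plot afterwards, since object columns do not behave like floats in most numeric operations.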
Upvotes: 0
Reputation: 30991
Let's start by creating your source DataFrame:
import pandas as pd

df = pd.DataFrame(data=[
[ 'Item', 'FY20', 'FY21', 'FY22' ],
[ 'Case', 'High', 'Low', 'Base' ],
[ 'Multiple', 1.2, 2.3, 3.4 ],
[ 'Cash', 1.1, 1.4, 1.2 ]])
The result is:
0 1 2 3
0 Item FY20 FY21 FY22
1 Case High Low Base
2 Multiple 1.2 2.3 3.4
3 Cash 1.1 1.4 1.2
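Then we have to transpose the DataFrame, take the column names from its first row, drop that row, and rename the Item column to Date.
To do this, run:
df2 = df.transpose()
df2.columns = df2.iloc[0].tolist()
df2.drop(index=0, inplace=True)
df2.rename(columns={'Item': 'Date'}, inplace=True)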
The result is:
Date Case Multiple Cash
1 FY20 High 1.2 1.1
2 FY21 Low 2.3 1.4
3 FY22 Base 3.4 1.2
And to get your result, run:
df2.melt(id_vars=['Date', 'Case'], value_vars=['Cash'],
var_name='Name', value_name='Value')
and you will receive:
Date Case Name Value
0 FY20 High Cash 1.1
1 FY21 Low Cash 1.4
2 FY22 Base Cash 1.2
If the result should also include the Multiple column, remove value_vars=['Cash']. Melting will then cover all remaining columns (those not listed in id_vars).
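A self-contained sketch of that all-columns variant, starting from the same headerless frame as above. The name result and the float conversion at the end are my own additions; the conversion assumes every melted value is numeric.

```python
import pandas as pd

# Source frame with the header stored as the first data row
df = pd.DataFrame(data=[
    ["Item", "FY20", "FY21", "FY22"],
    ["Case", "High", "Low", "Base"],
    ["Multiple", 1.2, 2.3, 3.4],
    ["Cash", 1.1, 1.4, 1.2],
])

df2 = df.transpose()
df2.columns = df2.iloc[0].tolist()   # first row becomes the column names
df2 = df2.drop(index=0).rename(columns={"Item": "Date"})

# No value_vars: every column not listed in id_vars gets melted
result = df2.melt(id_vars=["Date", "Case"], var_name="Name", value_name="Value")

# Values are object dtype after the transpose; cast them to float
result["Value"] = result["Value"].astype(float)
print(result)
```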
Upvotes: 2