Reputation: 754
Context and research
I am currently experimenting with pivot_table from the pandas library and am looking for a way to make a table in a somewhat specific way. The idea is to render a table with the values in rows according to an index, and in columns another variable.
After looking at several answers on the site related to grouping in pandas.pivot_table, however I haven't quite found one that could solve that issue so far.
Code and example
Since showing is better than telling, I am providing you a short code snippet to show what I have managed to do.
import pandas as pd
df = pd.DataFrame(
{
"Year": [2018, 2018, 2018, 2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019],
"Month": [4, 5, 6 , 7, 8, 9, 10, 4, 5, 6 , 7, 8, 9, 10],
"Apples": [3, 5, 1, 2, 5, 6, 6, 8, 9, 9, 1, 8, 6, 4],
"Temperature": [8.6, 13.7, 16.5, 18.5, 18.0, 13.5, 8.7, 9.2 , 14.2, 15.5, 20.1, 18.6, 12.8, 8.7]
}
)
pivoted_data = pd.pivot_table(df, index=['Month'], columns=['Year'], values=['Apples', 'Temperature']).transpose()
Result of pivoted_table:
Month 4 5 6 7 8 9 10
Year
Apples 2018 3.0 5.0 1.0 2.0 5.0 6.0 6.0
2019 8.0 9.0 9.0 1.0 8.0 6.0 4.0
Temperature 2018 8.6 13.7 16.5 18.5 18.0 13.5 8.7
2019 9.2 14.2 15.5 20.1 18.6 12.8 8.7
The best attempt I have had was to use the .transpose() method for dataframes and rendering the table as above, however this doesn't quite achieve what I am looking for.
Expected result
I am looking to render the data as follow:
Month 4 5 6 7 8 9 10
Year
2018 Temperature 8.6 13.7 16.5 18.5 18.0 13.5 8.7
Apples 3.0 5.0 1.0 2.0 5.0 6.0 6.0
2019 Temperature 8.0 9.0 9.0 1.0 8.0 6.0 4.0
Apples 9.2 14.2 15.5 20.1 18.6 12.8 8.7
Where the difference is subtle but quite important, since the data are displayed by year (and not by the values of apples and temperatures).
I have been unsuccessful in having some nice solution to the current problem. Would somebody have any idea how to render this small dataset in that fashion ?
Upvotes: 1
Views: 1779
Reputation: 862511
Use DataFrame.swaplevel
with DataFrame.sort_index
, also another solution for transpose is DataFrame.T
(less typing):
pivoted_data = (pd.pivot_table(df,
index='Month',
columns='Year',
values=['Apples', 'Temperature'])
.T
.swaplevel(1, 0)
.sort_index())
print (pivoted_data)
Month 4 5 6 7 8 9 10
Year
2018 Apples 3.0 5.0 1.0 2.0 5.0 6.0 6.0
Temperature 8.6 13.7 16.5 18.5 18.0 13.5 8.7
2019 Apples 8.0 9.0 9.0 1.0 8.0 6.0 4.0
Temperature 9.2 14.2 15.5 20.1 18.6 12.8 8.7
Upvotes: 5