Reputation: 2533
I'm trying to create a table of data using the Table()
function in Plotly
.
My data is as follows:
import pandas as pd
test_df = pd.DataFrame({'Manufacturer':['Mercedes', 'Buick', 'Ford', 'Buick', 'Buick', 'Ford', 'Buick', 'Chrysler', 'Ford', 'Buick', 'Chrysler', 'Ford', 'Buick', 'Ford', 'Ford', 'Chrysler', 'Chrysler', 'Ford', 'Chrysler', 'Chrysler', 'Chrysler', 'Buick'],
'Metric':['MPG', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score', 'Score'],
'Statistic':['External', 'Min', 'Max', 'Average', 'Median', '90th', '95th', '99th', 'Min', 'Max', 'Average', 'Median', '90th', '95th', '99th','Min', 'Max', 'Average', 'Median', '90th', '95th', '99th'],
'Value':[22, 3.405, 100.29, 4.62, 4.425, 5.34, 5.83, 7.75, 2.6323, 210, 4.193, 3.28, 5.04, 6.36, 11.01, 3.72, 43, 4.98, 4.82, 5.775, 6.18, 7.182],
})
I want to be able to create a table that looks like as follows:
Manufacturer Min Max Average Median 90th 95th 99th
Buick 3.405 210 4.62 4.425 5.04 5.83 7.182
Chrysler 3.72 43 4.193 4.82 5.775 6.18 7.75
Ford 2.6323 100.29 4.98 3.28 5.34 6.36 11.01
The code to do so looks something as follows (when hard-coded):
import plotly.graph_objects as go
go.Figure(go.Table(
header=dict(
values=["Manufacturer", "Min", "Max",
"Average", "Median", "90th",
"95th", "99th"],
font=dict(size=10),
align="left"
),
cells=dict(
values=[['Buick', 'Ford', 'Chrysler'], # Headers (could change based on the source file)
[3.405, 3.72, 2.6323], # Min values
[210, 43, 100.29], # Max values
[4.62, 4.193, 4.98], # Average values
[4.425, 4.82, 3.28], # Median values
[5.04, 5.775, 5.34], # 90th percentile values
[5.83, 6.18, 6.36], # 95th percentile values
[7.182, 7.75, 11.01] # 99th percentile values
],
align = "left")
))
According to the docs at https://plotly.com/python/table/, the cells
argument argument expects a list of lists and can take a Pandas data frame (GREAT!).
Using the example in the docs, the code to pass a Pandas data frame would look something as follows:
# THIS IS THE EXAMPLE FROM THE DOCS (SHOWING THE USE OF A DATA FRAME)
fig = go.Figure(data=[go.Table(
header=dict(values=list(df.columns),
fill_color='paleturquoise',
align='left'),
cells=dict(values=[df.Rank, df.State, df.Postal, df.Population],
fill_color='lavender',
align='left'))
])
My most valiant attempt at this failed:
Filter by only the 'Score' records:
test_df_subset = test_df[(test_df['Metric'] == 'Score') & (test_df['Manufacturer'].isin(['Buick', 'Ford', 'Chrysler']))]
Create a pivot table:
temp_df = pd.pivot_table(data=test_df_subset,index=['Statistic', 'Manufacturer'])
Unstack the pivot table:
temp_df.unstack(0)
Question: how would I reshape my test_df
data frame to be able to pass it to the data
and cells
arguments in the go.Figure()
function?
Thanks in advance!
Upvotes: 1
Views: 286
Reputation: 29635
you were pretty close, here is one way
import plotly.graph_objects as go
cols_ = ["Manufacturer", "Min", "Max",
"Average", "Median", "90th",
"95th", "99th"]
manufacturers = ['Buick', 'Ford', 'Chrysler']
#this is what you are looking for
df_ = (test_df[test_df['Manufacturer'].isin(manufacturers)]
.set_index(['Manufacturer', 'Statistic'])
['Value'].unstack()
.reset_index()[cols_]
)
go.Figure(go.Table(
header=dict(
values=cols_,
font=dict(size=10),
align="left"
),
cells=dict(
values=df_.T, # note the T here
align = "left")
))
Compare to your method, I think df_
(in my notation) is equivalent to temp_df.unstack(0)['Value'].reset_index()[cols_]
with your notation and using cols_
to order them as expected
Upvotes: 1