Reputation: 323
I create a dataframe and export to an html table. However the headers are off as below
How can I combine the index name row, and the column name row?
I want the table header to look like this:
but it currently exports to html like this:
I create the dataframe as below (example):
data = [{'Name': 'A', 'status': 'ok', 'host': '1', 'time1': '2020-01-06 06:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'A', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 06:31:06', 'time2': '-'}, {'Name': 'B', 'status': 'Alert', 'host': '1', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'}, {'Name': 'B', 'status': 'ok', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'B', 'status': 'ok', 'host': '4', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'Alert', 'host': '2', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '3', 'time1': '2020-01-06 10:31:06', 'time2': '2020-02-06 21:10:00'},{'Name': 'C', 'status': 'ok', 'host': '4', 'time1': '-', 'time2': '-'}]
df = pandas.DataFrame(data)
df.set_index(['Name', 'status', 'host'], inplace=True)
html_body = df.to_html(bold_rows=False)
The index is set to have hierarchical rows, for easier reading in an html table:
print(df)
time1 time2
Name status host
A ok 1 2020-01-06 06:31:06 2020-02-06 21:10:00
2 2020-01-06 06:31:06 -
B Alert 1 2020-01-06 10:31:06 2020-02-06 21:10:00
ok 2 2020-01-06 10:31:06 2020-02-06 21:10:00
4 2020-01-06 10:31:06 2020-02-06 21:10:00
C Alert 2 2020-01-06 10:31:06 2020-02-06 21:10:00
ok 3 2020-01-06 10:31:06 2020-02-06 21:10:00
4 - -
The only solution that I've got working is to set every column to index. This doesn't seem practical tho, and leaves an empty row that must be manually removed:
Upvotes: 7
Views: 2363
Reputation: 31
If you want to use a Dataframe Styler to perform a lot of wonderful formatting on your table, the elements, and the contents, then you might need a slight change to piRSquared's answer, as I did.
style.to_html() added non-breaking spaces which made tag.contents always return true, and thus yielded no change to the table. I modified the lambda to account for this, which revealed another issue.
lambda tag: (not tag.contents) or '\xa0' in tag.contents
Styler.to_html() lacks the header kwarg - I am guessing this is the source of the issue. I took a slightly different approach - Move the second row headers into the first row, and then destroy the second header row.
It seems pretty generic and reusable for any multi-indexed dataframe.
df_styler = summary_df.style
# Use the df_styler to change display format, color, alignment, etc.
raw_html = df_styler.to_html()
soup = BeautifulSoup(raw_html,'html.parser')
head = soup.find('thead')
trs = head.find_all('tr')
ths0 = trs[0].find_all(lambda tag: (not tag.contents) or '\xa0' in tag.contents)
ths1 = trs[1].find_all(lambda tag: (tag.contents) or '\xa0' not in tag.contents)
for blank, filled in zip(ths0, ths1):
blank.replace_with(filled)
trs[1].decompose()
final_html_str = soup.decode_contents()
Success - two header rows condensed into one
Big Thanks to piRSquared for the starting point of Beautiful soup!
Upvotes: 3
Reputation: 294348
import pandas as pd
from IPython.display import HTML
l0 = ('Foo', 'Bar')
l1 = ('One', 'Two')
ix = pd.MultiIndex.from_product([l0, l1], names=('L0', 'L1'))
df = pd.DataFrame(1, ix, [*'WXYZ'])
HTML(df.to_html())
Hack the HTML result from df.to_html(header=False)
. Pluck out the empty cells in the table head and drop in the column names.
from bs4 import BeautifulSoup
html_doc = df.to_html(header=False)
soup = BeautifulSoup(html_doc, 'html.parser')
empty_cols = soup.find('thead').find_all(lambda tag: not tag.contents)
for tag, col in zip(empty_cols, df):
tag.string = col
HTML(soup.decode_contents())
Upvotes: 4