Reputation: 455
I may just not understand pandas fully, but I am getting some unexpected behavior when using read_html() with the index_col flag set, modifying the data frame, and then calling to_html() again.
Here is what I mean. I have this HTML file:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th>index</th>
<th>Avg</th>
<th>Min</th>
<th>Max</th>
</tr>
</thead>
<tbody>
<tr>
<td>build1</td>
<td>55.102323</td>
<td>37.101219</td>
<td>60.7</td>
</tr>
</tbody>
</table>
I then use pandas read_html as follows:
import pandas as pd
# read_html returns a list of DataFrames, one per <table> found in the file
dataFrameList = pd.read_html('empty.html', index_col=0)
df = dataFrameList[0]
This produces a data frame as follows:
              Avg        Min   Max
index
build1  55.102323  37.101219  60.7
I then have a small bit of test code that looks like this:
df.drop(['build1'], inplace=True)
df.loc['build2'] = [121212, 12443, 1290120]
print(df.to_html())
I get the following output:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Avg</th>
<th>Min</th>
<th>Max</th>
</tr>
<tr>
<th>index</th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<th>build2</th>
<td>121212.0</td>
<td>12443.0</td>
<td>1290120.0</td>
</tr>
</tbody>
</table>
What did I do wrong? I have tried turning the index off with to_html(..., index=False), but this gets rid of the build names (which I need).
My desired output (just so that it is clear) is as follows:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th>index</th>
<th>Avg</th>
<th>Min</th>
<th>Max</th>
</tr>
</thead>
<tbody>
<tr>
<th>build2</th>
<td>121212.0</td>
<td>12443.0</td>
<td>1290120.0</td>
</tr>
</tbody>
</table>
Upvotes: 0
Views: 1179
Reputation: 361
There is a workaround: copy the index into a regular column named 'index', then write the HTML without the index:
df.insert(0, 'index', df.index)
print(df.to_html(index=False))
This produces the desired output (except for that <th> in the second row, which, I guess, is a typo?).
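For what it's worth, here is a minimal sketch of why the extra header row shows up and of two ways around it. It assumes the frame has the same shape as the one in the question, but builds it by hand instead of calling read_html so that no HTML parser is needed. The names html_default, html_flat and html_no_names are only for illustration. As far as I can tell, the second header row is how to_html() renders a named index, so either moving the index into a column or passing index_names=False should avoid it.

```python
import pandas as pd

# Hand-built stand-in for the frame in the question (assumption: same
# columns, and an index whose name is "index", as read_html produced).
df = pd.DataFrame(
    {"Avg": [121212.0], "Min": [12443.0], "Max": [1290120.0]},
    index=pd.Index(["build2"], name="index"),
)

# Default rendering: the named index gets its own header row.
html_default = df.to_html()

# Option 1: turn the index into an ordinary column, then hide the index.
# This is equivalent in effect to the insert() workaround above.
html_flat = df.reset_index().to_html(index=False)

# Option 2: keep the index cells but suppress the index-name header row.
# The top-left header cell is then empty instead of saying "index".
html_no_names = df.to_html(index_names=False)

print(html_flat)
```

Option 1 puts the build names in <td> cells under an "index" header. Option 2 keeps them as <th> cells, as in the desired output, but leaves the corner header blank.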
Upvotes: 1