Jacob Hilbert

Reputation: 48

Display html string value on pandas dataframe

Say I have a dataframe whose string values contain some HTML:

import pandas as pd

my_dict = {"a":{"content":"""
<p>list of important things</p>
<ul>
<li>c</li>
<li>d</li>
</ul>
"""}}

df = pd.DataFrame.from_dict(my_dict,orient='index')

The result is as expected.

I'd like to export the dataframe as HTML such that my HTML string works inside the table cells.

What I've tried

I'm aware of DataFrame.to_html(escape=False), which produces:

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>content</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>a</th>
      <td>\n<p>list of important things</p>\n<ul>\n<li>c</li>\n<li>d</li>\n</ul>\n</td>
    </tr>
  </tbody>
</table>

which renders incorrectly, because that HTML contains a literal \\n. I think the method has taken the repr of the string value when inserting it into the HTML conversion of the dataframe.

I know I could get away with replacing the escaped \\n with \n again in the generated HTML, which then renders as it should.
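For reference, the workaround mentioned above could look roughly like the sketch below. It assumes to_html() is called without a path so that it returns the HTML as a string; the names html and fixed_html are only illustrative.

```python
import pandas as pd

my_dict = {"a": {"content": """
<p>list of important things</p>
<ul>
<li>c</li>
<li>d</li>
</ul>
"""}}

df = pd.DataFrame.from_dict(my_dict, orient="index")

# With no buffer or path argument, to_html() returns the HTML as a string.
html = df.to_html(escape=False)

# Turn every two-character sequence backslash + n back into a real newline.
# Caveat: this is a blanket text replacement over the whole table, so it would
# also alter any cell that legitimately contains a backslash followed by "n".
fixed_html = html.replace("\\n", "\n")
```

Because the replacement runs over the whole document rather than per cell, it is only safe when no cell is expected to contain that two-character sequence on purpose.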

But I'd like to know if there is some way to tell pandas to insert the literal string values of the dataframe into the HTML, not the repr ones. I don't understand half of the kwargs for .to_html(), so I don't know if that's possible.

Upvotes: 1

Views: 2973

Answers (1)

ThePyGuy

Reputation: 18406

I'd like to export the dataframe as HTML such that my HTML string works inside the table cells.

If so, consider replacing \n with the HTML line-break tag, <br>, if you want a line break at that position, or with an empty string if you don't.

df['content'] = df['content'].str.replace('\n', '<br>')
df.to_html('html.html', escape=False)

If you don't want to modify the dataframe itself, you can let pandas handle it by passing the replacement as a formatter:

df.to_html('html.html',
           formatters={'content': lambda k: k.replace('\n', '<br>')},
           escape=False)

If you want to get rid of the newlines completely, replace them with an empty string instead, either in the dataframe itself or through a formatter:

df.to_html('html.html',
           formatters={'content': lambda k: k.replace('\n', '')},
           escape=False)
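As a self-contained check of the formatter approach, here is a minimal sketch using the dataframe from the question. It assumes you want the HTML back as a string rather than written to html.html, and strip_newlines is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

my_dict = {"a": {"content": """
<p>list of important things</p>
<ul>
<li>c</li>
<li>d</li>
</ul>
"""}}

df = pd.DataFrame.from_dict(my_dict, orient="index")


def strip_newlines(value):
    """Hypothetical helper: drop real newlines before pandas formats the cell."""
    return str(value).replace("\n", "")


# The formatter receives the raw cell value, so the newlines are removed
# before pandas gets a chance to escape them. escape=False keeps the tags.
html = df.to_html(formatters={"content": strip_newlines}, escape=False)

print(html)
```

The dataframe itself is left unchanged; only the rendered output differs. Swapping the empty string for '<br>' in the helper gives the line-break variant instead.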

Upvotes: 3
