Pandas and HTML tags

Question

I am trying to pull the tables off this site. When I load the URL with pd.read_html, I get back a list of DataFrames as expected, but the HTML tags inside the table cells are gone. Is there any way to extract the tables with pandas and keep the HTML that is in the cells?

import pandas as pd

# read_html returns a list of DataFrames, one per table found on the page
dfs = pd.read_html('http://geppopotamus.info/game/tekken7fr/asuka/data.htm#page_top')

I want the cell to be this

翠勁
^ﾖﾐ

/上

but I get this

翠勁 ﾖﾐ /上

I have also used Beautiful Soup to parse the HTML and then passed the data to pandas, but it still strips out the inner HTML.
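
One way the Beautiful Soup route might keep the markup is to build each frame from the cells' inner HTML directly, rather than handing the parsed table back to pandas. The following is only a minimal sketch under assumptions: SAMPLE_HTML is a made-up stand-in for one table (the real page's markup may differ), tables_with_inner_html is a hypothetical helper name, and nested tables are not handled.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for one table from the page; the real markup may differ.
SAMPLE_HTML = """
<table>
  <tr><th>move</th><th>command</th></tr>
  <tr><td>翠勁<br>^ﾖﾐ<br><br>/上</td><td>1</td></tr>
</table>
"""


def tables_with_inner_html(html):
    """Return one DataFrame per <table>, storing each cell's inner HTML as a string."""
    soup = BeautifulSoup(html, "html.parser")
    frames = []
    for table in soup.find_all("table"):
        rows = []
        for tr in table.find_all("tr"):
            cells = tr.find_all(["th", "td"])
            # decode_contents() serializes the cell's children, so tags such as
            # <br> survive instead of being flattened to plain text.
            rows.append([cell.decode_contents().strip() for cell in cells])
        frames.append(pd.DataFrame(rows))
    return frames


frames = tables_with_inner_html(SAMPLE_HTML)
print(frames[0].iloc[1, 0])  # the cell text with its <br> tags still present
```

The header row ends up as the first data row here; it could be promoted to column names afterwards if needed. For a live page, the HTML string would first have to be downloaded (for example with requests), which this sketch leaves out.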

