Tobias Funke

Reputation: 1814

XLS file downloaded with Python has broken formatting compared with a manual download

I am trying to download the Excel file from this page: https://webgate.ec.europa.eu/rasff-window/portal/index.cfm?event=notificationsList# and then extract data from the relevant cells.

Here is the code that I am using:

import os
import requests

os.chdir('Path')

dls = 'https://webgate.ec.europa.eu/rasff-window/portal/index.cfm?event=ExportToExcel&StartRow=0'

resp = requests.get(dls)

with open('tester.xls', 'wb') as output:
    output.write(resp.content)

The download succeeds, but the formatting of the saved file is completely broken (possibly because of the XML?).

I tried changing the file extension, but that did not help.

Any help is greatly appreciated!

Upvotes: 0

Views: 39

Answers (1)

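import pandas as pd

# read_html returns a list of DataFrames, one per <table> found on the page;
# take the first one.
df = pd.read_html(
    "https://webgate.ec.europa.eu/rasff-window/portal/index.cfm?event=notificationsList")[0]
# Drop the last column of the table.
df.drop(df.columns[-1], axis=1, inplace=True)

print(df)

# Write the table to CSV without the DataFrame index.
df.to_csv("data.csv", index=False)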

Output: view-online

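To check the question's suspicion that the downloaded file is XML rather than a true Excel workbook, one option is to look at the first bytes of the response. The sketch below is only an illustration: `sniff_format` is a hypothetical helper, not part of requests or pandas, and it recognises just a few common signatures (the OLE2 header of legacy .xls files, the ZIP header of .xlsx files, and leading XML/HTML markup).

```python
def sniff_format(data: bytes) -> str:
    """Guess the real format of a downloaded 'Excel' file from its first bytes.

    Hypothetical helper for illustration; it only knows a few signatures.
    """
    # Genuine legacy .xls files are OLE2 compound documents.
    if data.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "xls"
    # .xlsx files are ZIP archives.
    if data.startswith(b"PK\x03\x04"):
        return "xlsx"

    # Text-based formats: ignore a UTF-8 byte order mark and leading whitespace.
    head = data[:512]
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    head = head.lstrip().lower()

    if head.startswith(b"<?xml"):
        return "xml"
    if head.startswith((b"<!doctype html", b"<html", b"<table")):
        return "html"
    return "unknown"


if __name__ == "__main__":
    # Small self-contained demonstration with made-up byte strings.
    print(sniff_format(b"<?xml version='1.0'?><Workbook/>"))
    print(sniff_format(b"PK\x03\x04rest-of-zip"))
```

With the code from the question, this would be called as `sniff_format(resp.content)`. If it reports "xml" or "html", renaming the file to .xls will not change how the content is laid out, and parsing it as markup (for example with `pd.read_html`, as above) is likely the more reliable route.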

Upvotes: 1
