urllib download excel file from php link

Question

I am trying to download a list of xls files from a URL using urllib.urlretrieve (Python 2.7). I am able to get the files, but each one contains a <script> tag that makes it unreadable in Excel.

If I remove that script tag from the xls, the file opens correctly in excel.

EDIT - Here is my solution, based on pypypy's answer:

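import re
import urllib

files = ['a', 'b', 'c', 'd', 'e', 'f']

url = 'http://www.thewebsite.com/data/dl_xls.php?bid='

# Assumed pattern: the original regex was lost from this post; this one
# matches a whole <script ...>...</script> block.
script_tag = re.compile(r'<script\b.*?</script\s*>', re.I | re.S)

for f in files:
    input_xls = f + '_in.xls'
    # Download the raw file returned by the PHP link
    urllib.urlretrieve(url + f, input_xls)
    # Strip the script block and write the cleaned copy
    with open(input_xls, "rb") as i, open(f + '_out.xls', "wb") as output:
        output.write(script_tag.sub('', i.read()))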

pypypy · Accepted Answer

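Try building a regex that matches the script tag and removes it, e.g.:

import re
# Assumed pattern: the original regex was lost from this post
content = re.sub(r'<script\b.*?</script\s*>', "", content, flags=re.I | re.S)

This replaces any script tags in the content with "". Note that the flags must be passed by keyword: the fourth positional argument of re.sub is count, not flags.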
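
The substitution can also be wrapped in a small helper that works on the raw bytes of the downloaded file. This is only a sketch: the pattern, the name SCRIPT_BLOCK, and the function strip_script_tags are assumptions for illustration, not something from the original post, and the real tag in your file may need a different pattern.

```python
import re

# Assumed pattern (not from the original post): one whole <script>...</script>
# block, matched case-insensitively and across line breaks.
SCRIPT_BLOCK = re.compile(br'<script\b.*?</script\s*>', re.IGNORECASE | re.DOTALL)


def strip_script_tags(data):
    """Return `data` (bytes) with every <script> block removed."""
    return SCRIPT_BLOCK.sub(b'', data)


# Hypothetical sample standing in for the contents of a downloaded file
sample = b'header<SCRIPT type="text/javascript">\nalert(1);\n</script>rows'
cleaned = strip_script_tags(sample)
```

Compiling the pattern once keeps the flags attached to it, and working on bytes matches a file opened in "rb" mode.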
