Unable to programmatically download xls file

Question

i can manually download this file by pasting the url in a browser: https://www.aaii.com/files/surveys/sentiment.xls

However, when i try to programmatically do this, i have no luck. Depending on the library i use (requests, urlib, urlib3), the error is either 403 or simply some html with text 'request unsuccessful' is returned. What's strange is that it worked a few times - i was able to download the excel file. then it would stop without any coding changing. it's quite strange and sporadic.

Wondering if someone can try this code to see if they have the same issue or can see if there is anything i am doing incorrectly

UPDATE: seems if i wait a while and try running the code once more, it works. Its as though the server may have limit on number of request in a given timeframe. would be good if someone can see if that is happening to them also

import pandas as pd
import requests

url="https://www.aaii.com/files/surveys/sentiment.xls"

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    'Accept': '.xls,.xlsx,application/csv,application/excel,application/vnd.msexcel,application/vnd.ms-excel,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.9',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'DNT': '1'
    }

resp = requests.get(url=url, headers=headers)
data = resp.content
print(data)
with open('test.xls', 'wb') as output:
    output.write(data)

df=pd.read_excel(data)
# df=pd.read_excel(url, header=headers)

Unable to programmatically download xls file

Answers (1)

Related Questions