Reputation: 11
I'm trying to automate the downloading of a text file from a shared link that was sent to me by email. The original link is to a folder containing two files but I got the direct download link of the file that I need which is:
https://'abc'-my.sharepoint.com/personal/gamma_'abc'/_layouts/15/download.aspx?UniqueId=a0db276e%2Ddf75%2D49b7%2Db671%2D1c49e365ef3f
When I enter the above URL into a web browser, I get the popup option to open or download the file. I'm trying to write some Python code to download the file automatically, and this is what I've come up with so far:
import requests
url = "https://<abc>-my.sharepoint.com/personal/gamma_<abc>/_layouts/15" \
      "/download.aspx?UniqueId=a0db276e%2Ddf75%2D49b7%2Db671%2D1c49e365ef3f"
hdr = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.5',
'Upgrade-Insecure-Requests': '1',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'none',
'Sec-Fetch-User': '?1',
'Connection': 'keep-alive'}
myfile = requests.get(url, headers=hdr)
open('c:/users/scott/onedrive/desktop/gamma.las', 'wb').write(myfile.content)
I originally tried without the user agent and when I opened gamma.las there was only 403 FORBIDDEN in the file. If I send the header too then the file contains HTML for what looks like a Microsoft login page, so I'm assuming that I'm missing some authentication step.
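One way to make this failure visible, rather than silently writing the error or login page to disk, is to inspect the response before saving it. The sketch below is only a heuristic: the function name `looks_like_file` and the rules it applies are assumptions for illustration, not anything SharePoint or Requests defines.

```python
def looks_like_file(status_code, content_type, body):
    """Heuristic check: does this response plausibly carry the requested file?

    Hypothetical helper. It returns False for non-200 responses and for
    bodies that are HTML (such as a login page).
    """
    if status_code != 200:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type in ("text/html", "application/xhtml+xml"):
        return False
    # Some servers mislabel pages, so also peek at the first bytes.
    head = body[:512].lstrip().lower()
    return not (head.startswith(b"<!doctype html") or head.startswith(b"<html"))
```

With the code above it could be called as `looks_like_file(myfile.status_code, myfile.headers.get('Content-Type'), myfile.content)` before opening gamma.las for writing.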
I have no affiliation with this organization - someone sent me this link via email for me to download a text file which works fine through the browser but not via Python. I don't log in to anything to get it as I have no username or password with this domain.
Am I able to do this using Requests? If not, am I able to use the REST API for this company's SharePoint without user credentials?
Upvotes: 1
Views: 762
Reputation: 1245
This is how you do it:
import mimetypes
import requests

# Specify the SharePoint sharing URL of the file
file_url = 'https://organisarion-my.sharepoint.com/:b:/p/user1/Eej3XCFj7N1AqErjlxrzebgBO7NJMV797ClDPuKkBEi6zg?e=dJf2tJ'
# Specify the destination filename (an extension is appended below if one can be guessed)
save_path = 'file'

# Make a GET request, following redirects
res = requests.get(file_url, allow_redirects=True)
if res.status_code == 200:
    # Get the redirect URL and cookies for use in the next request
    new_url = res.url
    cookies = res.cookies.get_dict()
    for r in res.history:
        cookies.update(r.cookies.get_dict())
    # Rewrite the viewer URL into a download URL
    new_url = new_url.replace("onedrive.aspx", "download.aspx").replace("?id=", "?SourceUrl=")
    # Request the file itself, sending the collected cookies
    response = requests.get(new_url, cookies=cookies)
    if response.status_code == 200:
        # Strip parameters such as "; charset=utf-8" so the extension lookup can match
        content_type = response.headers.get('Content-Type', '').split(';')[0].strip()
        print(content_type)
        file_extension = mimetypes.guess_extension(content_type)
        if file_extension:
            destination_with_extension = f"{save_path}{file_extension}"
        else:
            destination_with_extension = save_path
        with open(destination_with_extension, 'wb') as file:
            for chunk in response.iter_content(1024):
                file.write(chunk)
        print("File downloaded successfully!")
    else:
        print("Failed to download the file.")
        print(response.status_code)
In short: make a first GET request to collect the cookies and the redirect URL, then send those cookies with a second GET request to the rewritten download URL.
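The same two-step idea can also be sketched with a `requests.Session`, which stores the cookies set during the redirect chain so they do not have to be copied by hand. The function names `to_download_url` and `download_shared_file` are hypothetical, and the URL rewrite is simply the one used in the code above; whether it works for a given tenant or link type is an assumption.

```python
def to_download_url(viewer_url: str) -> str:
    """Apply the same string rewrite as above to a resolved viewer URL."""
    return (viewer_url
            .replace("onedrive.aspx", "download.aspx")
            .replace("?id=", "?SourceUrl="))


def download_shared_file(session, share_url: str, dest: str) -> str:
    """Resolve a sharing link and stream the file to `dest`.

    `session` is expected to be a requests.Session; it keeps the cookies
    from the first request and sends them with the second one.
    """
    landing = session.get(share_url, allow_redirects=True)
    landing.raise_for_status()
    response = session.get(to_download_url(landing.url), stream=True)
    response.raise_for_status()
    with open(dest, "wb") as fh:
        # Write in chunks so large files are not held in memory at once.
        for chunk in response.iter_content(chunk_size=8192):
            fh.write(chunk)
    return dest
```

A call would look like `download_shared_file(requests.Session(), file_url, 'file')`; `raise_for_status()` turns a 403 or similar into an exception instead of a saved error page.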
Upvotes: 1
Reputation: 12245
Not supplying your credentials most probably means you are (implicitly) using built-in Windows authentication in your organization. See whether this helps: "Handling windows authentication while accessing url using requests".
The Python library mentioned there for handling built-in Windows authentication is requests-negotiate-sspi. I'm not sure whether it will work with federation (your website ends with ".sharepoint.com", so you are probably using federation as well), but it may be worth trying.
So I would try something like this (I doubt the headers really matter in your case, but you could try adding them as well):
import requests
from requests_negotiate_sspi import HttpNegotiateAuth
url = ...
myfile = requests.get(url, auth=HttpNegotiateAuth())
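Before trying this, it may help to check whether the server offers Windows-integrated authentication at all. Servers advertise the schemes they accept in the `WWW-Authenticate` header of a 401 response. The helper below is a rough, hypothetical heuristic (the name `offered_windows_schemes` is an assumption, and it does simple keyword matching rather than full header parsing).

```python
import re


def offered_windows_schemes(www_authenticate):
    """Return the Windows-integrated schemes named in a WWW-Authenticate value.

    Hypothetical helper: matches the keywords "Negotiate" and "NTLM"
    case-insensitively instead of fully parsing the challenge list.
    """
    if not www_authenticate:
        return set()
    found = re.findall(r"\b(negotiate|ntlm)\b", www_authenticate, flags=re.IGNORECASE)
    return {scheme.lower() for scheme in found}
```

For example, after `probe = requests.get(url)`, calling `offered_windows_schemes(probe.headers.get("WWW-Authenticate"))` on a 401 response shows whether Negotiate or NTLM was offered. An empty set, or a redirect to a login page instead of a 401, would suggest this approach is unlikely to apply.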
Upvotes: 0