Reputation: 1212
quick one.
I have XLSX file located on sharepoint drive and cannot open it using openpyxl in python, it works well if it is stored on my local drive.
I tried this.
from openpyxl import load_workbook
wb = load_workbook('https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx')
Throws up this exception:
C:\Anaconda\lib\site-packages\openpyxl\reader\excel.py in load_workbook(filename, use_iterators, keep_vba, guess_types, data_only)
123 except (BadZipfile, RuntimeError, IOError, ValueError):
124 e = exc_info()[1]
--> 125 raise InvalidFileException(unicode(e))
126 wb = Workbook(guess_types=guess_types, data_only=data_only)
127
InvalidFileException: [Errno 22] invalid mode ('rb') or filename: 'https://...
Am I missing something? I need to read the content of one of the sheets in python.
EDIT:
Using crussell's advice, I receive 401 UNAUTHORIZED:
import requests
import urllib
from openpyxl import load_workbook
from requests.auth import HTTPBasicAuth
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
username = 'PotatoUser'
password = 'PotatoPassword'
resp=requests.get(file, auth=HTTPBasicAuth(username, password))
print(resp.content)
Seems like sharepoint and requests are not compatible, with both Digest Authentication and Basic Authentication http://docs.python-requests.org/en/latest/user/authentication/
Upvotes: 11
Views: 35896
Reputation: 179
Instead of trying to load directly from a web-address, try using urllib.
import urllib
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
urllib.urlretrieve(file,"test.xlsx")
From further research, requests may be preferred over urllib. Try this:
import requests
from requests.auth import HTTPBasicAuth
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
username = 'myUsername'
password = 'myPassword'
resp=requests.get(file, auth=HTTPBasicAuth(username, password))
output = open('test.xlsx', 'wb')
output.write(resp.content)
output.close()
To get requests installed:
pip install requests
Upvotes: 4
Reputation: 29635
If the SP is internal, it might work by removing "https:" in the name you put in the load_workbook()
.
from openpyxl import load_workbook
file = '//content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx'
wb = load_workbook(file)
without authentication if your account at work is directly connected to SP. Otherwise, at my work, we use NTML authentification, that you can do by using HttpNtlmAuth
from the library request_ntml
.
Let me know if it works or if you are still interested by this issue, I could give you example with request_ntml
Upvotes: 0
Reputation: 46759
You probably first need to download it first rather than opening it directly. The following approach should work:
import urllib2
from openpyxl import load_workbook
import StringIO
data = urllib2.urlopen("https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx")
xlsx = data.read()
wb = load_workbook(StringIO.StringIO(xlsx))
Python's StringIO
could be used to make the downloaded data appear as a file interface.
Upvotes: 0