Read xlsx stored on sharepoint location with openpyxl in python?

Gunay Anach picture Gunay Anach · Dec 10, 2015 · Viewed 12.9k times · Source

quick one.

I have XLSX file located on sharepoint drive and cannot open it using openpyxl in python, it works well if it is stored on my local drive.

I tried this.

from openpyxl import load_workbook
wb = load_workbook('https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx')

Throws up this exception:

C:\Anaconda\lib\site-packages\openpyxl\reader\excel.py in load_workbook(filename, use_iterators, keep_vba, guess_types, data_only)
    123     except (BadZipfile, RuntimeError, IOError, ValueError):
    124         e = exc_info()[1]
--> 125         raise InvalidFileException(unicode(e))
    126     wb = Workbook(guess_types=guess_types, data_only=data_only)
    127 

InvalidFileException: [Errno 22] invalid mode ('rb') or filename: 'https://...

Am I missing something? I need to read the content of one of the sheets in python.


EDIT:

Using crussell's advice, I receive 401 UNAUTHORIZED:

import requests
import urllib
from openpyxl import load_workbook
from requests.auth import HTTPBasicAuth

file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"

username = 'PotatoUser'
password = 'PotatoPassword'

resp=requests.get(file, auth=HTTPBasicAuth(username, password))
print(resp.content)

Seems like sharepoint and requests are not compatible, with both Digest Authentication and Basic Authentication http://docs.python-requests.org/en/latest/user/authentication/

Answer

crussell picture crussell · Dec 10, 2015

Instead of trying to load directly from a web-address, try using urllib.

import urllib
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
urllib.urlretrieve(file,"test.xlsx")

From further research, urllib is apparently eschewed by requests.Try this:

import requests
from requests.auth import HTTPBasicAuth
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"

username = 'myUsername'
password = 'myPassword'

resp=requests.get(file, auth=HTTPBasicAuth(username, password))
output = open('test.xlsx', 'wb')
output.write(resp.content)
output.close()

To get requests installed:

pip install requests