RAY
RAY

Reputation: 7100

pd.read_excel throws PermissionError if file is open in Excel

Whenever I have the file open in Excel and run the code, I get the following error which is surprising because I thought read_excel should be a read only operation and would not require the file to be unlocked?

    Traceback (most recent call last):
  File "C:\Users\Public\a.py", line 53, in <module>
    main()
  File "C:\Users\Public\workspace\a.py", line 47, in main
    blend = plStream(rootDir);
  File "C:\Users\Public\workspace\a.py", line 20, in plStream
    df = pd.read_excel(fPath, sheetname="linear strategy", index_col="date", parse_dates=True)
  File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\pandas\io\excel.py", line 163, in read_excel
    io = ExcelFile(io, engine=engine)
  File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\pandas\io\excel.py", line 206, in __init__
    self.book = xlrd.open_workbook(io)
  File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\xlrd\__init__.py", line 394, in open_workbook
    f = open(filename, "rb")
PermissionError: [Errno 13] Permission denied: '<Path to File>'

Upvotes: 46

Views: 46153

Answers (14)

Prabin
Prabin

Reputation: 1

This is not a direct solution to the answer, but as a workaround my approach is to add hour, minute, and second to the filename so that permission error will not be a problem. Multiple files will be created as you do it multiple times. Here is an example I use in my work:

#add hms to file name to prevent permission error, each time new file will be created   
currentTime = datetime.now().strftime("%H-%M-%S")
op_file_name = f"chart_d_analysis_ta_{currentTime}.xlsx"
output_file = os.path.join(path_to_save_result, op_file_name)
master_df.to_excel(output_file, index=False)   

# open folder to open saved file
path = os.path.realpath(output_file)
os.startfile(path)

Upvotes: 0

Trinomial
Trinomial

Reputation: 11

I experienced this error also. Parsing the engine in the function call did not solve the problem. However, I noticed there were some dots/periods (".") in the file name, besides the one before the .xlsx file extension. Removing these dots from the file name, and then trying the import again, solved the problem.

Upvotes: 0

QuantumRifts
QuantumRifts

Reputation: 36

I've tried most of the solutions, but none worked, what i did find is that if i execute this line twice, it will work:

file = pd.ExcelFile("myfile.xlsm")

the error PermissionError: [WinError 5] permission denied will occur in the first execution, but not in the second. so basically this solves it for me :

try:
    file = pd.ExcelFile("myfile.xlsm")
except PermissionError :
    pass
file = pd.ExcelFile("myfile.xlsm")

i hope it helps.

Upvotes: 0

Freya W
Freya W

Reputation: 549

If you have your file inside a sharepoint / OneDrive folder or something similar, this may also be the root cause of the issue.

I had the same issue and after moving it out of my OneDrive to a local folder, reading the workbook with pd.read_excel as is (no other engine, no copying beforehand) worked fine.

Upvotes: 1

Chadee Fouad
Chadee Fouad

Reputation: 2948

Note: The solution below used to work but stopped working for me after a recent Windows 10 update...but it worked for some so it really depends on which version of Windows update you have.

As a workaround I suggest making python create a copy of the original file then read from the copy. After that the code should delete the copied file. It's a bit of extra work but should work.

Example

import shutil
shutil.copy("C://Test//Test.xlsx", "C://Test//koko.xlsx")

Upvotes: 3

Aktawa
Aktawa

Reputation: 21

Excel is blocking the access to the file. The way I worked around this problem is to copy the file to a temporary file and open this one instead.

Unfortunately most of the copy methods will fault with the same error. (shutils, os, !copy). Finally copying with the powershell solved the problem for me:


import subprocess
import os
import pandas as pd

#df = pd.read_excel('test.xlsx') // This will error when test.xlsx is opened in excel

p = subprocess.call(['powershell.exe', 'copy test.xlsx tempfile.xlsx'])
df = pd.read_excel('tempfile.xlsx', sheet_name="Tabelle2", na_filter=False)
os.remove('tempfile.xlsx')

print(df.shape)

Upvotes: 2

ccook5760
ccook5760

Reputation: 1460

I got this to work by first setting the working directory, then opening the file. Maybe something to do with shared drive permissions and read_excel function.

import os
import pandas as pd

os.chdir("c:\\Users\\...\\")

filepath = "...\\filename.xlsx"
sheetname = 'sheet1'

df_xls = pd.read_excel(filepath, sheet_name=sheetname, engine='openpyxl')

Upvotes: 0

Jimmar
Jimmar

Reputation: 4459

Generally Excel have a lot of restrictions when opening files (can't open the same file twice, can't open 2 different files with the same name ..etc).
I don't have excel on machine to test, but checking the docs for read_excel I've noticed that it allows you to set the engine.
from the stack trace you posted it seems like the error is thrown by xlrd which is the default engine used by pandas.

try using any of the other ones

Supported engines: “xlrd”, “openpyxl”, “odf”, “pyxlsb”, default “xlrd”.

so try with the rest, like

df = pd.read_excel(fPath, sheetname="linear strategy", index_col="date", parse_dates=True, engine="openpyxl")

I know this is not a real answer, but you might want to submit a bug report to pandas or xlrd teams.

Upvotes: 6

Hermes Morales
Hermes Morales

Reputation: 637

I fix this error simply closing the .xlsx file that was open.

Upvotes: 3

Vivs
Vivs

Reputation: 475

Mostly there is no issues in your code. [ If you publish the code it will be easier.] You need to change the permissions of the directory you are using so that all users have read and write permissions.

Upvotes: 0

GACharala
GACharala

Reputation: 39

I would suggest using the xlwings module instead which allows for greater functionality.

Firstly, you will need to load your workbook using the following line:

If the spreadsheet is in the same folder as your python script:

import xlwings as xw
workbook = xw.Book('myfile.xls')

Alternatively:

workbook = xw.Book('"C:\Users\...\myfile.xls')

Then, you can create your Pandas DataFrame, by specifying the sheet within your spreadsheet and the cell where your dataset begins:

df = workbook.sheets[0].range('A1').options(pd.DataFrame, 
                                            header=1,
                                            index=False, 
                                            expand='table').value

When specifying a sheet you can either specify a sheet by its name or by its location (i.e. first, second etc.) in the following way:

workbook.sheets[0] or workbook.sheets['sheet_name']

Lastly, you can simply install the xlwings module by using Pip install xlwings

Upvotes: 2

sara
sara

Reputation: 45

You can set engine='python' then you can run it even if the file is open

df = pd.read_excel(filename, engine = 'python')

Upvotes: -3

SKhan2312
SKhan2312

Reputation: 355

You may also want to check if the file has a password? Alternatively you can open the file with the password required using the code below:

import sys
import win32com.client
xlApp = win32com.client.Dispatch("Excel.Application")
print "Excel library version:", xlApp.Version
filename, password = <-- enter your own filename and password
xlwb = xlApp.Workbooks.Open(filename, Password=password) 
# xlwb = xlApp.Workbooks.Open(filename)
xlws = xlwb.Sheets([insert number here]) # counts from 1, not from 0
print xlws.Name
print xlws.Cells(1, 1) # that's A1

Upvotes: -2

coreyb
coreyb

Reputation: 289

You can set engine = 'xlrd', then you can run the code while Excel has the file open.

df = pd.read_excel(filename, sheetname, engine = 'xlrd')

You may need to pip install xlrd if you don't have it

Upvotes: -2

Related Questions