Rami.K

Reputation: 199

IOError: [Errno 2] No such file or directory: but the files are there...

If I print filename inside the for loop, I get every file name in the directory. Yet when I call pd.ExcelFile(filename), it reports that there is no file with that name (for the first file that ends with '.xlsx'). What am I missing? P.S.: the indentation is correct in my code (the if is under the for), but it doesn't display that way here.

for filename in os.listdir('/Users/ramikhoury/PycharmProjects/R/excel_files'):
if filename.endswith(".xlsx"):
    month = pd.ExcelFile(filename)
    day_list = month.sheet_names
    i = 0
    for day in month.sheet_names:
        df = pd.read_excel(month, sheet_name=day, skiprows=21)
        df = df.iloc[:, 1:]
        df = df[[df.columns[0], df.columns[4], df.columns[8]]]
        df = df.iloc[1:16]
        df['Date'] = day
        df = df.set_index('Date')
        day_list[i] = df
        i += 1

    month_frame = day_list[0]
    x = 1
    while x < len(day_list):
        month_frame = pd.concat([month_frame, day_list[x]])
        x += 1

    print filename + ' created the following dataframe: \n'
    print month_frame  # month_frame is the combination of the all the sheets inside the file in one dataframe !

Upvotes: 1

Views: 975

Answers (3)

Matthew Story

Reputation: 3783

The issue is that os.listdir returns bare file names, so pd.ExcelFile(filename) resolves each name relative to your current working directory, not the directory you listed. Rather than using os, it is probably better to use a higher-level interface like pathlib:

import pathlib
for file_name in pathlib.Path("/Users/ramikhoury/PycharmProjects/R/excel_files").glob("*.xlsx"):
    # file_name is a full path, so it can be opened from any working directory
    month = pd.ExcelFile(file_name)
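
To see the difference concretely, here is a minimal stdlib-only sketch. The helper name list_both_ways and the placeholder file january.xlsx are made up for illustration and are not part of the question's code. It lists the same directory with os.listdir and with Path.glob so the two kinds of result can be compared:

```python
import os
import pathlib
import tempfile


def list_both_ways(directory):
    """Return (bare names from os.listdir, full paths from Path.glob)."""
    bare_names = sorted(n for n in os.listdir(directory) if n.endswith(".xlsx"))
    full_paths = sorted(pathlib.Path(directory).glob("*.xlsx"))
    return bare_names, full_paths


# Demo in a throwaway directory holding one empty placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "january.xlsx").touch()
    bare_names, full_paths = list_both_ways(tmp)
    # os.listdir gives only the name, with no directory attached.
    print(bare_names)
    # Path.glob keeps the directory, so the path can be opened from anywhere.
    print(full_paths)
    # The bare name only resolves if the current directory happens to contain it.
    print(os.path.exists(bare_names[0]))
```

Passing a bare name to pd.ExcelFile fails for the same reason the last check usually prints False: the name is looked up in the current working directory.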

pathlib was added in Python 3.4, so if you are using an older version of Python, your best bet is the much older glob module, which works similarly:

import glob
for file_name in glob.glob("/Users/ramikhoury/PycharmProjects/R/excel_files/*.xlsx"):
    # file_name also includes the directory, so it can be opened directly
    month = pd.ExcelFile(file_name)

If for some reason you really need to use the low-level os interface, the best way to solve this is to use the optional dir_fd argument of os.open:

# open the target directory
dir_fd = os.open("/Users/ramikhoury/PycharmProjects/R/excel_files", os.O_RDONLY)
try:
    # pass the open file descriptor to the os.listdir method
    for file_name in os.listdir(dir_fd):
        # you could replace this with fnmatch.fnmatch
        if file_name.endswith(".xlsx"):
            # use the open directory fd as the `dir_fd` argument
            # this opens file_name relative to your target directory
            # ("rb" because .xlsx files are binary)
            with os.fdopen(os.open(file_name, os.O_RDONLY, dir_fd=dir_fd), "rb") as file_:
                # do excel bits here, e.g. hand the open file to pandas
                month = pd.ExcelFile(file_)
finally:
    # close the directory
    os.close(dir_fd)
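
The same dir_fd pattern can be sketched as a small stdlib-only function. The name read_xlsx_bytes is hypothetical, and it reads raw bytes instead of parsing Excel so that it runs without pandas. Because dir_fd and fd-based os.listdir are not available on every platform, the sketch checks os.supports_dir_fd and os.supports_fd first:

```python
import os
import tempfile


def read_xlsx_bytes(directory):
    """Return {file name: raw bytes} for each .xlsx file in `directory`.

    Hypothetical helper: files are opened relative to an open directory
    descriptor, so the current working directory never matters.
    """
    if os.open not in os.supports_dir_fd or os.listdir not in os.supports_fd:
        raise NotImplementedError("dir_fd is not supported on this platform")

    contents = {}
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        for name in os.listdir(dir_fd):
            if name.endswith(".xlsx"):
                # Open `name` relative to the directory descriptor.
                fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
                # fdopen takes ownership of fd and closes it on exit.
                with os.fdopen(fd, "rb") as file_:
                    contents[name] = file_.read()
    finally:
        os.close(dir_fd)
    return contents


# Demo with a throwaway directory and a fake (non-Excel) payload.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "january.xlsx"), "wb") as f:
        f.write(b"placeholder")
    print(read_xlsx_bytes(tmp))
```

In the real script, the file_.read() line is where the open file object would be handed to pandas instead.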

While you could accomplish this fix by changing directories at the top of your script (as suggested by another answer), this has the side-effect of changing the current working directory of your process which is often undesirable and may have negative consequences. To make this work without side-effects requires you to chdir back to the original directory:

# store cwd
original_cwd = os.getcwd()
try:
    os.chdir("/Users/ramikhoury/PycharmProjects/R/excel_files")
    # do your listdir, etc
finally:
    os.chdir(original_cwd)
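
If you do go the chdir route, the store-and-restore steps can be wrapped in a context manager so they are harder to forget. This is only a sketch: working_directory is a made-up helper name, and it has the same limitation as the snippet it wraps, since the restore step can still fail if the original directory has gone away.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """Temporarily chdir into `path`, then restore the previous cwd."""
    original_cwd = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, mirroring the try/finally version.
        os.chdir(original_cwd)


# Demo: bare names from os.listdir resolve only while inside the block.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "january.xlsx"), "wb").close()
    with working_directory(tmp):
        for name in os.listdir("."):
            print(name, os.path.exists(name))
```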

Note that this introduces a race condition into your code, as original_cwd may be removed or the access controls for that directory might be changed such that you cannot chdir back to it, which is precisely why dir_fd exists.

dir_fd was added in Python 3.3, so if you are using an older version of Python I would recommend just using glob rather than the chdir solution.

For more on dir_fd see this very helpful answer.

Upvotes: 1

Raj Josyula

Reputation: 150

Your "if" statement must be inside the for loop

Upvotes: 1

Mia

Reputation: 2676

The problem is that your working directory is not the same as the directory you are listing. Since you know the absolute path of the directory, the easiest solution is to add os.chdir('/Users/ramikhoury/PycharmProjects/R/excel_files') to the top of your file.

Upvotes: 3
