user8449270

python check if the folder content existed

The purpose of this code is to:

Read a CSV file that contains a column with a list of file names.

Here is the CSV file:

https://drive.google.com/open?id=0B5bJvxM9TZkhVGI5dkdLVzAyNTA

Then check a specific folder to see whether those files exist.

If a file in the folder is not in the list, delete it.

Here is the code:

import pandas as pd
import os.path

data = pd.read_csv('data.csv')
names = data['title']
path = "C:\\Users\\Sayed\\Desktop\\Economic Data"

for file in os.listdir(path):
    os.path.exists(file)
    print(file)
    file = os.path.join(path, file)
    fileName = os.path.splitext(file)

    if fileName not in names:
        print('error')
        os.remove(file)

I modified my first version, and this is the new code. It runs without errors, but it simply deletes all the files in the directory.

Upvotes: 1

Views: 376

Answers (2)

Ilja

Reputation: 2114

Your path is the return value of the os.chdir() call, which is None.
Set path to the string that represents the directory and leave the chdir call out.
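
A quick way to see this for yourself is the minimal sketch below. The helper name chdir_return_value and the use of the system temp folder are stand-ins for illustration, not part of the original code.

```python
import os
import tempfile


def chdir_return_value(directory):
    """Return whatever os.chdir() returns, then restore the old working directory."""
    original = os.getcwd()
    try:
        return os.chdir(directory)
    finally:
        os.chdir(original)


folder = tempfile.gettempdir()        # stand-in for the real data folder
path = chdir_return_value(folder)     # what assigning the chdir result stores
print(path)                           # None

path = folder                         # keep the path string itself instead
print(os.path.isdir(path))            # True
```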

Upvotes: 1

cs95

Reputation: 402523

os.chdir does not return anything, so assigning its result to path means that path is None, which causes the error.

Since you're using pandas, here's a little trick to speed this up using pd.Series.isin.

root = r"C:\Users\Sayed\Desktop\Economic Data"  # raw string, so "\U" is not parsed as an escape
files = os.listdir(root)

for f in data.loc[~data['title'].isin(files), 'title'].tolist():
    try:
        os.remove(os.path.join(root, f))
    except OSError:
        pass

Added a try-except check in accordance with EAFP (since I'm not doing an os.path.exists check here). Alternatively, you could add a filter based on existence using pd.Series.apply:

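# Check existence inside root, not relative to the current working directory.
exists = data['title'].apply(lambda t: os.path.exists(os.path.join(root, t)))
m = ~data['title'].isin(files) & exists

for f in data.loc[m, 'title'].tolist():
    os.remove(os.path.join(root, f))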
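
As for the code currently posted in the question, two things make the `not in` test always true, which is why every file gets deleted. First, os.path.splitext returns a (root, extension) tuple built from the full path, not a bare file name. Second, `in` on a pandas Series checks the index labels rather than the values. Below is a minimal sketch of the stated goal (delete files in the folder that are not in the list) in plain Python. The function name remove_unlisted and the dry_run flag are made up for illustration. I am also assuming a title may be stored with or without its extension, since the CSV contents are not shown here.

```python
import os


def remove_unlisted(folder, titles, dry_run=True):
    """Select files in `folder` whose name is not in `titles`.

    Returns the sorted list of selected file names.
    With dry_run=True nothing is deleted.
    """
    allowed = {str(t) for t in titles}  # compare against the values themselves
    selected = []
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)
        if not os.path.isfile(full_path):
            continue  # leave sub-folders alone
        stem = os.path.splitext(name)[0]  # splitext returns (stem, extension)
        # Assumption: a title may appear with or without the file extension.
        if stem in allowed or name in allowed:
            continue
        selected.append(name)
        if not dry_run:
            os.remove(full_path)
    return sorted(selected)
```

Iterating over a Series yields its values, so remove_unlisted(path, data['title']) should work as a dry run. Check the returned list first, then pass dry_run=False to actually delete.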

Upvotes: 2
