Reputation: 1801
I'm looking for help with the code below. I want to remove the files in a folder that are not listed in a given CSV file. I read the input file into a pandas DataFrame and convert the fileName column to a list. Then I read each file name from the folder and compare it with the list: if the file is listed, continue; if not, remove it. But it is removing all the files, including the ones that are listed in the CSV.
I only want to remove the files that are not present in the file I'm reading into the pandas DataFrame.
import os
import pandas as pd
path = "Adwords/"
flist = pd.read_csv('C:/mediaops/mapping/adword/file_name.csv')
file_name = flist['fileName'].tolist()
for filename in os.listdir(path):
    print(filename)
    if filename == file_name:
        continue
    elif filename != file_name:
        os.remove(filename)
Upvotes: 1
Views: 7706
Reputation: 169
When I implemented these answers, they deleted all files in the directory, not just the ones missing from my list. So I wrote a script for any weary traveler who may need it. You need to set the path to the directory holding your files and create a CSV file with the basenames of the files you want to keep. You can also set the extension of the files to look at, if they all happen to share the same one.
The script builds a list from the first column of the CSV and then checks whether each file in the directory is present in that list. If it is not, the file is removed.
import os
import csv
data_path = "/path/to/your/dir"
csv_guide = "filenamestokeep.csv"
csv_path = os.path.join(data_path, csv_guide)
# Extension of the files to look at, like .txt
ext = ".txt"
# Collect the first column of every non-empty row as a file name to keep
with open(csv_path, 'r', newline='') as csvfile:
    good_files = []
    for n in csv.reader(csvfile):
        if len(n) > 0:
            good_files.append(n[0])
print(good_files)
all_files = os.listdir(data_path)
for filename in all_files:
    # Only files with the chosen extension that are not in the list get deleted
    if filename.endswith(ext) and filename not in good_files:
        full_file_path = os.path.join(data_path, filename)
        print("File to delete: {} ".format(filename))
        os.remove(full_file_path)
    else:
        print(f"Ignored -- {filename}")
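If it helps, the CSV-reading step described above could be pulled into a small helper. This is only a sketch: the name load_keep_names is made up for illustration, and stripping surrounding whitespace from each name is my own assumption, not something the script above does.

```python
import csv

def load_keep_names(csv_path):
    """Return the set of non-empty first-column values from a CSV file.

    Hypothetical helper; whitespace stripping is an assumption.
    """
    keep = set()
    with open(csv_path, newline="") as csvfile:
        for row in csv.reader(csvfile):
            # Skip blank rows and rows whose first cell is empty.
            if row and row[0].strip():
                keep.add(row[0].strip())
    return keep
```

Note that csv.reader does not treat the first row specially, so a header row (if the CSV has one) ends up in the set too. That is harmless unless a file in the directory has that exact name.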
Upvotes: 0
Reputation: 20490
In your original solution, you are trying to do filename == file_name and filename != file_name, but that cannot work. filename is a string and file_name is a list, so comparing them with == is always False (and != is always True). You need to use the membership operators in and not in, like if filename not in file_name:, which is what I do in my answer below. (Thanks to Tobias's answer.)
With that out of the way, you can iterate through all files using os.listdir and remove the unwanted ones with os.remove, using os.path.join to get the full path of each file.
import os
# List all files in path
for filename in os.listdir(path):
    # If file is not present in list
    if filename not in file_name:
        # Get full path of file and remove it
        full_file_path = os.path.join(path, filename)
        os.remove(full_file_path)
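To see the difference between the two kinds of check in isolation, here is a tiny sketch. The sample values are made up; they only stand in for the CSV column and one directory entry.

```python
# Hypothetical sample data standing in for the CSV column and a directory entry.
file_name = ["a.csv", "b.csv"]
filename = "a.csv"

equal = filename == file_name        # str compared with list: never equal
not_equal = filename != file_name    # so this branch is always taken
is_member = filename in file_name    # membership test against the list items

print(equal, not_equal, is_member)
```

The != branch being taken for every file is why the loop in the question tries to remove everything.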
Upvotes: 4
Reputation: 82889
The problem is that file_name is a list of strings, whereas filename is a single string, so the check filename != file_name will always be true and the file will thus always be removed. Instead, use in and not in to check whether the string is (not) in the list of strings. Also, using a set would be faster. Also, those variable names are really confusing.
set_of_files = set(file_name)
for filename in os.listdir(path):
    if filename not in set_of_files:
        os.remove(os.path.join(path, filename))
As noted in Devesh's answer, the filename has to be joined to the path for os.remove to find the file, unless the script is run from inside that directory.
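Since a wrong check here deletes files for good, it may be worth wrapping the loop in a function with a dry-run switch. The following is a sketch, not a drop-in fix: remove_unlisted and its dry_run parameter are names I made up, and skipping subdirectories is an assumption about what you want.

```python
import os

def remove_unlisted(directory, keep_names, dry_run=True):
    """Delete regular files in `directory` whose names are not in `keep_names`.

    Returns the names that were removed (or would be, in a dry run).
    Hypothetical helper for illustration.
    """
    keep = set(keep_names)  # set gives fast membership checks
    removed = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        # Skip subdirectories and anything else that is not a regular file.
        if not os.path.isfile(full_path):
            continue
        if name not in keep:
            if not dry_run:
                os.remove(full_path)
            removed.append(name)
    return removed
```

Calling it first with the default dry_run=True and printing the result lets you check the list of candidates before running it again with dry_run=False.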
Upvotes: 2
Reputation: 11228
for filename in os.listdir(path):
    print(filename)
    if filename not in file_name:
        # Join with path so the file is found regardless of the working directory
        os.remove(os.path.join(path, filename))
Upvotes: 6