DKM

Reputation: 1801

Python remove files from folder which are not in list

I'm looking for help with the code below. I want to remove the files in a folder that are not listed in a given CSV file. I read the input file into a pandas DataFrame and convert the fileName column to a list. Then I read each file name from the folder and compare it with the list: if it is listed, continue; if not, remove it. But the code removes all the files, not just the ones that are missing from the list.

I only want to remove the files that are not listed in the CSV file I'm reading with pandas.

import os
import pandas as pd

path = "Adwords/"

flist = pd.read_csv('C:/mediaops/mapping/adword/file_name.csv')

file_name = flist['fileName'].tolist()

for filename in os.listdir(path):
    print(filename)
    if filename == file_name:
        continue
    elif filename != file_name:
        os.remove(filename)

Upvotes: 1

Views: 7706

Answers (4)

Katie S

Reputation: 169

When I implemented these answers, they deleted all the files in the directory, not just the ones missing from my list. So I wrote a script for any weary traveler who may need it. You need to set the path to the directory holding your files and create a CSV file with the base names of the files you want to keep. You can also set the extension of the files to look at, if they all share the same one.

The script builds a list from the first column of the CSV, then checks whether each file in the directory is present in that list. If it is not, the file is removed.

import os
import csv

data_path = "/path/to/your/dir"
csv_guide = "filenamestokeep.csv"
csv_path = os.path.join(data_path, csv_guide)
# Extension of the files to look at, like .txt
ext = ".txt"

# Build the keep-list from the first column of the CSV, skipping empty rows
with open(csv_path, 'r', newline='') as csvfile:
    good_files = [row[0] for row in csv.reader(csvfile) if len(row) > 0]
print(good_files)

# Remove every file with the given extension that is not in the keep-list
for filename in os.listdir(data_path):
    if filename.endswith(ext) and filename not in good_files:
        full_file_path = os.path.join(data_path, filename)
        print("File to delete: {}".format(filename))
        os.remove(full_file_path)
    else:
        print(f"Ignored -- {filename}")
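If the CSV has a header row, as the fileName column in the question suggests, reading the first column blindly would also put the header text into the keep-list. Here is a minimal sketch of a pandas-free reader that picks the column by name. The helper name read_keep_list and the default column name are assumptions for illustration, not part of the script above.

```python
import csv


def read_keep_list(csv_path, column="fileName"):
    """Return the set of file names found in one named column of a CSV.

    Hypothetical helper: assumes the CSV has a header row containing `column`.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        # Skip rows where the column is missing or empty, and strip
        # stray whitespace so "a.txt " still matches "a.txt" on disk.
        return {row[column].strip() for row in reader if row.get(column)}
```

Returning a set rather than a list also makes the later "not in" check cheap when the keep-list is long.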

Upvotes: 0

Devesh Kumar Singh

Reputation: 20490

In your original solution you compare with filename == file_name and filename != file_name, but that cannot work.

filename is a string and file_name is a list, so == between them is never true. You need the membership operators in and not in, as in if filename not in file_name:, which is what I use in my answer below (thanks to Tobias's answer).

With that out of the way, you can iterate over all the files with os.listdir and remove the unwanted ones with os.remove, using os.path.join to build the full path of each file.

import os

# path and file_name are defined as in the question

# List all files in path
for filename in os.listdir(path):
    # If the file is not present in the list
    if filename not in file_name:
        # Get the full path of the file and remove it
        full_file_path = os.path.join(path, filename)
        os.remove(full_file_path)
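To make the comparison problem concrete, here is a small sketch of what each operator returns when a string meets a list. The names keep and name and the sample file names are made up for illustration; they stand in for file_name and filename from the question.

```python
# Hypothetical sample data standing in for the question's variables
keep = ["a.csv", "b.csv"]  # plays the role of file_name (a list)
name = "a.csv"             # plays the role of filename (a str)

print(name == keep)         # False: a str never equals a list
print(name != keep)         # True: so the elif branch runs for every file
print(name in keep)         # True: membership is the test that is wanted
print("c.csv" not in keep)  # True: this file would be removed
```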

Upvotes: 4

tobias_k

Reputation: 82889

The problem is that file_name is a list of strings, whereas filename is a single string, so the check filename != file_name is always true and every file gets removed. Instead, use in and not in to check whether the string is (or is not) in the list of strings. A set would also make the lookup faster. And those two variable names are really confusing.

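set_of_files = set(file_name)
for filename in os.listdir(path):
    if filename not in set_of_files:
        # join with path so the file inside that folder is the one removed
        os.remove(os.path.join(path, filename))

As noted in Devesh's answer, the filename has to be joined to the path; otherwise os.remove looks for the file in the current working directory instead of the folder.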
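Putting the set lookup and the path join together, one possible shape is a small function with a dry-run switch, so you can see what would be deleted before anything is actually removed. This is only a sketch: the function name remove_unlisted and the dry_run parameter are assumptions, and skipping subdirectories is a choice made here because os.remove cannot delete a directory.

```python
import os


def remove_unlisted(folder, keep_names, dry_run=True):
    """Remove files in `folder` whose names are not in `keep_names`.

    Hypothetical helper. Returns the names that were (or, in a dry run,
    would be) removed. Subdirectories are left alone.
    """
    keep = set(keep_names)  # set membership is O(1) on average
    removed = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        if not os.path.isfile(full_path):
            continue  # os.remove would fail on a directory
        if name not in keep:
            if not dry_run:
                os.remove(full_path)
            removed.append(name)
    return removed
```

With the question's variables, a first call could be remove_unlisted(path, file_name) to preview the result, followed by the same call with dry_run=False once the list looks right.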

Upvotes: 2

sahasrara62

Reputation: 11228

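for filename in os.listdir(path):
    print(filename)
    # file_name is the list from the question; test membership, not equality
    if filename not in file_name:
        # join with path so the file inside that folder is the one removed
        os.remove(os.path.join(path, filename))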

Upvotes: 6
