Paolorossi

Reputation: 57

Delete copies of files in a folder

I have this folder.

Let's consider the files: sub-OAS30027_ses-d1300_run-01_T1w.nii.gz and sub-OAS30027_ses-d1300_run-02_T1w.nii.gz. They have the same initial part of the name, that is sub-OAS30027_ses-d1300.

I would like to write a Python script that keeps only one file from each group sharing the same prefix: one among those starting with sub-OAS30027_ses-d1300, one among those starting with sub-OAS30031_ses-d0427, and so on. It doesn't matter which file is kept, as long as there is just one per prefix.

This is because sub-OAS30027_ses-d1300_run-01_T1w.nii.gz and sub-OAS30027_ses-d1300_run-02_T1w.nii.gz are effectively copies, and I don't want them.

Could you help me?

Upvotes: 2

Views: 60

Answers (2)

Maxim

Reputation: 286

I tried to keep it as simple as possible. I hope this helps:

import os

directory = 'directory_name' # put in the directory you want to search through
duplicate_file_lst = []

# loop through directory files
for filename in os.listdir(directory):
   if filename.startswith("sub-OAS30027_ses-d1300"):
       duplicate_file_lst.append(filename)

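# Keep only the first file in the list and delete the rest
for file in duplicate_file_lst:
   if file != duplicate_file_lst[0]:
       # os.listdir returns bare names, so join them with the directory
       os.remove(os.path.join(directory, file))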
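
The loop above handles a single hard-coded prefix. As a rough sketch of how the same idea could cover every prefix in one pass, the snippet below groups names by the text before "_run-" and keeps the first name of each group. The helper names split_keep_and_remove and remove_duplicates are made up for illustration, and it assumes every file of interest contains "_run-" in its name; names without it are left alone.

```python
import os
from collections import defaultdict


def split_keep_and_remove(filenames):
    """Group names by the text before '_run-' and keep the first of each group."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        if "_run-" not in name:
            continue  # assumption: files outside the naming pattern are ignored
        prefix = name.split("_run-")[0]  # e.g. "sub-OAS30027_ses-d1300"
        groups[prefix].append(name)
    keep = [names[0] for names in groups.values()]
    remove = [name for names in groups.values() for name in names[1:]]
    return keep, remove


def remove_duplicates(directory, dry_run=True):
    """Delete the extra files in `directory`; with dry_run=True only report them."""
    _, remove = split_keep_and_remove(os.listdir(directory))
    for name in remove:
        if not dry_run:
            os.remove(os.path.join(directory, name))
    return remove
```

Calling remove_duplicates('directory_name') first with the default dry_run=True returns the names that would be deleted, so the list can be checked before running it again with dry_run=False.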

Upvotes: 1

Roshin Raphel

Reputation: 2709

Use the re and os modules:

PS: always keep a copy of the original files, so that they can be restored if something goes wrong.

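import os, re

files = os.listdir()  # files in the current working directory
seen = set()          # subject/session prefixes already encountered
for name in files:
    # capture e.g. "sub-OAS30027_ses-d1300", so subject and session must both match
    m = re.match(r'(sub-[^_]+_ses-[^_]+)_', name)
    if m:
        if m.group(1) not in seen:
            seen.add(m.group(1))
        else:
            os.remove(name)  # a file with this prefix has already been kept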
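
In the spirit of the PS about keeping the originals, one possible variation is to move the extra files into a backup folder instead of deleting them. This is only a sketch: the function name move_duplicates and the backup_dir argument are hypothetical, and the pattern assumes names of the form sub-<id>_ses-<id>_... as in the question.

```python
import os
import re
import shutil

# Assumed naming pattern: "sub-<subject>_ses-<session>_..."
PREFIX = re.compile(r"(sub-[^_]+_ses-[^_]+)_")


def move_duplicates(directory, backup_dir):
    """Move all but the first file of each subject/session into backup_dir."""
    os.makedirs(backup_dir, exist_ok=True)
    seen = set()   # prefixes for which one file has already been kept
    moved = []     # names that were moved to the backup folder
    for name in sorted(os.listdir(directory)):
        m = PREFIX.match(name)
        if not m:
            continue  # leave files outside the naming pattern untouched
        if m.group(1) in seen:
            shutil.move(os.path.join(directory, name),
                        os.path.join(backup_dir, name))
            moved.append(name)
        else:
            seen.add(m.group(1))
    return moved
```

Once the remaining files have been checked, the backup folder can be deleted by hand.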

Upvotes: 2
