Styne J
Styne J

Reputation: 195

Copying files in python using shutil

I have the following directory structure:

 -mailDir
     -folderA
         -sub1
         -sub2
         -inbox
            -1.txt
            -2.txt
            -89.txt
            -subInbox
            -subInbox2
     -folderB
         -sub1
         -sub2
         -inbox
            -1.txt
            -2.txt
            -200.txt
            -577.txt

The aim is to copy all the txt files under inbox folder into another folder. For this I tried the below code

import os
from os import path
import shutil

rootDir = "mailDir"
destDir = "destFolder"

eachInboxFolderPath = []
for root, dirs, files in os.walk(rootDir):
    for dirName in dirs:
        if(dirName=="inbox"):
            eachInboxFolderPath.append(root+"\\"+dirName)

for ii in eachInboxFolderPath:
    for i in os.listdir(ii):
        shutil.copy(path.join(ii,i),destDir)

If the inbox directory only has .txt files then the above code works fine. Since the inbox folder under folderA directory has other sub directory along with .txt files, the code returns permission denied error. What I understood is shutil.copy won't allow to copy the folders.

The aim is to copy only the txt files in every inbox folder to some other location. If the file names are same in different inbox folder I have to keep both file names. How we can improve the code in this case ? Please note other than .txt all others are folders only.

Upvotes: 0

Views: 1464

Answers (3)

Sumit
Sumit

Reputation: 2426

Here you go:

import os
from os import path
import shutil

destDir = '<absolute-path>'
for root, dirs, files in os.walk(os.getcwd()):
    # Filter out only '.txt' files.
    files = [f for f in files if f.endswith('.txt')]
    # Filter out only 'inbox' directory.
    dirs[:] = [d for d in dirs if d == 'inbox']

for f in files:
    p = path.join(root, f)
    # print p
    shutil.copy(p, destDir)

Quick and simple.

sorry, I forgot the part where, you also need unique file names as well. The above solution only works for distinct file names in a single inbox folder.

For copying files from multiple inboxes and having a unique name in the destination folder, you can try this:

import os
from os import path
import shutil

sourceDir = os.getcwd()
fixedLength = len(sourceDir)

destDir = '<absolute-path>'

filteredFiles = []

for root, dirs, files in os.walk(sourceDir):
    # Filter out only '.txt' files in all the inbox directories.
    if root.endswith('inbox'):
        # here I am joining the file name to the full path while filtering txt files
        files = [path.join(root, f) for f in files if f.endswith('.txt')]
        # add the filtered files to the main list
        filteredFiles.extend(files)

# making a tuple of file path and file name
filteredFiles = [(f, f[fixedLength+1:].replace('/', '-')) for f in filteredFiles]

for (f, n) in filteredFiles:
    print 'copying file...', f
    # copying from the path to the dest directory with specific name
    shutil.copy(f, path.join(destDir, n))

print 'copied', str(len(filteredFiles)), 'files to', destDir

If you need to copy all files instead of just txt files, then just change the condition f.endswith('.txt') to os.path.isfile(f) while filtering out the files.

Upvotes: 0

Mason McGough
Mason McGough

Reputation: 544

Just remembered that I once wrote several files to solve this exact problem before. You can find the source code here on my Github.

In short, there are two functions of interest here:

  • list_files(loc, return_dirs=False, return_files=True, recursive=False, valid_exts=None)
  • copy_files(loc, dest, rename=False)

For your case, you could copy and paste these functions into your project and modify copy_files like this:

def copy_files(loc, dest, rename=False):
    # get files with full path
    files = list_files(loc, return_dirs=False, return_files=True, recursive=True, valid_exts=('.txt',))

    # copy files in list to dest
    for i, this_file in enumerate(files):
        # change name if renaming
        if rename:
            # replace slashes with hyphens to preserve unique name
            out_file = sub(r'^./', '', this_file)
            out_file = sub(r'\\|/', '-', out_file)
            out_file = join(dest, out_file)
            copy(this_file, out_file)
            files[i] = out_file
        else:
            copy(this_file, dest)

    return files

Then just call it like so:

copy_files('mailDir', 'destFolder', rename=True)

The renaming scheme might not be exactly what you want, but it will at least not override your files. I believe this should solve all your problems.

Upvotes: 0

Mason McGough
Mason McGough

Reputation: 544

One simple solution is to filter for any i that does not have the .txt extension by using the string endswith() method.

import os
from os import path
import shutil

rootDir = "mailDir"
destDir = "destFolder"

eachInboxFolderPath = []
for root, dirs, files in os.walk(rootDir):
    for dirName in dirs:
        if(dirName=="inbox"):
            eachInboxFolderPath.append(root+"\\"+dirName)

for ii in eachInboxFolderPath:
    for i in os.listdir(ii):
         if i.endswith('.txt'):
            shutil.copy(path.join(ii,i),destDir)

This should ignore any folders and non-txt files that are found with os.listdir(ii). I believe that is what you are looking for.

Upvotes: 2

Related Questions