NeStack

Reputation: 2014

Copy folders and subfolders, but only the first file in each subfolder, with Python

I have a file structure containing folders, subfolders, subsubfolders and so on. Only the deepest subsubfolders contain files. I would like to copy the folder structure without copying all the files: only the first file (or just any one file) from each of the subsubfolders should be copied. I noticed that shutil.copytree(src, dst) can do something similar, but I don't know how to restrict it to copying only the first file from each subsubfolder. Thanks for any advice on how to accomplish this!

My file structure:

folder1
  subfolder11
    subsubfolder111
      file1
      file2
      file3...
folder2
  subfolder21
    file1
    file2
    file3...

Desired structure:

folder1
  subfolder11
    subsubfolder111
      file1
folder2
  subfolder21
    file1

Upvotes: 0

Views: 1044

Answers (2)

NeStack

Reputation: 2014

Based on GuillaumeJ's answer one can generalize the copying to N files:

import os
import shutil

# source and target roots, as in GuillaumeJ's answer
p1 = r"C:\src_dir"
p2 = r"C:\target_dir"

# limit of files to copy per folder
N = 3

for path, folders, files in os.walk(p1):

    # folders without files yield no iterations here, so no explicit check is needed;
    # os.walk lists files in arbitrary order, so sort to make "first N" well defined
    for file_ in sorted(files)[:N]:

        src = os.path.join(path, file_)
        # mirror the folder's location relative to p1 under p2
        dst_folder = os.path.normpath(os.path.join(p2, os.path.relpath(path, p1)))

        # create the target dir if it doesn't exist
        os.makedirs(dst_folder, exist_ok=True)

        # copy the file, preserving its metadata
        shutil.copy2(src, os.path.join(dst_folder, file_))

Upvotes: 1

GuillaumeJ

Reputation: 66

I don't know whether copytree can be customized that much, but you can do it with os.walk by walking the folders yourself. Here is an example:

import os
import shutil

p1 = r"C:\src_dir"
p2 = r"C:\target_dir"

for path, folders, files in os.walk(p1):

    # skip folders that contain no files
    if not files:
        continue

    # os.walk lists files in arbitrary order; sort so "first" is well defined
    first = sorted(files)[0]

    src = os.path.join(path, first)
    # mirror the folder's location relative to p1 under p2
    dst_folder = os.path.normpath(os.path.join(p2, os.path.relpath(path, p1)))

    # create the target dir if it doesn't exist
    os.makedirs(dst_folder, exist_ok=True)

    # copy only the first file, preserving its metadata
    shutil.copy2(src, os.path.join(dst_folder, first))
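
As a side note on the copytree question: shutil.copytree accepts an ignore callable that receives each directory and the names of its entries and returns the names to skip. Below is a minimal sketch of that approach, assuming that keeping the alphabetically first N files per folder is acceptable. The names keep_first_n_files and copy_tree_first_files are hypothetical helpers made up for illustration and are not part of shutil.

```python
import os
import shutil


def keep_first_n_files(n=1):
    """Build an ignore callable for shutil.copytree (hypothetical helper)."""
    def _ignore(directory, names):
        # names holds every entry of `directory`; pick out the regular files
        files = sorted(
            name for name in names
            if os.path.isfile(os.path.join(directory, name))
        )
        # skip every file after the first n; subfolders are never skipped
        return set(files[n:])
    return _ignore


def copy_tree_first_files(src, dst, n=1):
    # copytree creates dst itself and fails if dst already exists
    shutil.copytree(src, dst, ignore=keep_first_n_files(n))
```

Calling copy_tree_first_files(p1, p2) would then mirror the tree with one file per folder. Unlike the os.walk loop, this also copies folders that contain no files, and copytree raises FileExistsError when the target already exists unless dirs_exist_ok=True is passed (Python 3.8+).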

Upvotes: 2
