Leo
Leo

Reputation: 43

Is there a way to copy random files from source to destination?

I am working with 257 folders that contain images. I would like to copy 80% of the images of each of my folders and copy them into a new folder called Training. The 20% remaining images in each of the 257 folders would be copied to a new folder called Test. At the end, I would have two new folders Training and Test, where Training contains 80% of randomly selected images from each of my 257 folders while Test would have 20%.

Is there any Python function that can achieve this ? I did some research and all I found was the function shutil which copies all the files from a source to a destination folder.

Thank you

Upvotes: 2

Views: 366

Answers (2)

demian-wolf
demian-wolf

Reputation: 1858

You can use a script like the following:

import os
import shutil
import random

SOURCES = [...]  # autogenerated or predefined constant

TRAINING = "Training"
TEST = "Test"


def main():
    os.makedirs(TRAINING, exist_ok=True)
    os.makedirs(TEST, exist_ok=True)

    for src_dir in SOURCES:
        files = os.listdir(src_dir)
        random.shuffle(files)

        sep = round(len(files) * 0.8)

        for file in files[:sep]:
            shutil.copy(
                os.path.join(src_dir, file),
                TRAINING,
            )

        for file in files[sep:]:
            shutil.copy(
                os.path.join(src_dir, file),
                TEST,
            )


if __name__ == "__main__":
    main()

Beware that files with the same filename are silently overwritten when copied to the same destination.

Upvotes: 4

Leo
Leo

Reputation: 43

Here is the code I ended up using (Thanks to Demian for the input):

#import libraries
import random
import shutil
import os

# Creating a Training and Testing folders
os.makedirs('Training')
os.makedirs('Testing')

basepath='C:\\aaa\\bbb\\257_ObjectCategories\\'
folders=os.listdir(basepath)

for folder in folders: # loop over the 257 folders
   contents = os.listdir(os.path.join(basepath, folder)) 
   random.shuffle(contents)  # shuffle the result
   split_point = round(0.8* len(contents))
   for img in contents[:split_point]:
       shutil.copy(os.path.join(os.path.join(basepath, folder), img), os.path.join("Training", img))
   for img in contents[split_point:]:
       shutil.copy(os.path.join(os.path.join(basepath, folder), img), os.path.join("Testing", img))

Upvotes: 2

Related Questions