Reputation: 858
I needed a way to pull 10% of the files in a folder, at random, for sampling after every "run." Luckily, my current files are numbered sequentially. So my current method is to list the file names, parse the numerical portion, pull the max and min values, count the number of files and multiply by .1, then use random.sample to get a "random [10%] sample." I also write these names to a .txt file, then use shutil.copy to copy the actual files.
Obviously, this does not work if I have an outlier, i.e. if I have a file 345.txt among other files from 513.txt - 678.txt. I was wondering if there is a direct way to simply pull a number of files from a folder at random? I have looked it up and cannot find a better method.
Thanks.
Upvotes: 6
Views: 11756
Reputation: 2361
Based on Karl's solution (which did not work for me under Win 10, Python 3.x), I came up with this:
import numpy as np
import os
# List all files in dir
files = os.listdir("C:/Users/.../Myfiles")
# Select 0.5 of the files randomly, without picking any file twice
random_files = np.random.choice(files, int(len(files)*.5), replace=False)
# Get the remaining files
other_files = [x for x in files if x not in random_files]
# Do something with the files
for x in random_files:
    print(x)
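The same split can also be done with the standard library alone. The following is only a sketch: the function name split_sample and its parameters are hypothetical, and it assumes the names in the list are unique (true for the files of a single folder).

```python
import random


def split_sample(items, fraction, seed=None):
    """Return (picked, rest): a random fraction of items and the remainder."""
    rng = random.Random(seed)          # seed is optional, for repeatable runs
    items = list(items)
    k = int(len(items) * fraction)     # number of items to pick
    picked = rng.sample(items, k)      # sampling without replacement
    picked_set = set(picked)           # set lookup is faster than list lookup
    rest = [x for x in items if x not in picked_set]
    return picked, rest
```

Called as split_sample(os.listdir(folder), 0.5), it would return the sampled names and the remaining names as two lists.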
Upvotes: 0
Reputation: 858
I was unable to get the other methods to work easily with my code, but I came up with this.
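import os
import shutil
from random import sample

output_folder = 'C:/path/to/folder'
# files and subdir are defined elsewhere in my code; sample() never picks the same file twice
for to_copy in sample(files, int(len(files) * .1)):
    shutil.copy(os.path.join(subdir, to_copy), output_folder)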
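For anyone who wants this as a standalone piece, here is a minimal sketch of the copy step wrapped in a function. The name copy_random_sample and its parameters are hypothetical and not part of the code above. It assumes only regular files should be sampled and uses random.sample, so no file is chosen twice.

```python
import os
import random
import shutil


def copy_random_sample(src_dir, dst_dir, fraction=0.1, seed=None):
    """Copy a random fraction of the regular files in src_dir into dst_dir."""
    rng = random.Random(seed)  # seed is optional, for repeatable runs
    # Keep regular files only; sorting makes seeded runs reproducible.
    files = sorted(
        f for f in os.listdir(src_dir)
        if os.path.isfile(os.path.join(src_dir, f))
    )
    k = int(len(files) * fraction)
    chosen = rng.sample(files, k)  # without replacement
    os.makedirs(dst_dir, exist_ok=True)
    for name in chosen:
        shutil.copy(os.path.join(src_dir, name), dst_dir)
    return chosen
```

Because it works on whatever names os.listdir returns, it does not depend on the files being numbered sequentially, so an outlier such as 345.txt is handled like any other file.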
Upvotes: 2
Reputation: 2849
Using numpy.random.choice(array, N, replace=False) you can select N distinct items at random from an array. Without replace=False the same item can be picked more than once.
import numpy as np
import os
# list all files in dir
files = [f for f in os.listdir('.') if os.path.isfile(f)]
# select 0.1 of the files randomly, without repeats
random_files = np.random.choice(files, int(len(files)*.1), replace=False)
Upvotes: 11
Reputation: 118
You can use the following strategy:
Use files = os.listdir(path) to get all the files in the directory as a list of names.
Count the files with count = len(files).
Using count, you can get a random item position like this: random_position = random.randrange(count)
Then get the file name with files[random_position].
Use a for loop to repeat this for as many files as you need.
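The strategy above can be sketched roughly as follows. The function name pick_by_random_positions and the fraction parameter are hypothetical. The sketch collects positions in a set, which is an addition to the steps described, so that the same file is not returned twice.

```python
import os
import random


def pick_by_random_positions(path, fraction=0.1):
    """Pick a fraction of the names in path by drawing random list positions."""
    names = os.listdir(path)
    count = len(names)
    # Never ask for more positions than there are names.
    wanted = min(count, int(count * fraction))
    positions = set()
    while len(positions) < wanted:
        # randrange(count) yields 0 .. count-1, so every position is reachable.
        positions.add(random.randrange(count))
    return [names[i] for i in positions]
```

In practice random.sample(names, wanted) does the same job in one call; the loop is shown only because it mirrors the steps listed.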
Hope this helps!
Upvotes: 0
Reputation: 522
This will give you the list of names in the folder with mypath being the path to the folder.
from os import listdir
from os.path import isfile, join
from random import shuffle
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
shuffle(onlyfiles)  # shuffles the list in place and returns None
small_list = onlyfiles[:len(onlyfiles) // 10]
This should work
Upvotes: 2