Outcast

Reputation: 5117

Import files in batches from a directory into a Python script

I want to load all the JPG images from a specific directory into my Python script, not all at once but 500 images at a time.

One possible solution is the following:

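from glob import glob

batch_size = 500
batch = []
# Collect image filenames and handle them 500 at a time
for filename in glob('Directory/*.jpg'):
    batch.append(filename)
    if len(batch) == batch_size:
        # apply an algorithm to this batch of filenames #
        batch = []
# any leftover filenames (fewer than 500) are still in batch here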

Is this correct?

Is there any more efficient way to do this?

Upvotes: 2

Views: 2138

Answers (2)

Cong Ma

Reputation: 11322

from glob import glob

batch_size = 500
filenames = glob(...)    # fill with your own details
nfiles = len(filenames)
nbatches, remainder = divmod(nfiles, batch_size)
for i in range(nbatches):   # or xrange() for Python 2
    batch = filenames[batch_size * i:batch_size * (i + 1)]
    do_something_with(batch)
if remainder:
    do_something_with(filenames[batch_size * nbatches:])

A version that uses a generator to take every N elements from a possibly non-ending iterable:

import glob


def every(thing, n):
    """every(ABCDEFG, 2) --> AB CD EF G"""
    toexit = False
    it = iter(thing)
    while not toexit:
        batch = []
        for i in range(n):   # or xrange() for Python 2
            try:
                batch.append(next(it))   # next(it) works on Python 2.6+ and 3
            except StopIteration:
                toexit = True
                break   # input exhausted, stop filling this batch
        if not batch:
            break
        yield batch


filenames_i = glob.iglob("...")
for batch in every(filenames_i, 500):
    do_something_with(batch)

This makes the loop over the batches more concise: it reduces to the single for batch in every(...) line in the snippet above, and the last batch, which may be shorter, needs no special handling.
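
The same generator idea can probably be written more compactly with itertools.islice from the standard library, which removes the manual try/except. This is only a sketch: the name batched_iter is made up for illustration, and the short sample list stands in for real filenames.

```python
from itertools import islice


def batched_iter(iterable, n):
    """Yield lists of up to n items from any iterable (hypothetical helper)."""
    it = iter(iterable)
    while True:
        # islice takes at most n items; the list is empty once `it` is exhausted
        batch = list(islice(it, n))
        if not batch:
            return
        yield batch


# Stand-in data instead of real filenames
sample = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
batches = list(batched_iter(sample, 2))
```

With real files, the iterator returned by glob.iglob would be passed in place of sample, with 500 as the batch size.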

Upvotes: 1

RandomBob

Reputation: 126

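from os import listdir
from os.path import join

directory = 'Directory'   # listdir() takes a directory path, not a glob pattern

# Full paths of all .jpg files in the directory
fnames = [join(directory, fname) for fname in listdir(directory) if fname.endswith('.jpg')]

batchsize = 500
l_index, r_index = 0, batchsize
batch = fnames[l_index:r_index]

while batch:
    for fname in batch:
        import_function(fname)   # placeholder for your own loading code
    l_index, r_index = r_index, r_index + batchsize
    batch = fnames[l_index:r_index]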
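
The index bookkeeping in the while loop above could also be expressed with a stepped range, which some may find easier to read. A minimal sketch, assuming the filenames are already in a list; slice_batches is a hypothetical helper name and the generated names are stand-in data.

```python
def slice_batches(items, batchsize):
    """Yield consecutive slices of at most batchsize items (hypothetical helper)."""
    # start takes the values 0, batchsize, 2 * batchsize, ...
    for start in range(0, len(items), batchsize):
        yield items[start:start + batchsize]


# Stand-in for the real list of filenames
names = ["img%d.jpg" % i for i in range(7)]
sizes = [len(b) for b in slice_batches(names, 3)]
```

Slicing past the end of a list is safe in Python, so the final batch simply comes out shorter.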

Upvotes: 3
