Dr Sokoban

Reputation: 1638

Keras: Generator runs out of data when starting second epoch

I have the following generator:

import numpy as np

def customGenerator(generator, indexes):
    for i in indexes:
        x, y = generator[i]
        yield (np.squeeze(x),
               {'outputsA': y[:, 4:6], 'outputsB': y[:, 11:],
                'outputsC': y[:, 10]})

and then the lines that train the model (I am omitting some lines that are unrelated to the problem):

randomize = np.arange( len(generator) )
np.random.shuffle(randomize)
trainLimit = int( 0.9*len(generator) )

model.fit(x=customGenerator(generator, randomize[:trainLimit]), y=None,
          validation_data=customGenerator(generator, randomize[trainLimit:]),
          epochs=1000, steps_per_epoch=trainLimit)

Setting steps_per_epoch to None (or just removing this argument) produces the same error.

This code works well during the first epoch, but when starting the second epoch it says it ran out of data:

Epoch 1/1000
2534/2534 [==============================] - 1124s 443ms/step - loss: 20.3274 - outputsA_loss: 8.2611 - outputsB_loss: 11.8572 - outputsC_loss: 0.2091 - val_loss: 11.4947 - val_outputsA_loss: 3.3958 - val_outputsB_loss: 7.9044 - val_outputsC_loss: 0.1945
Epoch 2/1000
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 2534000 batches). You may need to use the repeat() function when building your dataset.

This warning is not just a warning; it stops execution completely.

It seems that the generator is only run through once, whereas I expected it to be restarted at the start of each epoch.

I don't really know how to achieve that.

I could create an input array that repeats the original data 1000 times, but that would use a lot of memory. There has to be a way to tell Keras to restart the generator at every epoch, but I don't know how.

Upvotes: 1

Views: 1180

Answers (1)

Feodoran
Feodoran

Reputation: 1822

The generator stops once its for loop is exhausted. To repeat the data indefinitely, wrap the for loop in a while loop:

def customGenerator(generator, indexes):
    while True:
        # np.random.shuffle shuffles in place and returns None,
        # so don't assign its result back to indexes.
        np.random.shuffle(indexes)  # reshuffle every new epoch
        for i in indexes:
            x, y = generator[i]
            yield (np.squeeze(x),
                   {'outputsA': y[:, 4:6], 'outputsB': y[:, 11:],
                    'outputsC': y[:, 10]})
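
To sanity-check that a `while True` wrapper really does keep yielding across epoch boundaries, here is a minimal, Keras-free sketch; the batch list and the `repeating_generator` name are made up for illustration:

```python
import itertools
import numpy as np

# Hypothetical stand-in for the Keras Sequence: four (x, y) batches.
fake_batches = [(np.zeros((1, 2)), np.arange(12).reshape(1, 12))
                for _ in range(4)]

def repeating_generator(batches, indexes):
    # The while loop makes the generator infinite, so the consumer can
    # keep pulling batches every epoch instead of exhausting it in one pass.
    while True:
        np.random.shuffle(indexes)  # shuffles in place and returns None
        for i in indexes:
            yield batches[i]

gen = repeating_generator(fake_batches, np.arange(len(fake_batches)))

# Drawing more batches than a single pass contains shows that it repeats:
drawn = list(itertools.islice(gen, 10))
print(len(drawn))  # 10, i.e. well past the 4 batches of one pass
```

Note that `itertools.islice` is needed here because `list(gen)` on an infinite generator would never return, which is exactly the property Keras relies on for multi-epoch training.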

Upvotes: 4
