Pro_gram_mer

Reputation: 817

Why does Keras fit_generator() load data before actually "training"?

Sorry to bother you.

I am confused by the Keras function fit_generator().

I use a custom generator to produce (image, seg_image) pairs for training. If you look carefully, you can see that I put a print(path) inside the get_seg() function; path is just the path the image is read from. I added it because I would like to know how fit_generator() gets its data from the generator.

# import all the stuff (np is numpy, cycle is itertools.cycle)
def get_seg(path):  # other parameters elided
    print(path)  # to track when this function is called
    return seg_image  # for training

# pre-process an image
def getimage(*params):  # parameters elided
    # do something to the image
    return image  # for training

def data_generator():
    # load all the data for training
    zipped = cycle(zip(images, segmentations))
    while True:
        X = []
        Y = []
        for _ in range(batch_size):
            im, seg = next(zipped)
            X.append(getimage(...))  # arguments elided
            Y.append(get_seg(...))   # arguments elided
        # one yield per batch, after the inner loop has filled X and Y
        yield np.array(X), np.array(Y)

# create a generator
G = data_generator()  # parameters elided
# start training
for ep in range(epochs):
    m.fit_generator(G, steps_per_epoch=512,
                    epochs=1, workers=1)

When I start training, I get a really unexpected result. The terminal first prints 24 paths, which are read through my custom data_generator:

data/train/0000_mask.png
data/train/0001_mask.png
data/train/0002_mask.png
data/train/0003_mask.png
data/train/0004_mask.png
data/train/0005_mask.png
data/train/0006_mask.png
data/train/0007_mask.png
data/train/0008_mask.png
data/train/0009_mask.png
data/train/0010_mask.png
data/train/0011_mask.png
data/train/0012_mask.png
data/train/0013_mask.png
data/train/0014_mask.png
data/train/0015_mask.png
data/train/0016_mask.png
data/train/0017_mask.png
data/train/0018_mask.png
data/train/0019_mask.png
data/train/0020_mask.png
data/train/0021_mask.png
data/train/0022_mask.png
data/train/0023_mask.png

And then, I believe, the training starts here:

1/512 [..............................] - ETA: 2:14:34 - loss: 2.5879 - acc: 0.1697

Then it loads more data (images) again:

data/train/0024_mask.png
data/train/0025_mask.png

After 512 steps (steps_per_epoch), when the next round of training begins, it again prints the next 24 paths before training starts.

I would like to know why this happens. Is this how Keras works, loading data before actually passing it through the network? Or am I misunderstanding something, or missing some basic knowledge?

Upvotes: 3

Views: 645

Answers (1)

Daniel Möller

Reputation: 86600

Yes, this is how Keras works.

Training and loading are two parallel actions. One does not see how the other is going.

The fit_generator method has a max_queue_size argument, which defaults to 10. This means the generator loads data at full speed until the queue is full, so you are loading many batches of images in advance. This is a good thing: it keeps the model from being slowed down by loading.

And the training just checks: are there items in the queue? Good, train.
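To make the queue behaviour concrete, here is a minimal sketch using only the Python standard library. It is not Keras's actual implementation; batch_generator, producer and run_demo are hypothetical names, and the bounded queue.Queue only stands in for whatever fit_generator uses internally. It shows a loader thread running ahead of the "training" loop until the queue is full.

```python
import queue
import threading
import time

def batch_generator(log):
    """Hypothetical generator: records each batch index as it is produced."""
    i = 0
    while True:
        log.append(("load", i))
        yield i
        i += 1

def producer(gen, q, stop):
    """Fill the bounded queue as fast as possible; wait whenever it is full."""
    for item in gen:
        while not stop.is_set():
            try:
                q.put(item, timeout=0.05)
                break
            except queue.Full:
                continue
        if stop.is_set():
            return

def run_demo(max_queue_size=10, steps=3):
    log = []
    q = queue.Queue(maxsize=max_queue_size)  # plays the role of max_queue_size
    stop = threading.Event()
    worker = threading.Thread(
        target=producer, args=(batch_generator(log), q, stop), daemon=True
    )
    worker.start()
    time.sleep(0.2)  # the loader runs ahead before any "training" happens
    loaded_before_training = sum(1 for kind, _ in log if kind == "load")
    for _ in range(steps):
        batch = q.get()  # training only asks: is there an item in the queue?
        log.append(("train", batch))
    stop.set()
    worker.join()
    return loaded_before_training, log

if __name__ == "__main__":
    loaded, log = run_demo()
    print("batches loaded before the first training step:", loaded)
```

In this sketch the loader has already produced at least max_queue_size batches before the first training step, which matches the burst of printed paths seen before the progress bar appears.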

You see more printed paths than batches because get_seg is called once per image inside the for loop, while yield is called only once per batch, outside that loop.
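The relation between printed paths and yielded batches can be checked with a small sketch. make_generator, the file names and the seen list are hypothetical stand-ins for the question's data_generator and its print(path) call; the batch size of 2 is chosen only for illustration.

```python
import itertools

def make_generator(paths, batch_size, seen):
    """Hypothetical stand-in for data_generator; 'seen' records each path loaded."""
    cycled = itertools.cycle(paths)
    while True:
        batch = []
        for _ in range(batch_size):
            path = next(cycled)
            seen.append(path)  # plays the role of print(path) in get_seg
            batch.append(path)
        yield batch  # one yield per batch, after the inner loop

seen = []
gen = make_generator(["a.png", "b.png", "c.png", "d.png"], batch_size=2, seen=seen)
batches = [next(gen) for _ in range(3)]
print(len(batches), len(seen))  # 3 batches, 6 paths
```

Each batch accounts for batch_size recorded paths, so the number of printed lines is the number of queued batches multiplied by the batch size.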

Upvotes: 2
