Reputation: 566
I am trying to build a dataset for audio recognition and train a simple Keras sequential model on it.
This is the function I am using to create the model:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def dnn_model(input_shape, output_shape):
    model = keras.Sequential()
    model.add(keras.Input(input_shape))
    model.add(layers.Flatten())
    model.add(layers.Dense(512, activation="relu"))
    model.add(layers.Dense(output_shape, activation="softmax"))
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
                  metrics=['acc'])
    model.summary()
    return model
And I am generating my training data with this generator function:
def generator(x_dirs, y_dirs, hmm, sampling_rate, parameters):
    window_size_samples = tools.sec_to_samples(parameters['window_size'], sampling_rate)
    window_size_samples = 2**tools.next_pow2(window_size_samples)
    hop_size_samples = tools.sec_to_samples(parameters['hop_size'], sampling_rate)

    for i in range(len(x_dirs)):
        features = fe.compute_features_with_context(x_dirs[i], **parameters)
        praat = tools.praat_file_to_target(y_dirs[i],
                                           sampling_rate,
                                           window_size_samples,
                                           hop_size_samples,
                                           hmm)
        yield features, praat
The variables x_dirs and y_dirs contain lists of paths to the label files and audio files. In total I have 8623 files to train my model with. This is how I train my model:
def train_model(model, model_dir, x_dirs, y_dirs, hmm, sampling_rate, parameters, steps_per_epoch=10, epochs=10):
    model.fit(generator(x_dirs, y_dirs, hmm, sampling_rate, parameters),
              epochs=epochs,
              batch_size=steps_per_epoch)
    return model
My problem now is that if I pass all 8623 files, it uses all 8623 files to train the model in the first epoch and then complains after the first epoch that it needs steps_per_epoch * epochs batches to train the model.
I tested this with only 10 of the 8623 files (a sliced list), but then TensorFlow complains that 100 batches are needed.
So how do I make my generator yield data in a way that works best? I always thought that steps_per_epoch just limits the amount of data received per epoch.
Upvotes: 0
Views: 90
Reputation: 2453
The fit function is going to exhaust your generator; that is to say, once it has yielded all 8623 batches, it won't be able to yield any more.
You can solve the issue like this:
def generator(x_dirs, y_dirs, hmm, sampling_rate, parameters, epochs=1):
    for epoch in range(epochs):  # or while True:
        window_size_samples = tools.sec_to_samples(parameters['window_size'], sampling_rate)
        window_size_samples = 2**tools.next_pow2(window_size_samples)
        hop_size_samples = tools.sec_to_samples(parameters['hop_size'], sampling_rate)

        for i in range(len(x_dirs)):
            features = fe.compute_features_with_context(x_dirs[i], **parameters)
            praat = tools.praat_file_to_target(y_dirs[i],
                                               sampling_rate,
                                               window_size_samples,
                                               hop_size_samples,
                                               hmm)
            yield features, praat
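With that version of the generator you also need to tell fit how many yields make up one epoch, and drop the batch_size argument, since Keras does not accept it for generator input. A minimal sketch of how the training call could then look, assuming each yielded (features, praat) pair is treated as one batch (the variable name gen is just for illustration):

gen = generator(x_dirs, y_dirs, hmm, sampling_rate, parameters, epochs=epochs)
model.fit(gen,
          epochs=epochs,
          steps_per_epoch=len(x_dirs))  # one step per yielded (features, praat) pair

Alternatively, you could wrap the generator in a tf.data.Dataset with tf.data.Dataset.from_generator(...) and call .repeat() on it, so Keras can re-iterate the data itself instead of the generator having to count epochs.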
Upvotes: 1