duhaime

Reputation: 27594

Keras: Many batch sizes fail

I am working on generalizing the inputs to the sample variational autoencoder in the Keras repository, but seem to have made some elementary mistakes. In particular, only certain batch sizes work for the model below:

from keras.layers import Lambda, Input, Dense, Reshape
from keras.models import Model
from keras.losses import mse
from keras import backend as K
import numpy as np

# reparameterization trick
# instead of sampling from Q(z|X), sample epsilon = N(0,I)
# z = z_mean + sqrt(var) * epsilon
def sampling(args):
  z_mean, z_log_var = args
  batch = K.shape(z_mean)[0]
  dim = K.int_shape(z_mean)[1]
  # by default, random_normal has mean = 0 and std = 1.0
  epsilon = K.random_normal(shape=(batch, dim))
  return z_mean + K.exp(0.5 * z_log_var) * epsilon

# network parameters
original_dim = 45
input_shape = (original_dim, )
intermediate_dim = 512
latent_dim = 2

# VAE model = encoder + decoder
# build encoder model
inputs = Input(shape=input_shape, name='encoder_input')
x = Reshape((original_dim,))(inputs)
x = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim, name='z_mean')(x)
z_log_var = Dense(latent_dim, name='z_log_var')(x)
z = Lambda(sampling, output_shape=(latent_dim,), name='z')([z_mean, z_log_var])
encoder = Model(inputs, [z_mean, z_log_var, z], name='encoder')

# build decoder model
latent_inputs = Input(shape=(latent_dim,), name='z_sampling')
x = Dense(intermediate_dim, activation='relu')(latent_inputs)
x = Dense(original_dim, activation='sigmoid')(x)
outputs = Reshape(input_shape)(x)
decoder = Model(latent_inputs, outputs, name='decoder')

# instantiate VAE model
outputs = decoder(encoder(inputs)[2])
vae = Model(inputs, outputs, name='vae_mlp')
vae.add_loss(mse(inputs, outputs))
vae.compile(optimizer='adam')

x_train = np.random.rand(1000, 45)
vae.fit(x_train, epochs=100, batch_size=10) # works, while 23 fails

Can anyone help me understand why some batch sizes fail (e.g. 23)? I'd be grateful for any insights others can offer on this question.

Upvotes: 1

Views: 180

Answers (1)

a-doering

Reputation: 1179

You currently get unequal batch sizes whenever len(x_train) % batch_size != 0: 1000 samples split evenly into batches of 10, but batches of 23 leave a final batch of only 11 samples (1000 = 43 * 23 + 11). You can solve your problem by changing your code to:

x_train = np.random.rand(1000, 45)
batch_size = 23
# x_train.shape[0] is the number of samples (1000); x_train.size would
# count every element (45000) and yield far too many steps per epoch
vae.fit(x_train, epochs=100, steps_per_epoch=x_train.shape[0]//batch_size)

This results in all batches having the same size; see the Keras documentation for fit and its arguments.
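If you would rather keep the plain batch_size argument, another option (a minimal sketch, assuming the vae model built in the question; the x_train_full name is mine) is to trim the data up front so the batch size divides it evenly:

import numpy as np

x_train = np.random.rand(1000, 45)
batch_size = 23

# keep only as many samples as fit into full batches: 43 * 23 = 989
n_full = (x_train.shape[0] // batch_size) * batch_size
x_train_full = x_train[:n_full]
assert x_train_full.shape[0] % batch_size == 0  # every batch is now equal

vae.fit(x_train_full, epochs=100, batch_size=batch_size)

This drops the 11 leftover samples instead of feeding them as a smaller final batch; shuffle x_train first if you want a different subset dropped each run.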

Upvotes: 1
