whitebear

Reputation: 12433

Relationship between batch_size and data size

I have a basic question about batch_size.

For example, this simple RNN is trained on 128 samples.

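        # Imports were omitted in the original post; tf.keras is assumed here.
        import numpy as np
        from tensorflow.keras.models import Sequential
        from tensorflow.keras.layers import LSTM, Dense
        from tensorflow.keras.optimizers import Adam

        length_of_sequence = 3
        in_out_neurons = 5
        n_hidden = 128
        model = Sequential()
        model.add(LSTM(n_hidden, batch_input_shape=(None, length_of_sequence, in_out_neurons), return_sequences=True))
        model.add(Dense(in_out_neurons,activation="linear"))
        optimizer = Adam(learning_rate=0.001)  # `lr` is a deprecated alias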
        model.compile(loss="mean_squared_error", optimizer=optimizer)
        model.summary()
        train_x = np.zeros((128,3,5))
        train_y = np.zeros((128,1,5))
        model.fit(
            train_x,train_y,
            batch_size=30,
            epochs=10,
            validation_split=0.9
        )

This fit() call prints the output shown below.

The dataset has 128 samples and batch_size is 30, so I expected the progress counter to show something like 5/5 or 4/4. Am I wrong?

Instead, it shows 1/1.

Epoch 1/10
1/1 [==============================] - 2s 2s/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 2/10
1/1 [==============================] - 0s 33ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 3/10
1/1 [==============================] - 0s 32ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 4/10
1/1 [==============================] - 0s 33ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 5/10
1/1 [==============================] - 0s 46ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 6/10
1/1 [==============================] - 0s 34ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 7/10
1/1 [==============================] - 0s 34ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 8/10
1/1 [==============================] - 0s 38ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 9/10
1/1 [==============================] - 0s 28ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00
Epoch 10/10
1/1 [==============================] - 0s 26ms/step - loss: 0.0000e+00 - val_loss: 0.0000e+00

Upvotes: 0

Views: 92

Answers (1)

yakhyo

Reputation: 1656

The validation set is much larger than the training set: with 128 samples in total and validation_split=0.9, about 116 samples go to validation and only about 12 are left for training. With batch_size=30, those training samples fit into a single batch, which is why the progress counter shows 1/1 for training; the validation set would take 4 batches.

Change the code as follows to get 4/4 for training and 1/1 for validation:

model.fit(
     train_x,train_y,
     batch_size=30,
     epochs=10,
     validation_split=0.1
)
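
The arithmetic behind those step counts can be checked without training anything. The sketch below is not Keras code: `steps_per_epoch` is a hypothetical helper, and it assumes Keras keeps the first `int(n * (1 - validation_split))` samples for training and uses the remainder for validation, with the last partial batch counted as a step.

```python
import math

def steps_per_epoch(n_samples, batch_size, validation_split):
    """Return (n_train, n_val, train_steps, val_steps) for a fit() call.

    Assumption: training gets the first int(n * (1 - split)) samples and
    validation gets the rest; a partial final batch still counts as a step.
    """
    n_train = int(n_samples * (1.0 - validation_split))
    n_val = n_samples - n_train
    train_steps = math.ceil(n_train / batch_size)
    val_steps = math.ceil(n_val / batch_size)
    return n_train, n_val, train_steps, val_steps

# Settings from the question: most of the data goes to validation.
print(steps_per_epoch(128, 30, 0.9))  # (12, 116, 1, 4)

# Settings from this answer: most of the data goes to training.
print(steps_per_epoch(128, 30, 0.1))  # (115, 13, 4, 1)
```

The third value in each tuple is the denominator of the training progress counter, so 0.9 gives 1/1 and 0.1 gives 4/4.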

Upvotes: 1
