Why does building the same model in two different ways give different outputs?

Question

I'm having a really weird problem.

I'm building the same model in two different ways.
I checked the summaries (number of parameters) and plotted both models, and I see no difference.
Yet the models give different predictions (after training them on the same dataset).

What is the difference between the models? (I can't figure it out.)
How can I change the second model so that it is the same as the first model?

First model (the "source" model):

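    # Imports assume tf.keras; use the keras.* equivalents for standalone Keras.
    from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
    from tensorflow.keras.models import Model

    # dim_x, dim_y, dim_z and loss_func are defined elsewhere (not shown).
    input_img = Input(shape=(dim_x, dim_y, dim_z))

    # Encoder: three conv + max-pool stages, each halving the spatial size.
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoder = MaxPooling2D((2, 2), padding='same')(x)

    # Decoder: three conv + upsampling stages, each doubling the spatial size.
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoder)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    decoder = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)

    autoencoder = Model(input_img, decoder)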
    autoencoder.compile(optimizer='adam', loss=loss_func)

summary:

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_21 (Conv2D)           (None, 224, 224, 16)      448       
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 112, 112, 16)      0         
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 112, 112, 8)       1160      
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 56, 56, 8)         0         
_________________________________________________________________
conv2d_23 (Conv2D)           (None, 56, 56, 8)         584       
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 28, 28, 8)         0         
_________________________________________________________________
conv2d_24 (Conv2D)           (None, 28, 28, 8)         584       
_________________________________________________________________
up_sampling2d_9 (UpSampling2 (None, 56, 56, 8)         0         
_________________________________________________________________
conv2d_25 (Conv2D)           (None, 56, 56, 8)         584       
_________________________________________________________________
up_sampling2d_10 (UpSampling (None, 112, 112, 8)       0         
_________________________________________________________________
conv2d_26 (Conv2D)           (None, 112, 112, 16)      1168      
_________________________________________________________________
up_sampling2d_11 (UpSampling (None, 224, 224, 16)      0         
_________________________________________________________________
conv2d_27 (Conv2D)           (None, 224, 224, 3)       435       
=================================================================
Total params: 4,963
Trainable params: 4,963
Non-trainable params: 0

Second model (the model I want to make identical to the first model, built in a different way):

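    # Import assumes tf.keras; use keras.models for standalone Keras.
    from tensorflow.keras.models import Sequential

    # el1..el6, dl1..dl6 and output_layer are layer objects created
    # elsewhere; their definitions are not shown here.
    autoencoder = Sequential()
    autoencoder.add(el1)
    autoencoder.add(el2)
    autoencoder.add(el3)
    autoencoder.add(el4)
    autoencoder.add(el5)
    autoencoder.add(el6)
    autoencoder.add(dl1)
    autoencoder.add(dl2)
    autoencoder.add(dl3)
    autoencoder.add(dl4)
    autoencoder.add(dl5)
    autoencoder.add(dl6)
    autoencoder.add(output_layer)
    autoencoder.compile(optimizer='adam', loss=loss_func)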

summary:

 Layer (type)                 Output Shape              Param #   
=================================================================
input_3 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
conv2d_28 (Conv2D)           (None, 224, 224, 16)      448       
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 112, 112, 16)      0         
_________________________________________________________________
conv2d_29 (Conv2D)           (None, 112, 112, 8)       1160      
_________________________________________________________________
max_pooling2d_13 (MaxPooling (None, 56, 56, 8)         0         
_________________________________________________________________
conv2d_30 (Conv2D)           (None, 56, 56, 8)         584       
_________________________________________________________________
max_pooling2d_14 (MaxPooling (None, 28, 28, 8)         0         
_________________________________________________________________
conv2d_31 (Conv2D)           (None, 28, 28, 8)         584       
_________________________________________________________________
up_sampling2d_12 (UpSampling (None, 56, 56, 8)         0         
_________________________________________________________________
conv2d_32 (Conv2D)           (None, 56, 56, 8)         584       
_________________________________________________________________
up_sampling2d_13 (UpSampling (None, 112, 112, 8)       0         
_________________________________________________________________
conv2d_33 (Conv2D)           (None, 112, 112, 16)      1168      
_________________________________________________________________
up_sampling2d_14 (UpSampling (None, 224, 224, 16)      0         
_________________________________________________________________
conv2d_34 (Conv2D)           (None, 224, 224, 3)       435       
=================================================================
Total params: 4,963
Trainable params: 4,963
Non-trainable params: 0

Mateusz Dorobek · Accepted Answer

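You should set a random seed using tensorflow.set_random_seed(0) (in TensorFlow 2.x this is tensorflow.random.set_seed(0)) and numpy.random.seed(0). For NumPy the seed can be any int or 1-D array_like; TensorFlow expects an int. Set the seeds once in your code, before any model is built.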

Also make sure that shuffling is disabled: model.fit(data, shuffle=False).

After that, the random weight/parameter initialization and the data ordering will be reproducible across consecutive experiments and models.
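To make the mechanism concrete, here is a minimal, framework-free sketch in plain Python (it does not use Keras). The helpers init_weights and predict are hypothetical stand-ins for a layer's weight initializer and forward pass, and the two distinct seeds stand in for two unseeded runs. It only illustrates the idea that identical architectures with different initial weights give different outputs, while a shared seed gives identical ones.

```python
import random


def init_weights(n_inputs, n_outputs, rng):
    """Draw an n_outputs x n_inputs weight matrix from the generator rng."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_outputs)]


def predict(weights, inputs):
    """One linear layer: each output is a weighted sum of the inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


sample = [0.5, -1.0, 2.0]

# Two "models" with identical architecture but different initial weights.
# The distinct seeds stand in for two runs where no seed was set.
model_a = init_weights(3, 2, random.Random(1))
model_b = init_weights(3, 2, random.Random(2))

# A third model that reuses model_a's seed starts from identical weights.
model_c = init_weights(3, 2, random.Random(1))

print("a vs b equal:", predict(model_a, sample) == predict(model_b, sample))
print("a vs c equal:", predict(model_a, sample) == predict(model_c, sample))

# Data ordering behaves the same way: a seeded shuffle is repeatable.
order_1 = list(range(10))
order_2 = list(range(10))
random.Random(0).shuffle(order_1)
random.Random(0).shuffle(order_2)
print("same shuffle order:", order_1 == order_2)
```

In Keras the same roles are played by the layer initializers and by the shuffling done in fit, which is why both the seed and shuffle=False matter.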

Some randomness may still remain and lead to different results from run to run. It can come from other libraries that use their own random number generators (e.g., mnist_cnn.py does not give reproducible results).
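One way to check that initialization, rather than architecture, explains the gap is to copy the weights of one model into the other and compare predictions before any training. This check is not part of the answer above; it is a hedged sketch that assumes TensorFlow 2.x with tf.keras, and it uses a deliberately small stand-in network (the helper names and layer sizes are illustrative, not the asker's exact model).

```python
import numpy as np
from tensorflow.keras import layers, models


def build_layers():
    """Return fresh layers for a small conv autoencoder (illustrative sizes)."""
    return [
        layers.Conv2D(4, (3, 3), activation='relu', padding='same'),
        layers.MaxPooling2D((2, 2), padding='same'),
        layers.Conv2D(4, (3, 3), activation='relu', padding='same'),
        layers.UpSampling2D((2, 2)),
        layers.Conv2D(3, (3, 3), activation='sigmoid', padding='same'),
    ]


def build_functional(shape):
    """Build the network with the functional API."""
    inputs = layers.Input(shape=shape)
    x = inputs
    for layer in build_layers():
        x = layer(x)
    return models.Model(inputs, x)


def build_sequential(shape):
    """Build the same architecture with the Sequential API."""
    return models.Sequential([layers.Input(shape=shape)] + build_layers())


shape = (16, 16, 3)
functional = build_functional(shape)
sequential = build_sequential(shape)

# A fixed random batch so both models see exactly the same input.
batch = np.random.RandomState(0).rand(2, *shape).astype('float32')


def max_abs_diff(model_1, model_2):
    """Largest absolute difference between the two models' predictions."""
    out_1 = np.asarray(model_1.predict(batch, verbose=0))
    out_2 = np.asarray(model_2.predict(batch, verbose=0))
    return float(np.abs(out_1 - out_2).max())


# Independently initialized weights: predictions differ.
diff_before = max_abs_diff(functional, sequential)

# Copy the functional model's weights into the sequential model.
sequential.set_weights(functional.get_weights())
diff_after = max_abs_diff(functional, sequential)

print("max difference before copying weights:", diff_before)
print("max difference after copying weights:", diff_after)
```

If the difference vanishes once the weights are copied, the two construction styles are equivalent and only the random starting point (and any shuffling during training) differs.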
