craft

Reputation: 555

Save n previous weights of training

I want to write a custom layer in TensorFlow that keeps the layer's previous weights, one copy for each of the last n epochs. Where in the custom layer would I do that?

For example, this is what a custom layer looks like in the TensorFlow examples:

class MyDenseLayer(tf.keras.layers.Layer):
  def __init__(self, num_outputs):
    super(MyDenseLayer, self).__init__()
    self.num_outputs = num_outputs

  def build(self, input_shape):
    self.kernel = self.add_weight(name="kernel",
                                  shape=[int(input_shape[-1]),
                                         self.num_outputs])

  def call(self, input):
    return tf.matmul(input, self.kernel)

Upvotes: 2

Views: 801

Answers (2)

Here are my approaches:

1- Using ModelCheckpoint callback

Here I create a feedforward neural network and use tf.keras.callbacks.ModelCheckpoint to save the model at the end of every epoch. Afterwards, I load one of the saved models and read its weights.

-First, let's create a simple feedforward neural network (any other layer or model works the same way):

import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras import Sequential
from tensorflow.keras.callbacks import ModelCheckpoint
from sklearn.datasets import make_blobs
from tensorflow.keras.utils import to_categorical


model = Sequential()

model.add(Dense(units=3 , input_dim=5 , activation='relu', name='Dense_1'))
model.add(Dense(units=2 , activation='softmax', name='Dense_2'))

model.summary()

-Here is the summary output:

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Dense_1 (Dense)              (None, 3)                 18        
_________________________________________________________________
Dense_2 (Dense)              (None, 2)                 8         
=================================================================
Total params: 26
Trainable params: 26
Non-trainable params: 0
_________________________________________________________________

-Creating dummy data for training the model:

train_x , train_y = make_blobs(n_samples=1000, centers=2, n_features=5)
train_y = to_categorical(train_y,2)

-Compile and train the model:

LOG_DIRECTORY = './stackoverflow/'

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model_checkpoint = ModelCheckpoint(LOG_DIRECTORY+'weights{epoch:03d}.h5', 
                                     save_freq='epoch',
                                     verbose=1)

model.fit(train_x, train_y,
          batch_size=32,
          epochs=100,
          verbose=1,
          callbacks=[model_checkpoint]
          )

-Let's see some of the fit outputs:

Epoch 1/100
32/32 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 1.0000

Epoch 00001: saving model to ./stackoverflow/weights001.h5
Epoch 2/100
32/32 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 1.0000

Epoch 00002: saving model to ./stackoverflow/weights002.h5
Epoch 3/100
32/32 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 1.0000

Epoch 00003: saving model to ./stackoverflow/weights003.h5
Epoch 4/100
32/32 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 1.0000

Epoch 00004: saving model to ./stackoverflow/weights004.h5

As the output above shows, the ModelCheckpoint callback saves the model after every epoch.

-Loading a saved model:

from tensorflow.keras.models import load_model

saved_model = load_model(LOG_DIRECTORY+'weights097.h5')

-Printing a layer's weight:

print(saved_model.layers[0].weights)

In your case, use the index of your desired layer.

-Output:

The output is a list of tf.Variable objects holding the layer's kernel and bias:

[<tf.Variable 'Dense_1/kernel:0' shape=(5, 3) dtype=float32, numpy=
 array([[ 0.44274166, -0.46638554, -0.40543374],
        [-0.81307524, -0.43660507, -0.51048666],
        [-0.69864446,  0.37800577, -0.06189097],
        [-0.12871675,  0.36555207,  0.6326951 ],
        [ 0.13829602,  0.56905323,  0.09383805]], dtype=float32)>,
 <tf.Variable 'Dense_1/bias:0' shape=(3,) dtype=float32, numpy=array([-0.02371155, -0.06548308,  0.17505823], dtype=float32)>]

-If you want them as NumPy arrays:

print(saved_model.layers[0].kernel.numpy())
print(saved_model.layers[0].bias.numpy())

-Output:

array([[ 0.44274166, -0.46638554, -0.40543374],
       [-0.81307524, -0.43660507, -0.51048666],
       [-0.69864446,  0.37800577, -0.06189097],
       [-0.12871675,  0.36555207,  0.6326951 ],
       [ 0.13829602,  0.56905323,  0.09383805]], dtype=float32)

array([-0.02371155, -0.06548308,  0.17505823], dtype=float32)

This approach saves the whole model. If you only want to save one layer's weights, you can write a custom callback instead.
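To get the weights of the last n epochs with this approach, one option is to rebuild the checkpoint file names and load each file in turn. Below is a minimal sketch of that idea. The helper name last_n_checkpoint_paths is made up for illustration, and it assumes the 'weights{epoch:03d}.h5' naming pattern used above.

```python
import os


def last_n_checkpoint_paths(log_dir, last_epoch, n):
    """Return the checkpoint paths for the last n epochs, oldest first.

    Assumes files are named weights001.h5, weights002.h5, ... as produced
    by the ModelCheckpoint pattern 'weights{epoch:03d}.h5'.
    """
    # Epoch numbers in the file names start at 1
    first_epoch = max(1, last_epoch - n + 1)
    return [os.path.join(log_dir, f"weights{epoch:03d}.h5")
            for epoch in range(first_epoch, last_epoch + 1)]


paths = last_n_checkpoint_paths("./stackoverflow/", last_epoch=100, n=3)
print(paths)
```

Each path can then be passed to load_model, and the layer's weights read with saved_model.layers[0].kernel.numpy() as shown earlier.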

2- Using a custom callback

I create a callback by subclassing tensorflow.keras.callbacks.Callback. For now it only prints the weights of one layer at the end of each epoch, but you can add code to save those weights with pickle or NumPy.

from tensorflow.keras.callbacks import Callback
import pickle
import numpy as np 


class CustomCallback(Callback):

    def __init__(self, save_path='./logDir', layer_index=0):
        # initialise the base Callback before storing our own settings
        super().__init__()
        self.save_path = save_path
        self.layer_index = layer_index

    def on_epoch_end(self, epoch, logs=None):

        # access the weights of the selected layer
        layer_weights = self.model.layers[self.layer_index].weights

        # Do some printing
        print(f'\n\nIn the custom callback, Epoch {epoch}: ')
        print(f'First layer weights: \n{layer_weights}')
        print('\n\n')

        # get weights in the numpy array format
        weights = self.model.layers[self.layer_index].kernel.numpy()
        biases = self.model.layers[self.layer_index].bias.numpy()

        # Now here you can use numpy or pickle to save the weights

        # using pickle, for example:
        # with open(f'{self.save_path}/weights_{epoch:03d}.pkl', 'wb') as f:
        #     pickle.dump((weights, biases), f)

        # using numpy, for example:
        # np.save(f'{self.save_path}/kernel_{epoch:03d}.npy', weights)

-Let's see this callback in action:

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

custom_callback = CustomCallback()

model.fit(train_x, train_y,
          batch_size=32,
          epochs=100,
          verbose=1,
          callbacks=[custom_callback]
          )

-Some of the fit outputs:

Epoch 1/100
32/32 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 1.0000


In the custom callback, Epoch 0: 
First layer weights: 
[<tf.Variable 'Dense_1/kernel:0' shape=(5, 3) dtype=float32, numpy=
array([[ 0.270913  , -0.52936906, -0.703977  ],
       [-0.9254448 , -0.4501195 , -0.6954986 ],
       [-0.84005284,  0.34274203, -0.3004068 ],
       [-0.24770388,  0.34638566,  0.43633664],
       [ 0.2538888 ,  0.5864706 ,  0.28424913]], dtype=float32)>, <tf.Variable 'Dense_1/bias:0' shape=(3,) dtype=float32, numpy=array([ 0.02709604, -0.07876156,  0.2504825 ], dtype=float32)>]



Epoch 2/100
32/32 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 1.0000


In the custom callback, Epoch 1: 
First layer weights: 
[<tf.Variable 'Dense_1/kernel:0' shape=(5, 3) dtype=float32, numpy=
array([[ 0.27090982, -0.52937293, -0.703984  ],
       [-0.92544633, -0.4501213 , -0.6955019 ],
       [-0.84005505,  0.3427393 , -0.3004116 ],
       [-0.24770552,  0.3463836 ,  0.43633294],
       [ 0.25389025,  0.5864727 ,  0.28425238]], dtype=float32)>, <tf.Variable 'Dense_1/bias:0' shape=(3,) dtype=float32, numpy=array([ 0.02709637, -0.07876115,  0.2504833 ], dtype=float32)>]



Epoch 3/100
32/32 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 1.0000
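To keep only the last n epochs on disk, the saving step can be combined with a small clean-up. The following is a rough sketch rather than a definitive implementation. The class name SaveLastNLayerWeights, the parameter n, the saved_paths attribute and the layer_weights_XXX.npz file names are all assumptions made for this example.

```python
import os

import numpy as np
import tensorflow as tf


class SaveLastNLayerWeights(tf.keras.callbacks.Callback):
    """Save one layer's weights every epoch, keeping only the newest n files."""

    def __init__(self, save_dir="./logDir", layer_index=0, n=3):
        super().__init__()
        self.save_dir = save_dir
        self.layer_index = layer_index
        self.n = n
        self.saved_paths = []  # oldest first

    def on_epoch_end(self, epoch, logs=None):
        os.makedirs(self.save_dir, exist_ok=True)

        # get_weights() returns NumPy arrays, e.g. [kernel, bias] for Dense
        arrays = self.model.layers[self.layer_index].get_weights()

        # epoch is zero-based here, so add 1 for a human-friendly file name
        path = os.path.join(self.save_dir, f"layer_weights_{epoch + 1:03d}.npz")
        np.savez(path, *arrays)  # stored under the keys arr_0, arr_1, ...
        self.saved_paths.append(path)

        # delete files that have fallen out of the window of n epochs
        while len(self.saved_paths) > self.n:
            os.remove(self.saved_paths.pop(0))
```

It is passed to model.fit in the same way as CustomCallback above, for example callbacks=[SaveLastNLayerWeights(n=3)]. After training, np.load(callback.saved_paths[-1])["arr_0"] gives the most recent kernel.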


Upvotes: 4

Andrey

Reputation: 6387

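The call() method runs for every batch.

Add a counter in __init__() and save inside call() once a full epoch's worth of batches has passed, where batches_per_epoch is the number of batches in one epoch. Note that a plain Python counter only advances on every batch when the model runs eagerly (for example with run_eagerly=True in model.compile()); inside a compiled graph, the Python code in call() only runs while the function is being traced.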

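class MyDenseLayer(tf.keras.layers.Layer):
  def __init__(self, num_outputs, batches_per_epoch):
    super(MyDenseLayer, self).__init__()
    self.num_outputs = num_outputs
    self.batches_per_epoch = batches_per_epoch  # batches in one epoch
    self.counter = 0

  def build(self, input_shape):
    self.kernel = self.add_weight(name="kernel",
                                  shape=[int(input_shape[-1]),
                                         self.num_outputs])

  def call(self, input):
    # counts batches; relies on eager execution
    self.counter += 1
    if self.counter % self.batches_per_epoch == 0:
      # add saving here
      pass
    return tf.matmul(input, self.kernel)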
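If counting batches inside call() is not practical, a possible alternative is to let the layer itself hold a fixed-size buffer and have a callback fill it once per epoch. This is only a sketch under assumptions: MyDenseLayerWithHistory, n_history, kernel_history and snapshot are names invented for this example, not part of the TensorFlow API.

```python
from collections import deque

import tensorflow as tf


class MyDenseLayerWithHistory(tf.keras.layers.Layer):
    def __init__(self, num_outputs, n_history=3):
        super().__init__()
        self.num_outputs = num_outputs
        # deque(maxlen=n) drops the oldest entry automatically when full
        self.kernel_history = deque(maxlen=n_history)

    def build(self, input_shape):
        self.kernel = self.add_weight(
            name="kernel",
            shape=[int(input_shape[-1]), self.num_outputs])

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

    def snapshot(self):
        # copy the values so later training steps cannot change the snapshot
        self.kernel_history.append(self.kernel.numpy().copy())


layer = MyDenseLayerWithHistory(num_outputs=2, n_history=3)

# LambdaCallback calls snapshot() at the end of every epoch,
# outside the compiled training step
snapshot_callback = tf.keras.callbacks.LambdaCallback(
    on_epoch_end=lambda epoch, logs: layer.snapshot())
```

After passing callbacks=[snapshot_callback] to model.fit on a model that contains layer, layer.kernel_history holds the kernels from the last n_history epochs, oldest first.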

Upvotes: 2
