ViniciusArruda

Reputation: 990

How to make a copy of a trained model with tensorflow?

I have a class that holds a model specification along with methods to train and evaluate the model. I want to make a copy of a trained instance. I tried copy.deepcopy(), but it did not work.

The code below is just an example; I am looking for an approach that works with any model built along the same lines:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import sys
import copy
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
FLAGS = None

class Model():

    def __init__(self):
        self.x = tf.placeholder(tf.float32, [None, 784])
        self.W = tf.Variable(tf.zeros([784, 10]))
        self.b = tf.Variable(tf.zeros([10]))
        self.y = tf.matmul(self.x, self.W) + self.b
        self.y_ = tf.placeholder(tf.float32, [None, 10])
        self.cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=self.y_, logits=self.y))
        self.train_step = tf.train.GradientDescentOptimizer(0.5).minimize(self.cross_entropy)

    def train(self, mnist, sess):
        for _ in range(1000):
            batch_xs, batch_ys = mnist.train.next_batch(100)
            sess.run(self.train_step, feed_dict={self.x: batch_xs, self.y_: batch_ys})

    def test(self, mnist, sess):
        self.correct_prediction = tf.equal(tf.argmax(self.y, 1), tf.argmax(self.y_, 1))
        self.accuracy = tf.reduce_mean(tf.cast(self.correct_prediction, tf.float32))
        print(sess.run(self.accuracy, feed_dict={self.x: mnist.test.images, self.y_: mnist.test.labels}))

def main(_):
    # Import data
    mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
    m = Model()
    sess = tf.InteractiveSession()
    tf.global_variables_initializer().run()
    m.train(mnist, sess)
    copy_of_m = copy.deepcopy(m)  # DOES NOT WORK !
    m.test(mnist, sess)
    copy_of_m.test(mnist, sess)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--data_dir', type=str, default='/tmp/tensorflow/mnist/input_data', help='Directory for storing input data')
    FLAGS, unparsed = parser.parse_known_args()
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)

Upvotes: 5

Views: 5815

Answers (2)

meTchaikovsky

Reputation: 7666

As de1 explained in a comment:

TensorFlow variables exist in a graph and can't be serialised/deserialised on their own

You cannot simply copy a TensorFlow model with deepcopy, because its Variables live inside a graph. The Variables themselves cannot be copied (trying to do so raises TypeError: can't pickle _thread.RLock objects), but you can copy their values by defining __getstate__/__setstate__. For example:

import copy
import tensorflow as tf

tf.reset_default_graph()

class Model():

    def __init__(self):
        
        self.normal = 2
        self.x = tf.ones([1,2])
        self.W = tf.Variable(tf.zeros([2, 2]))
        self.b = tf.Variable(tf.zeros([2]))
        self.y = tf.matmul(self.x, self.W) + self.b
        self.train_step = tf.train.GradientDescentOptimizer(0.5).minimize(self.y)
        self.inside_tf = ['W','b','x','y','train_step']
        
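    def __getstate__(self):
        # Evaluate each graph-bound attribute and store the result as <name>_val.
        # This relies on the module-level `sess` created further down.
        # Note: fetching 'train_step' also runs one optimizer step and yields None.
        for item in self.inside_tf:
            setattr(self, '%s_val' % item, sess.run(getattr(self, item)))
        # Drop the graph-bound attributes so that only plain values are copied.
        state = self.__dict__.copy()
        for item in self.inside_tf:
            del state[item]
        return state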

    def __setstate__(self, state):
        
        self.__dict__.update(state)

# Build the model, then create the session that __getstate__ uses
m = Model()
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

copy_of_m = copy.deepcopy(m)

As you can see by running this script, deepcopy calls __getstate__ before copying. In that method we first save the evaluated values of the Variables and then delete the Variables themselves from the copy of self.__dict__. As a result, only the values are pickled (copied).

By running [item for item in dir(copy_of_m) if item[:2] != '__'], you can see that copy_of_m has the attributes ['W_val', 'b_val', 'inside_tf', 'normal', 'train_step_val', 'x_val', 'y_val']. Attributes like W_val are not TensorFlow Variables, only their evaluated values, but those values are what matters most here.
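The same mechanism can be seen without TensorFlow. The following is only a minimal sketch: Handle, Plain and Snapshotting are hypothetical names, and a threading.RLock stands in for whatever makes a Variable uncopyable. It shows deepcopy failing on a plain object and succeeding once __getstate__ swaps the handle for its value.

```python
import copy
import threading


class Handle:
    """Hypothetical stand-in for a graph-bound object such as a tf.Variable."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.RLock()  # RLock objects cannot be deep-copied


class Plain:
    """Holds a Handle and defines no custom copy hooks."""

    def __init__(self):
        self.W = Handle([0.0, 0.0])


class Snapshotting:
    """Replaces the Handle with its plain value when copied."""

    def __init__(self):
        self.normal = 2
        self.W = Handle([0.0, 0.0])

    def __getstate__(self):
        # Work on a copy of __dict__ so the original object is left unchanged.
        state = self.__dict__.copy()
        handle = state.pop("W")
        state["W_val"] = list(handle.value)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


def deepcopy_fails(obj):
    """Return True if copy.deepcopy(obj) raises TypeError."""
    try:
        copy.deepcopy(obj)
    except TypeError:
        return True
    return False


plain_fails = deepcopy_fails(Plain())
snapshot = copy.deepcopy(Snapshotting())
print(plain_fails, sorted(vars(snapshot)))
```

Unlike the script above, this sketch stores the values in the copied state dict rather than with setattr on self, so the original object does not gain extra *_val attributes as a side effect of being copied.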

Upvotes: 4

Pouria Nikvand

Reputation: 370

As suggested in this thread (Link), you can use from copy import copy and call copy(model) instead of deepcopy.

You can also use tf.keras.models.clone_model and then load the original model's weights into the clone.
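One thing to keep in mind about the first suggestion is that copy() is a shallow copy: the new object gets its own attribute dictionary, but the attributes still point at the same underlying objects. A rough pure-Python sketch of that behaviour follows, with the hypothetical Weights class standing in for a tf.Variable:

```python
import copy


class Weights:
    """Hypothetical stand-in for a tf.Variable holding trained values."""

    def __init__(self, values):
        self.values = values


class Model:
    def __init__(self):
        self.W = Weights([0.0, 0.0])


m = Model()
shallow = copy.copy(m)

# The wrapper object is new, but it references the very same Weights instance.
same_object = shallow.W is m.W

# Simulate a further training update on the original model.
m.W.values[0] = 1.0
seen_by_copy = shallow.W.values[0]

print(same_object, seen_by_copy)
```

So if the goal is a frozen snapshot of the trained weights, a shallow copy alone would presumably not be enough, since later training of the original would show up in the copy as well.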

Upvotes: 2
