Reputation: 82
I am trying to load the saved weights from multiple epochs of a trained network in RETURNN, using the following code:
import tensorflow as tf
from returnn.Config import Config
from returnn.TFNetwork import TFNetwork
for i in range(1,11):
    modelFilePath = path/to/model/ + 'network.' + '%03d' % (i,)
    returnnConfig = Config()
    returnnConfig.load_file(path/to/configFile)
    returnnTfNetwork = TFNetwork(config=path/to/configFile, train_flag=False, eval_flag=True)
    returnnTfNetwork.construct_from_dict(returnnConfig.typed_value('network'))
    with tf.Session() as sess:
        returnnTfNetwork.load_params_from_file(modelFilePath, sess)
I get the following error:
Variables to restore which are not in checkpoint:
global_step_1
Variables in checkpoint which are not needed for restore:
global_step
Probably we can restore these:
(None)
Error, some entry is missing in the checkpoint
Upvotes: 1
Views: 157
Reputation: 68110
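The problem is that you recreate the TFNetwork on every iteration of the loop. Each time, a new variable for the global step is also created in the same graph. Because every variable must have a unique name, the new one gets a different name (global_step_1 instead of global_step), and that name does not exist in the checkpoint.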
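To illustrate the renaming, here is a toy model in plain Python. It is not TensorFlow code: ToyGraph and its unique_name method are hypothetical stand-ins that only mimic how a graph adds a numeric suffix when a name is requested a second time.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph that hands out unique variable names."""

    def __init__(self):
        # Maps a requested name to how often it has been requested so far.
        self._used = {}

    def unique_name(self, name):
        count = self._used.get(name, 0)
        self._used[name] = count + 1
        # The first request keeps the plain name, later ones get a suffix.
        return name if count == 0 else "%s_%d" % (name, count)


# Three "networks" built in the same graph, as in the loop from the question.
graph = ToyGraph()
names = [graph.unique_name("global_step") for _ in range(3)]

# A fresh graph starts counting from scratch again.
fresh = ToyGraph().unique_name("global_step")
```

In this sketch only the first network gets the plain name global_step that the checkpoint contains, while a fresh graph hands out the plain name again.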
You could do something like this inside the loop, at the start of each iteration and before the TFNetwork is created:
tf.reset_default_graph()
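Put together, the loop from the question could look roughly like the sketch below. It assumes the TF1-style API and the RETURNN module paths used in the question. The helper names checkpoint_path, load_epoch and load_all are made up for illustration, and passing the loaded Config object as config= (instead of the file path) is my assumption about what TFNetwork expects.

```python
import os


def checkpoint_path(model_dir, epoch):
    """Hypothetical helper: builds e.g. <model_dir>/network.003 for epoch 3."""
    return os.path.join(model_dir, "network.%03d" % epoch)


def load_epoch(config_file, model_dir, epoch):
    # Imported here so the path helper above works without TensorFlow installed.
    import tensorflow as tf  # TF1-style API, as in the question
    from returnn.Config import Config
    from returnn.TFNetwork import TFNetwork

    # Start from an empty graph so global_step keeps its plain name.
    tf.reset_default_graph()

    config = Config()
    config.load_file(config_file)
    # Assumption: TFNetwork takes the Config object, not the file path.
    network = TFNetwork(config=config, train_flag=False, eval_flag=True)
    network.construct_from_dict(config.typed_value('network'))

    with tf.Session() as sess:
        network.load_params_from_file(checkpoint_path(model_dir, epoch), sess)
        # Use the loaded network here, while the session is still open.


def load_all(config_file, model_dir):
    for epoch in range(1, 11):
        load_epoch(config_file, model_dir, epoch)
```

Calling load_all with your config file and model directory then loads epochs 1 to 10 one after another, each in its own fresh graph.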
Upvotes: 1