Smarty77

Reputation: 1348

ERROR:tensorflow:Couldn't match files for checkpoint

I am training a TensorFlow model. After each epoch I save the model state and pickle some arrays. So far the model has completed 2 epochs, and the folder with the saved states contains the following files:

checkpoint
model_e_knihy_preprocessed.txt_e0.ckpt-1134759.data-00000-of-00001
model_e_knihy_preprocessed.txt_e0.ckpt-1134759.index
model_e_knihy_preprocessed.txt_e0.ckpt-1134759.meta
model_e_knihy_preprocessed.txt_e1.ckpt-2269536.data-00000-of-00001
model_e_knihy_preprocessed.txt_e1.ckpt-2269536.index
model_e_knihy_preprocessed.txt_e1.ckpt-2269536.meta
topgrads_e_knihy_preprocessed.txt_[it0].pkl
topgrads_e_knihy_preprocessed.txt_[it1].pkl
toppositions_e_knihy_preprocessed.txt_[it0].pkl
toppositions_e_knihy_preprocessed.txt_[it1].pkl
vocab.txt

I did not move the folder or make any external modifications to the file structure. The checkpoint file contains the following:

model_checkpoint_path: "model_e_knihy_preprocessed.txt_e1.ckpt-2269536"
all_model_checkpoint_paths: "model_e_knihy_preprocessed.txt_e0.ckpt-1134759"
all_model_checkpoint_paths: "model_e_knihy_preprocessed.txt_e1.ckpt-2269536"

I restore the model in the following way:

with tf.Session() as session:
    model = Word2Vec(opts, session)
    model.saver.restore(session, tf.train.latest_checkpoint(path_to_model))

but the error already occurs in the tf.train.latest_checkpoint(path_to_model) call:

ERROR:tensorflow:Couldn't match files for checkpoint /mnt/minerva1/nlp/projects/deep_learning/word2vec/trainedmodels/tf_w2vopt_[CS]ebooks_topgradients_iterative/model_e_knihy_preprocessed.txt_e1.ckpt-2269536

So I peeked into the method:

def latest_checkpoint(checkpoint_dir, latest_filename=None):
  ckpt = get_checkpoint_state(checkpoint_dir, latest_filename)
  if ckpt and ckpt.model_checkpoint_path:
    # Look for either a V2 path or a V1 path, with priority for V2.
    v2_path = _prefix_to_checkpoint_path(ckpt.model_checkpoint_path,
                                         saver_pb2.SaverDef.V2)
    v1_path = _prefix_to_checkpoint_path(ckpt.model_checkpoint_path,
                                         saver_pb2.SaverDef.V1)
    if file_io.get_matching_files(v2_path) or file_io.get_matching_files(
        v1_path):
      return ckpt.model_checkpoint_path
    else:
      logging.error("Couldn't match files for checkpoint %s",
                    ckpt.model_checkpoint_path)
  return None

and found out that file_io.get_matching_files(v2_path) finds nothing, even though v2_path contains the value /mnt/minerva1/nlp/projects/deep_learning/word2vec/trainedmodels/tf_w2vopt_[CS]ebooks_topgradients_iterative/model_e_knihy_preprocessed.txt_e1.ckpt-2269536.index, which is present in the folder! Sadly I could not follow much further, since from there the call goes into TensorFlow's wrapper code. Is this a TensorFlow bug?

I am using TensorFlow version 1.5.0-rc0.

Upvotes: 3

Views: 2096

Answers (1)

Smarty77

Reputation: 1348

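So, the answer is: DO NOT USE SQUARE BRACKETS IN YOUR FILE PATH. TensorFlow can't handle them, apparently because get_matching_files treats the path as a glob pattern, in which [CS] is read as a character class (one character, either C or S) rather than as literal text. See https://github.com/tensorflow/tensorflow/issues/6082#issuecomment-265055615.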
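
To illustrate the effect without TensorFlow, here is a minimal sketch using Python's standard glob module as a stand-in. It assumes TensorFlow's pattern matching interprets brackets the same way; the directory and file names are made up for the demonstration.

```python
import fnmatch
import glob
import os
import tempfile


def demo_bracket_glob():
    """Show that a literal path containing [..] fails to match itself as a glob."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical directory name with square brackets, as in the question.
        model_dir = os.path.join(root, "w2v_[CS]ebooks")
        os.makedirs(model_dir)
        index_file = os.path.join(model_dir, "model.ckpt-1.index")
        open(index_file, "w").close()

        # The raw path is read as a pattern: [CS] means "one of C or S".
        raw_matches = glob.glob(index_file)
        # Escaping the special characters makes the brackets literal again.
        escaped_matches = glob.glob(glob.escape(index_file))
        return index_file, raw_matches, escaped_matches


index_file, raw_matches, escaped_matches = demo_bracket_glob()
print("raw pattern matches:    ", raw_matches)
print("escaped pattern matches:", escaped_matches)

# The same rule at the level of a single name:
print(fnmatch.fnmatchcase("w2v_Cebooks", "w2v_[CS]ebooks"))     # pattern matches "C"
print(fnmatch.fnmatchcase("w2v_[CS]ebooks", "w2v_[CS]ebooks"))  # not its own literal text
```

The file exists on disk, yet the unescaped lookup comes back empty, which mirrors the "Couldn't match files for checkpoint" error. Renaming the directory so that it contains no square brackets avoids the problem entirely.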

Upvotes: 4
