Sameer Lattannavar

Reputation: 1

TensorFlow: checkpoint files are deleted at the end of training without any error message

I was following this tutorial to train an audio recognition model: https://www.tensorflow.org/versions/master/tutorials/audio_recognition

I started training with the command:

python tensorflow/examples/speech_commands/train.py

The end of the output shows:

INFO:tensorflow:Saving to "/media/slattann/HDD2/tmp/speech_commands_train/conv.ckpt-18000"

INFO:tensorflow:Confusion Matrix:

 [[257   0   0   0   0   0   0   0   0   0   0   0]
 [  0 194   4   3   4   7   8  16   5   0   7   9]
 [  0   4 239   1   3   0   6   2   0   0   1   0]
 [  1   4   0 216   1   6   3   4   1   0   3  13]
 [  0   0   0   0 257   0   2   1   3   1   6   2]
 [  2   7   0  13   3 213   2   0   1   0   1  11]
 [  0   2  11   0   5   0 246   3   0   0   0   0]
 [  1  11   0   1   1   1   0 240   1   2   1   0]
 [  1   7   0   0   1   0   1   2 234   0   0   0]
 [  1   4   0   0  19   0   1   2   9 222   3   1]
 [  0   2   1   0   7   1   1   0   1   0 235   1]
 [  0   6   0  33   5   3   6   2   0   0   2 194]]

INFO:tensorflow:Final test accuracy = 89.2% (N=3081)

But I could not find this checkpoint file. At the end, I could only see:

----------------------------------------------------------------------------

Files I could see in the /media/slattann/HDD2/tmp/speech_commands_train folder:

checkpoint
conv.ckpt-17600.data-00000-of-00001
conv.ckpt-17600.index
conv.ckpt-17600.meta
conv.ckpt-17700.data-00000-of-00001
conv.ckpt-17700.index
conv.ckpt-17700.meta
conv.ckpt-17800.data-00000-of-00001
conv.ckpt-17800.index
conv.ckpt-17800.meta
conv.ckpt-17900.data-00000-of-00001
conv.ckpt-17900.index
conv.ckpt-17900.meta
conv.ckpt-18000.data-00000-of-00001
conv.ckpt-18000.index
conv.ckpt-18000.meta
conv_labels.txt
conv.pbtxt
----------------------------------------------------------------------------

I am blocked at the graph-freezing stage, which comes before creating the TensorFlow model (.pb).

Image: https://i.sstatic.net/TIXZX.jpg

Upvotes: 0

Views: 557

Answers (1)

GPhilo

Reputation: 19123

When the log says Saving to "/media/slattann/HDD2/tmp/speech_commands_train/conv.ckpt-18000", it does not write a single file with that name. It creates (at least) three files that share that prefix:

  • conv.ckpt-18000.meta (containing the meta-graph definition)
  • conv.ckpt-18000.index
  • conv.ckpt-18000.data-00000-of-00001 (containing the actual variables' data, possibly sharded).
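If you want to confirm that a checkpoint is all there, a small script can group the directory listing by prefix. This is only a sketch: the helper names `checkpoint_prefixes` and `is_complete` are made up for illustration, the listing is a shortened copy of the one in the question, and the regular expression assumes the `<name>.ckpt-<step>` naming seen there.

```python
import re

# Matches "<prefix>.index", "<prefix>.meta" and "<prefix>.data-XXXXX-of-YYYYY",
# where <prefix> looks like "conv.ckpt-18000" (naming assumed from the question).
PIECE = re.compile(r"^(?P<prefix>.+\.ckpt-\d+)\.(?P<kind>index|meta|data-\d+-of-\d+)$")


def checkpoint_prefixes(filenames):
    """Map each checkpoint prefix to the set of piece kinds found for it."""
    found = {}
    for name in filenames:
        match = PIECE.match(name)
        if match is None:
            continue  # e.g. "checkpoint", "conv_labels.txt", "conv.pbtxt"
        kind = match.group("kind")
        if kind.startswith("data"):
            kind = "data"  # treat every shard as the same kind
        found.setdefault(match.group("prefix"), set()).add(kind)
    return found


def is_complete(kinds):
    """True when the index, meta-graph and at least one data shard exist."""
    return {"index", "meta", "data"} <= kinds


listing = [
    "checkpoint",
    "conv.ckpt-18000.data-00000-of-00001",
    "conv.ckpt-18000.index",
    "conv.ckpt-18000.meta",
    "conv_labels.txt",
    "conv.pbtxt",
]

prefixes = checkpoint_prefixes(listing)
for prefix, kinds in prefixes.items():
    print(prefix, "complete" if is_complete(kinds) else "incomplete")
```

With the listing from the question, `conv.ckpt-18000` has all three pieces, so nothing was deleted.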

To freeze the graph, just pass the prefix path/to/checkpoint/dir/conv.ckpt-18000 to freeze_graph; it will handle the rest on its own.
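As a rough illustration of passing the prefix rather than a file name, the sketch below picks the highest-numbered prefix from a listing and builds the command line for the tutorial's freeze script. Treat the details as assumptions: `latest_prefix` is a hypothetical helper, the `--start_checkpoint` and `--output_file` flags are taken from the speech commands tutorial as I remember it (check them against your TensorFlow version), and the output path is a placeholder. The command is only printed, not run.

```python
import re


def latest_prefix(directory, filenames):
    """Return the path prefix of the highest-numbered checkpoint, or None."""
    steps = {}
    for name in filenames:
        # Use the .index file as the marker that a checkpoint exists.
        match = re.match(r"^(.+\.ckpt-(\d+))\.index$", name)
        if match:
            steps[int(match.group(2))] = match.group(1)
    if not steps:
        return None
    return directory.rstrip("/") + "/" + steps[max(steps)]


train_dir = "/media/slattann/HDD2/tmp/speech_commands_train"
names = ["conv.ckpt-17900.index", "conv.ckpt-18000.index", "conv.pbtxt"]
prefix = latest_prefix(train_dir, names)

command = [
    "python",
    "tensorflow/examples/speech_commands/freeze.py",
    "--start_checkpoint=" + prefix,           # the prefix, with no file extension
    "--output_file=/tmp/my_frozen_graph.pb",  # placeholder output path
]
print(" ".join(command))
```

Note that the value passed is `.../conv.ckpt-18000`, with none of the `.index`, `.meta` or `.data-*` suffixes.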

For more information about why there are three files in the first place, see this question.

Upvotes: 1
