Reputation: 513
I'm working with the Caffe framework and would like to train the network shown below.
When I run the following command:
caffe train --solver solver.prototxt
it throws this error:
F0802 14:31:54.506695 28038 insert_splits.cpp:29] Unknown bottom blob 'image' (layer 'conv1', bottom index 0)
*** Check failure stack trace: ***
@ 0x7ff2941c3f9d google::LogMessage::Fail()
@ 0x7ff2941c5e03 google::LogMessage::SendToLog()
@ 0x7ff2941c3b2b google::LogMessage::Flush()
@ 0x7ff2941c67ee google::LogMessageFatal::~LogMessageFatal()
@ 0x7ff2947cedbe caffe::InsertSplits()
@ 0x7ff2948306de caffe::Net<>::Init()
@ 0x7ff294833a81 caffe::Net<>::Net()
@ 0x7ff29480ce6a caffe::Solver<>::InitTestNets()
@ 0x7ff29480ee85 caffe::Solver<>::Init()
@ 0x7ff29480f19a caffe::Solver<>::Solver()
@ 0x7ff2947f4343 caffe::Creator_SGDSolver<>()
@ 0x40b1a0 (unknown)
@ 0x407373 (unknown)
@ 0x7ff292e40741 __libc_start_main
@ 0x407b79 (unknown)
Aborted (core dumped)
The code is (train2.prototxt):
name: "xxxxxx"
layer {
  name: "image"
  type: "HDF5Data"
  top: "image"
  top: "label"
  hdf5_data_param {
    source: "h5a.train.h5.txt"
    batch_size: 64
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "image"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool1"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv3"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "improd3"
  type: "InnerProduct"
  bottom: "pool2"
  top: "improd3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "improd3"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "improd3"
  bottom: "label"
  top: "loss"
}
The solver.prototxt:
net: "train2.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
display: 20
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "caffe"
solver_mode: CPU
I'm stuck: I can't start training the network because of this problem.
Upvotes: 0
Views: 2498
Reputation: 5645
This happens because, even though you are running the Train phase, the Test phase is also used for validation. Since there is no input data layer for the Test phase, the conv1 layer cannot find its input blob image. The Test phase is invoked because you have defined test_* parameters in the solver and phase: TEST in some of the layers in train2.prototxt. Removing those test_* parameters from the solver, along with the layers restricted to the TEST phase, will let you run the training without any issues.
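As a rough illustration of the fix described above, a solver.prototxt with the test_* settings dropped might look like the sketch below. It reuses the values from the question and assumes nothing else in the setup refers to a test net; all other settings are left as they were.

```
# solver.prototxt (sketch): test_iter and test_interval removed,
# so the solver is not asked to build a Test-phase net.
net: "train2.prototxt"
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
display: 20
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "caffe"
solver_mode: CPU
```

With these two settings gone, conv1 should only be instantiated in the Train phase, where the HDF5Data layer provides image. If validation is wanted later, the usual alternative is to keep the test_* settings and add a second data layer with include { phase: TEST } that also produces image and label.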
Upvotes: 2