lennin92

Reputation: 513

Caffe Unknown bottom blob

I'm working with the Caffe framework and would like to train the following network:

train2.prototxt

When I run the following command:

caffe train --solver solver.prototxt

it throws this error:

F0802 14:31:54.506695 28038 insert_splits.cpp:29] Unknown bottom blob 'image' (layer 'conv1', bottom index 0)
*** Check failure stack trace: ***
@     0x7ff2941c3f9d  google::LogMessage::Fail()
@     0x7ff2941c5e03  google::LogMessage::SendToLog()
@     0x7ff2941c3b2b  google::LogMessage::Flush()
@     0x7ff2941c67ee  google::LogMessageFatal::~LogMessageFatal()
@     0x7ff2947cedbe  caffe::InsertSplits()
@     0x7ff2948306de  caffe::Net<>::Init()
@     0x7ff294833a81  caffe::Net<>::Net()
@     0x7ff29480ce6a  caffe::Solver<>::InitTestNets()
@     0x7ff29480ee85  caffe::Solver<>::Init()
@     0x7ff29480f19a  caffe::Solver<>::Solver()
@     0x7ff2947f4343  caffe::Creator_SGDSolver<>()
@           0x40b1a0  (unknown)
@           0x407373  (unknown)
@     0x7ff292e40741  __libc_start_main
@           0x407b79  (unknown)
Aborted (core dumped)

The code is (train2.prototxt):

name: "xxxxxx"
layer {
  name: "image"
  type: "HDF5Data"
  top: "image"
  top: "label"
  hdf5_data_param {
    source: "h5a.train.h5.txt"
    batch_size: 64
  }
  include {
    phase: TRAIN
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "image"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool1"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv3"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "improd3"
  type: "InnerProduct"
  bottom: "pool2"
  top: "improd3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "improd3"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "improd3"
  bottom: "label"
  top: "loss"
}

The solver.prototxt:

net: "train2.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
display: 20
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "caffe"
solver_mode: CPU

I'm stuck: I can't start training the network because of this problem.

Upvotes: 0

Views: 2498

Answers (1)

Anoop K. Prabhu

Reputation: 5645

This happens because, even though you are running the TRAIN phase, the solver also builds a TEST-phase network for validation (note the caffe::Solver<>::InitTestNets() frame in your stack trace). Your only data layer is restricted to phase: TRAIN, so the TEST network has no input data layer and its conv1 layer cannot find the input blob image. The TEST phase is being set up because you have defined the test_* parameters (test_iter and test_interval) in the solver and phase: TEST in some of the layers in train2.prototxt. Removing those parameters from the solver, along with the layers that belong to the TEST phase, should let the training run.
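As a minimal sketch of that suggestion, the solver.prototxt from the question would look like this with the two test_* lines dropped. Every other value is copied from the question unchanged; nothing here has been run against your data.

```prototxt
# solver.prototxt without test_iter / test_interval,
# so the solver does not build a TEST-phase network
net: "train2.prototxt"
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
display: 20
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "caffe"
solver_mode: CPU
```

If you would rather keep validation, the other route is to leave the solver as it is and add a second HDF5Data layer with include { phase: TEST } that also produces the image and label tops. That needs a separate list file of test HDF5 files, which the question does not mention, so its name and batch size would be your choice.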

Upvotes: 2
