magneto

Reputation: 107

Regression in Caffe: Prediction is highly erroneous

I've been working on a single-label regression problem in Caffe. The input consists of 5 HDF5 files, which I generated independently from different images. I first tested my network with a single HDF5 file and ran 10,000 iterations on about 800 training images (batch size 64). When I then ran prediction on those same training images, I got the following result:

[plot: prediction results on the training images]

But on the test images it was:

[plot: prediction results on the test images]

As far as I understand, this is because the amount of training data is small and the test data is not very similar to the training data.

So I tried increasing the training data to about 5,500 images, split across 5 HDF5 files. The prediction output on the training data, using a model trained for 14,000 iterations, is:

[plot: prediction results on the training images after retraining with more data]

I do not understand why the prediction got worse. How does Caffe select a batch (my batch size is 64)? Does it pick batches at random from the 5 HDF5 files? What might be the reason behind the bad predictions, and what can I do to train my model effectively? Should I add more convolutional layers? Any suggestions would be a lifesaver. This is my first attempt at neural networks and Caffe. My network is:

name: "Regression"
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "train_hdf5file.txt"
    batch_size: 64
    shuffle: true
  }
  include: { phase: TRAIN }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "test_hdf5file.txt"
    batch_size: 30
  }
  include: { phase: TEST }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "dropout1"
  type: "Dropout"
  bottom: "pool1"
  top: "pool1"
  dropout_param {
    dropout_ratio: 0.1
  }
}

layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "dropout2"
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
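
For prediction, the matching deploy-style net would just swap the HDF5Data layers for an Input layer and drop the EuclideanLoss layer, roughly like this (the input shape below is only a placeholder for the actual image size; Dropout is a pass-through at test time, so it can stay or be removed):

name: "Regression_deploy"
layer {
  name: "data"
  type: "Input"
  top: "data"
  # placeholder shape: 1 image, 1 channel, 100x100 (use the real dimensions)
  input_param { shape: { dim: 1 dim: 1 dim: 100 dim: 100 } }
}
# conv1, relu1, pool1, fc1 and fc2 follow exactly as in the training net above;
# the predicted value is read from the "fc2" top blob.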

Upvotes: 1

Views: 286

Answers (1)

Roger Trullo

Reputation: 1584

Try adding more convolutional layers, and remove the dropout for now (you can bring it back if you run into overfitting). Also check the loss that Caffe prints during training; based on that, you may need to adjust the learning rate and other settings in the solver file.
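
For example, a second convolution block could be slotted in between pool1 and fc1 along these lines (the filter count and kernel size here are arbitrary starting points, not values taken from the question):

layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 50    # arbitrary; tune for your data
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

The bottom of fc1 would then change from "pool1" to "pool2". Whether this helps should show up in the training loss Caffe prints; if the loss plateaus or diverges, adjust base_lr (and possibly lr_policy) in the solver prototxt.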

Upvotes: 1
