Reputation: 107
I've been working on a single-label regression problem in Caffe. The input consists of 5 hdf5 files, which I generated independently from different images. I first tested my network with a single hdf5 file, running 10,000 iterations on about 800 training images (batch size 64). When I then ran prediction on the same training images, I got the following result:
But on the testing images it was:
As far as I understand, this is due to the small amount of training data and to the test data not being very similar to the training data.
So I tried increasing the training data to about 5500 images, divided among 5 hdf5 files. The prediction output on the training data, using a model trained for 14,000 iterations, is:
I do not understand why the prediction is worse. How does Caffe select a batch (my batch size is 64)? Does it select a batch at random from the 5 hdf5 files? What might be the reason for the bad predictions, and what can I do to train my model effectively? Should I add more convolutional layers? Any suggestions would be a life-saver. This is my first attempt at neural networks and Caffe. My network is:
name: "Regression"
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "train_hdf5file.txt"
    batch_size: 64
    shuffle: true
  }
  include: { phase: TRAIN }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "test_hdf5file.txt"
    batch_size: 30
  }
  include: { phase: TEST }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "dropout1"
  type: "Dropout"
  bottom: "pool1"
  top: "pool1"
  dropout_param {
    dropout_ratio: 0.1
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "dropout2"
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
Upvotes: 1
Views: 286
Reputation: 1584
Try adding more convolutional layers and removing the dropout layers for now (you can add dropout back later if you run into overfitting). Also check the loss that Caffe prints during training; depending on how it behaves, you may need to adjust the learning rate and other settings in the solver file.
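To make the "check the loss" step concrete, here is a minimal Python sketch that pulls the loss values out of a saved training log. It assumes the log contains lines of the form "Iteration N, loss = X", which is how the Caffe solver normally reports the loss; the names parse_losses, is_decreasing and SAMPLE_LOG are made up for this example, and the sample log lines are illustrative, not output from the network above.

```python
import re

# Matches solver lines such as "... solver.cpp:228] Iteration 100, loss = 0.5123".
# Some Caffe builds add a timing note in parentheses after the iteration number,
# so that part is optional in the pattern.
LOSS_RE = re.compile(
    r"Iteration (\d+)(?: \([^)]*\))?, loss = ([-+0-9.eE]+|nan|inf)"
)

# Illustrative log excerpt (made-up values, not real training output).
SAMPLE_LOG = """\
I0101 10:00:00.000000  1234 solver.cpp:228] Iteration 0, loss = 12.5
I0101 10:00:00.100000  1234 solver.cpp:244]     Train net output #0: loss = 12.5
I0101 10:00:05.000000  1234 solver.cpp:228] Iteration 100, loss = 6.25
I0101 10:00:10.000000  1234 solver.cpp:228] Iteration 200 (20 iter/s, 5s/100 iters), loss = 3.0
"""


def parse_losses(log_text):
    """Return a list of (iteration, loss) pairs found in a Caffe training log."""
    pairs = []
    for line in log_text.splitlines():
        match = LOSS_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


def is_decreasing(pairs):
    """Crude check: is the mean loss of the second half below that of the first half?"""
    if len(pairs) < 2:
        return False
    half = len(pairs) // 2
    first = [loss for _, loss in pairs[:half]]
    second = [loss for _, loss in pairs[half:]]
    return sum(second) / len(second) < sum(first) / len(first)


if __name__ == "__main__":
    losses = parse_losses(SAMPLE_LOG)
    for iteration, loss in losses:
        print(iteration, loss)
    print("loss decreasing:", is_decreasing(losses))
```

If the loss stays flat, oscillates, or turns into nan over the iterations, that usually points at the learning rate (base_lr in the solver file) rather than at the network definition. To use this on a real run, read the log file Caffe wrote and pass its contents to parse_losses.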
Upvotes: 1