Reputation: 109
This is the result I get when I train my own model
I0510 20:53:16.677439 3591 solver.cpp:337] Iteration 0, Testing net (#0)
I0510 20:57:20.822933 3591 solver.cpp:404] Test net output #0: accuracy = 3.78788e-05
I0510 20:57:20.823001 3591 solver.cpp:404] Test net output #1: loss = 9.27223 (* 1 = 9.27223 loss)
I0510 20:57:21.423084 3591 solver.cpp:228] Iteration 0, loss = 9.29181
I0510 20:57:21.423110 3591 solver.cpp:244] Train net output #0: loss = 9.29181 (* 1 = 9.29181 loss)
I0510 20:57:21.423120 3591 sgd_solver.cpp:106] Iteration 0, lr = 0.001
I0510 21:06:57.498831 3591 solver.cpp:337] Iteration 1000, Testing net (#0)
I0510 21:10:59.477396 3591 solver.cpp:404] Test net output #0: accuracy = 0.00186553
I0510 21:10:59.477463 3591 solver.cpp:404] Test net output #1: loss = 8.86572 (* 1 = 8.86572 loss)
I0510 21:20:35.828510 3591 solver.cpp:337] Iteration 2000, Testing net (#0)
I0510 21:24:42.838196 3591 solver.cpp:404] Test net output #0: accuracy = 0.00144886
I0510 21:24:42.838245 3591 solver.cpp:404] Test net output #1: loss = 8.83859 (* 1 = 8.83859 loss)
I0510 21:24:43.412120 3591 solver.cpp:228] Iteration 2000, loss = 8.81461
I0510 21:24:43.412145 3591 solver.cpp:244] Train net output #0: loss = 8.81461 (* 1 = 8.81461 loss)
I0510 21:24:43.412150 3591 sgd_solver.cpp:106] Iteration 2000, lr = 0.001
I0510 21:38:50.990823 3591 solver.cpp:337] Iteration 3000, Testing net (#0)
I0510 21:42:52.918418 3591 solver.cpp:404] Test net output #0: accuracy = 0.00140152
I0510 21:42:52.918493 3591 solver.cpp:404] Test net output #1: loss = 8.81789 (* 1 = 8.81789 loss)
I0510 22:00:09.519151 3591 solver.cpp:337] Iteration 4000, Testing net (#0)
I0510 22:09:13.918016 3591 solver.cpp:404] Test net output #0: accuracy = 0.00149621
I0510 22:09:13.918102 3591 solver.cpp:404] Test net output #1: loss = 8.80909 (* 1 = 8.80909 loss)
I0510 22:09:15.127683 3591 solver.cpp:228] Iteration 4000, loss = 8.8597
I0510 22:09:15.127722 3591 solver.cpp:244] Train net output #0: loss = 8.8597 (* 1 = 8.8597 loss)
I0510 22:09:15.127729 3591 sgd_solver.cpp:106] Iteration 4000, lr = 0.001
I0510 22:28:39.320019 3591 solver.cpp:337] Iteration 5000, Testing net (#0)
I0510 22:37:43.847064 3591 solver.cpp:404] Test net output #0: accuracy = 0.00118371
I0510 22:37:43.847173 3591 solver.cpp:404] Test net output #1: loss = 8.80527 (* 1 = 8.80527 loss)
I0510 23:58:17.120088 3591 solver.cpp:454] Snapshotting to binary proto file /home/wang/caffe-master/examples/NN2_iter_10000.caffemodel
I0510 23:58:17.238307 3591 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/wang/caffe-master/examples/NN2_iter_10000.solverstate
I0510 23:58:17.491825 3591 solver.cpp:337] Iteration 10000, Testing net (#0)
I0511 00:02:19.412715 3591 solver.cpp:404] Test net output #0: accuracy = 0.00186553
I0511 00:02:19.412762 3591 solver.cpp:404] Test net output #1: loss = 8.79114 (* 1 = 8.79114 loss)
I0511 00:02:19.986547 3591 solver.cpp:228] Iteration 10000, loss = 8.83457
I0511 00:02:19.986570 3591 solver.cpp:244] Train net output #0: loss = 8.83457 (* 1 = 8.83457 loss)
I0511 00:02:19.986578 3591 sgd_solver.cpp:106] Iteration 10000, lr = 0.001
I0511 00:11:55.546052 3591 solver.cpp:337] Iteration 11000, Testing net (#0)
I0511 00:15:57.490486 3591 solver.cpp:404] Test net output #0: accuracy = 0.00164773
I0511 00:15:57.490532 3591 solver.cpp:404] Test net output #1: loss = 8.78702 (* 1 = 8.78702 loss)
I0511 00:25:33.666496 3591 solver.cpp:337] Iteration 12000, Testing net (#0)
I0511 00:29:35.603062 3591 solver.cpp:404] Test net output #0: accuracy = 0.0016572
I0511 00:29:35.603109 3591 solver.cpp:404] Test net output #1: loss = 8.7848 (* 1 = 8.7848 loss)
I0511 00:29:36.177078 3591 solver.cpp:228] Iteration 12000, loss = 9.00561
I0511 00:29:36.177105 3591 solver.cpp:244] Train net output #0: loss = 9.00561 (* 1 = 9.00561 loss)
I0511 00:29:36.177114 3591 sgd_solver.cpp:106] Iteration 12000, lr = 0.001
I0511 00:39:11.729369 3591 solver.cpp:337] Iteration 13000, Testing net (#0)
I0511 00:43:13.678067 3591 solver.cpp:404] Test net output #0: accuracy = 0.001875
I0511 00:43:13.678113 3591 solver.cpp:404] Test net output #1: loss = 8.78359 (* 1 = 8.78359 loss)
I0511 00:52:49.851985 3591 solver.cpp:337] Iteration 14000, Testing net (#0)
I0511 00:56:51.767343 3591 solver.cpp:404] Test net output #0: accuracy = 0.00154356
I0511 00:56:51.767390 3591 solver.cpp:404] Test net output #1: loss = 8.77998 (* 1 = 8.77998 loss)
I0511 00:56:52.341564 3591 solver.cpp:228] Iteration 14000, loss = 8.83385
I0511 00:56:52.341591 3591 solver.cpp:244] Train net output #0: loss = 8.83385 (* 1 = 8.83385 loss)
I0511 00:56:52.341598 3591 sgd_solver.cpp:106] Iteration 14000, lr = 0.001
I0511 02:14:38.224290 3591 solver.cpp:454] Snapshotting to binary proto file /home/wang/caffe-master/examples/NN2_iter_20000.caffemodel
I0511 02:14:38.735008 3591 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/wang/caffe-master/examples/NN2_iter_20000.solverstate
I0511 02:14:38.805809 3591 solver.cpp:337] Iteration 20000, Testing net (#0)
I0511 02:18:40.681993 3591 solver.cpp:404] Test net output #0: accuracy = 0.00179924
I0511 02:18:40.682086 3591 solver.cpp:404] Test net output #1: loss = 8.78129 (* 1 = 8.78129 loss)
I0511 02:18:41.255969 3591 solver.cpp:228] Iteration 20000, loss = 8.82502
I0511 02:18:41.255995 3591 solver.cpp:244] Train net output #0: loss = 8.82502 (* 1 = 8.82502 loss)
I0511 02:18:41.256001 3591 sgd_solver.cpp:106] Iteration 20000, lr = 0.001
I0511 04:30:58.924096 3591 solver.cpp:454] Snapshotting to binary proto file /home/wang/caffe-master/examples/NN2_iter_30000.caffemodel
I0511 04:31:00.742739 3591 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/wang/caffe-master/examples/NN2_iter_30000.solverstate
I0511 04:31:01.151980 3591 solver.cpp:337] Iteration 30000, Testing net (#0)
I0511 04:35:03.075263 3591 solver.cpp:404] Test net output #0: accuracy = 0.00186553
I0511 04:35:03.075307 3591 solver.cpp:404] Test net output #1: loss = 8.77867 (* 1 = 8.77867 loss)
I0511 04:35:03.649479 3591 solver.cpp:228] Iteration 30000, loss = 8.82915
I0511 04:35:03.649507 3591 solver.cpp:244] Train net output #0: loss = 8.82915 (* 1 = 8.82915 loss)
I0511 04:35:03.649513 3591 sgd_solver.cpp:106] Iteration 30000, lr = 0.001
I0511 07:55:36.848265 3591 solver.cpp:337] Iteration 45000, Testing net (#0)
I0511 07:59:38.834043 3591 solver.cpp:404] Test net output #0: accuracy = 0.00179924
I0511 07:59:38.834095 3591 solver.cpp:404] Test net output #1: loss = 8.77432 (* 1 = 8.77432 loss)
I0511 09:03:48.141854 3591 solver.cpp:454] Snapshotting to binary proto file /home/wang/caffe-master/examples/NN2_iter_50000.caffemodel
I0511 09:03:49.736464 3591 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/wang/caffe-master/examples/NN2_iter_50000.solverstate
I0511 09:03:49.797582 3591 solver.cpp:337] Iteration 50000, Testing net (#0)
I0511 09:07:51.777150 3591 solver.cpp:404] Test net output #0: accuracy = 0.001875
I0511 09:07:51.777207 3591 solver.cpp:404] Test net output #1: loss = 8.77058 (* 1 = 8.77058 loss)
I0511 09:07:52.351323 3591 solver.cpp:228] Iteration 50000, loss = 9.11435
I0511 09:07:52.351351 3591 solver.cpp:244] Train net output #0: loss = 9.11435 (* 1 = 9.11435 loss)
I0511 09:07:52.351357 3591 sgd_solver.cpp:106] Iteration 50000, lr = 0.001
I0511 09:17:28.188742 3591 solver.cpp:337] Iteration 51000, Testing net (#0)
I0511 09:21:30.200623 3591 solver.cpp:404] Test net output #0: accuracy = 0.00186553
I0511 09:21:30.200716 3591 solver.cpp:404] Test net output #1: loss = 8.77026 (* 1 = 8.77026 loss)
I0511 09:31:06.596501 3591 solver.cpp:337] Iteration 52000, Testing net (#0)
I0511 09:35:08.580215 3591 solver.cpp:404] Test net output #0: accuracy = 0.00182765
I0511 09:35:08.580313 3591 solver.cpp:404] Test net output #1: loss = 8.76917 (* 1 = 8.76917 loss)
I0511 09:35:09.154428 3591 solver.cpp:228] Iteration 52000, loss = 8.89758
I0511 09:35:09.154453 3591 solver.cpp:244] Train net output #0: loss = 8.89758 (* 1 = 8.89758 loss)
I0511 09:35:09.154459 3591 sgd_solver.cpp:106] Iteration 52000, lr = 0.001
I0511 09:44:44.906309 3591 solver.cpp:337] Iteration 53000, Testing net (#0)
I0511 09:48:46.866353 3591 solver.cpp:404] Test net output #0: accuracy = 0.00185606
I0511 09:48:46.866430 3591 solver.cpp:404] Test net output #1: loss = 8.7708 (* 1 = 8.7708 loss)
I0511 09:58:23.097244 3591 solver.cpp:337] Iteration 54000, Testing net (#0)
I0511 10:02:25.056555 3591 solver.cpp:404] Test net output #0: accuracy = 0.00192235
I0511 10:02:25.056605 3591 solver.cpp:404] Test net output #1: loss = 8.76884 (* 1 = 8.76884 loss)
I0511 10:02:25.630312 3591 solver.cpp:228] Iteration 54000, loss = 8.90552
I0511 10:02:25.630337 3591 solver.cpp:244] Train net output #0: loss = 8.90552 (* 1 = 8.90552 loss)
I0511 10:02:25.630342 3591 sgd_solver.cpp:106] Iteration 54000, lr = 0.001
I0511 14:44:51.563555 3591 solver.cpp:337] Iteration 75000, Testing net (#0)
I0511 14:48:53.573640 3591 solver.cpp:404] Test net output #0: accuracy = 0.0016572
I0511 14:48:53.573724 3591 solver.cpp:404] Test net output #1: loss = 8.76967 (* 1 = 8.76967 loss)
I0511 14:58:30.080453 3591 solver.cpp:337] Iteration 76000, Testing net (#0)
I0511 15:02:32.076011 3591 solver.cpp:404] Test net output #0: accuracy = 0.001875
I0511 15:02:32.076077 3591 solver.cpp:404] Test net output #1: loss = 8.7695 (* 1 = 8.7695 loss)
I0511 15:02:32.650342 3591 solver.cpp:228] Iteration 76000, loss = 9.0084
I0511 15:02:32.650367 3591 solver.cpp:244] Train net output #0: loss = 9.0084 (* 1 = 9.0084 loss)
I0511 15:02:32.650373 3591 sgd_solver.cpp:106] Iteration 76000, lr = 0.001
I0511 15:12:08.597450 3591 solver.cpp:337] Iteration 77000, Testing net (#0)
I0511 15:16:10.636613 3591 solver.cpp:404] Test net output #0: accuracy = 0.00181818
I0511 15:16:10.636693 3591 solver.cpp:404] Test net output #1: loss = 8.76889 (* 1 = 8.76889 loss)
I0511 15:25:47.167667 3591 solver.cpp:337] Iteration 78000, Testing net (#0)
I0511 15:29:49.204596 3591 solver.cpp:404] Test net output #0: accuracy = 0.00185606
I0511 15:29:49.204649 3591 solver.cpp:404] Test net output #1: loss = 8.77059 (* 1 = 8.77059 loss)
I0511 15:29:49.779094 3591 solver.cpp:228] Iteration 78000, loss = 8.73139
I0511 15:29:49.779119 3591 solver.cpp:244] Train net output #0: loss = 8.73139 (* 1 = 8.73139 loss)
I0511 15:29:49.779124 3591 sgd_solver.cpp:106] Iteration 78000, lr = 0.001
I0511 15:39:25.730358 3591 solver.cpp:337] Iteration 79000, Testing net (#0)
I0511 15:43:27.756417 3591 solver.cpp:404] Test net output #0: accuracy = 0.00192235
I0511 15:43:27.756485 3591 solver.cpp:404] Test net output #1: loss = 8.76846 (* 1 = 8.76846 loss)
I0511 15:53:04.419961 3591 solver.cpp:454] Snapshotting to binary proto file /home/wang/caffe-master/examples/NN2_iter_80000.caffemodel
I0511 15:53:06.138357 3591 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/wang/caffe-master/examples/NN2_iter_80000.solverstate
I0511 15:53:06.519551 3591 solver.cpp:337] Iteration 80000, Testing net (#0)
I0511 15:57:08.719681 3591 solver.cpp:404] Test net output #0: accuracy = 0.00164773
I0511 15:57:08.719737 3591 solver.cpp:404] Test net output #1: loss = 8.77126 (* 1 = 8.77126 loss)
I0511 15:57:09.294163 3591 solver.cpp:228] Iteration 80000, loss = 8.56576
I0511 15:57:09.294188 3591 solver.cpp:244] Train net output #0: loss = 8.56576 (* 1 = 8.56576 loss)
I0511 15:57:09.294193 3591 sgd_solver.cpp:106] Iteration 80000, lr = 0.001
I0511 17:01:19.190099 3591 solver.cpp:337] Iteration 85000, Testing net (#0)
I0511 17:05:21.148668 3591 solver.cpp:404] Test net output #0: accuracy = 0.00185606
I0511 17:05:21.148733 3591 solver.cpp:404] Test net output #1: loss = 8.77196 (* 1 = 8.77196 loss)
I0511 17:14:57.670343 3591 solver.cpp:337] Iteration 86000, Testing net (#0)
I0511 17:18:59.659850 3591 solver.cpp:404] Test net output #0: accuracy = 0.00181818
I0511 17:18:59.659907 3591 solver.cpp:404] Test net output #1: loss = 8.77126 (* 1 = 8.77126 loss)
I0511 17:19:00.234335 3591 solver.cpp:228] Iteration 86000, loss = 8.72875
I0511 17:19:00.234359 3591 solver.cpp:244] Train net output #0: loss = 8.72875 (* 1 = 8.72875 loss)
I0511 17:19:00.234364 3591 sgd_solver.cpp:106] Iteration 86000, lr = 0.001
I0511 17:28:36.196920 3591 solver.cpp:337] Iteration 87000, Testing net (#0)
I0511 17:32:38.181174 3591 solver.cpp:404] Test net output #0: accuracy = 0.00181818
I0511 17:32:38.181231 3591 solver.cpp:404] Test net output #1: loss = 8.771 (* 1 = 8.771 loss)
I0511 17:42:14.658293 3591 solver.cpp:337] Iteration 88000, Testing net (#0)
I0511 17:46:16.614358 3591 solver.cpp:404] Test net output #0: accuracy = 0.00188447
I0511 17:46:16.614415 3591 solver.cpp:404] Test net output #1: loss = 8.76964 (* 1 = 8.76964 loss)
I0511 17:46:17.188212 3591 solver.cpp:228] Iteration 88000, loss = 8.80409
I0511 17:46:17.188233 3591 solver.cpp:244] Train net output #0: loss = 8.80409 (* 1 = 8.80409 loss)
I0511 17:46:17.188240 3591 sgd_solver.cpp:106] Iteration 88000, lr = 0.001
I0511 17:55:53.358322 3591 solver.cpp:337] Iteration 89000, Testing net (#0)
I0511 17:59:55.305763 3591 solver.cpp:404] Test net output #0: accuracy = 0.00186553
I0511 17:59:55.305868 3591 solver.cpp:404] Test net output #1: loss = 8.76909 (* 1 = 8.76909 loss)
I0511 18:09:31.658655 3591 solver.cpp:454] Snapshotting to binary proto file /home/wang/caffe-master/examples/NN2_iter_90000.caffemodel
I0511 18:09:33.138741 3591 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/wang/caffe-master/examples/NN2_iter_90000.solverstate
I0511 18:09:33.691995 3591 solver.cpp:337] Iteration 90000, Testing net (#0)
I0511 18:13:35.626065 3591 solver.cpp:404] Test net output #0: accuracy = 0.00168561
I0511 18:13:35.626148 3591 solver.cpp:404] Test net output #1: loss = 8.76973 (* 1 = 8.76973 loss)
I0511 18:13:36.200448 3591 solver.cpp:228] Iteration 90000, loss = 8.97326
I0511 18:13:36.200469 3591 solver.cpp:244] Train net output #0: loss = 8.97326 (* 1 = 8.97326 loss)
I0511 18:13:36.200474 3591 sgd_solver.cpp:106] Iteration 90000, lr = 0.001
I0511 19:31:23.715662 3591 solver.cpp:337] Iteration 96000, Testing net (#0)
I0511 19:35:25.677780 3591 solver.cpp:404] Test net output #0: accuracy = 0.00188447
I0511 19:35:25.677836 3591 solver.cpp:404] Test net output #1: loss = 8.7695 (* 1 = 8.7695 loss)
I0511 19:35:26.251850 3591 solver.cpp:228] Iteration 96000, loss = 8.74232
I0511 19:35:26.251875 3591 solver.cpp:244] Train net output #0: loss = 8.74232 (* 1 = 8.74232 loss)
I0511 19:35:26.251880 3591 sgd_solver.cpp:106] Iteration 96000, lr = 0.001
I0511 19:45:02.057610 3591 solver.cpp:337] Iteration 97000, Testing net (#0)
I0511 19:49:04.029269 3591 solver.cpp:404] Test net output #0: accuracy = 0.00188447
I0511 19:49:04.029357 3591 solver.cpp:404] Test net output #1: loss = 8.77655 (* 1 = 8.77655 loss)
I0511 19:58:40.265120 3591 solver.cpp:337] Iteration 98000, Testing net (#0)
I0511 20:02:42.182787 3591 solver.cpp:404] Test net output #0: accuracy = 0.00183712
I0511 20:02:42.182859 3591 solver.cpp:404] Test net output #1: loss = 8.77069 (* 1 = 8.77069 loss)
I0511 20:02:42.756922 3591 solver.cpp:228] Iteration 98000, loss = 8.61745
I0511 20:02:42.756944 3591 solver.cpp:244] Train net output #0: loss = 8.61745 (* 1 = 8.61745 loss)
Due to the character limit, I had to delete some rows of the log, but that does not change the overall picture. As you can see, there is essentially no difference between "Iteration 98000" and "Iteration 0". I am really puzzled by this.
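To make that comparison concrete, the log can be reduced to its test numbers and set against the chance level of a 10575-way softmax (10575 being the `num_output` of `fc7` below). This is a minimal sketch and not part of Caffe: the function names `parse_test_log` and `chance_level` and the regular expressions are assumptions based only on the line format shown above.

```python
import math
import re

# Patterns for the two kinds of test lines seen in the log (assumed format).
ITER_RE = re.compile(r"Iteration (\d+), Testing net")
OUT_RE = re.compile(r"Test net output #\d+: (accuracy|loss) = ([0-9.eE+-]+)")

def parse_test_log(lines):
    """Return {iteration: {"accuracy": float, "loss": float}} for test passes."""
    results = {}
    current = None
    for line in lines:
        m = ITER_RE.search(line)
        if m:
            current = int(m.group(1))
            results[current] = {}
            continue
        m = OUT_RE.search(line)
        if m and current is not None:
            results[current][m.group(1)] = float(m.group(2))
    return results

def chance_level(num_classes):
    """Softmax loss and accuracy of a uniform guess over num_classes labels."""
    return math.log(num_classes), 1.0 / num_classes

# Lines copied from the log above (first and last test passes).
sample = [
    "I0510 20:53:16.677439 3591 solver.cpp:337] Iteration 0, Testing net (#0)",
    "I0510 20:57:20.822933 3591 solver.cpp:404] Test net output #0: accuracy = 3.78788e-05",
    "I0510 20:57:20.823001 3591 solver.cpp:404] Test net output #1: loss = 9.27223 (* 1 = 9.27223 loss)",
    "I0511 19:58:40.265120 3591 solver.cpp:337] Iteration 98000, Testing net (#0)",
    "I0511 20:02:42.182787 3591 solver.cpp:404] Test net output #0: accuracy = 0.00183712",
    "I0511 20:02:42.182859 3591 solver.cpp:404] Test net output #1: loss = 8.77069 (* 1 = 8.77069 loss)",
]

parsed = parse_test_log(sample)
chance_loss, chance_acc = chance_level(10575)
for iteration, values in sorted(parsed.items()):
    print(iteration, values, "loss gap to chance:", chance_loss - values["loss"])
```

The chance-level loss is ln(10575), about 9.27, which is where the run starts. After 98000 iterations the test loss is only about 0.5 below that value.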
This is the architecture of my model
name: "NN2"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
mean_file: "/home/jiayi-wei/caffe/examples/NN2/image_train_mean.binaryproto"
}
data_param {
source: "/home/jiayi-wei/caffe/examples/NN2/img_train_lmdb"
batch_size: 30
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
mean_file: "/home/jiayi-wei/caffe/examples/NN2/image_train_mean.binaryproto"
}
data_param {
source: "/home/jiayi-wei/caffe/examples/NN2/img_val_lmdb"
batch_size: 11
backend: LMDB
}
}
#first layers
layer {
name: "conv11"
type: "Convolution"
bottom: "data"
top: "conv11"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu11"
type: "ReLU"
bottom: "conv11"
top: "conv11"
}
layer {
name: "conv12"
type: "Convolution"
bottom: "conv11"
top: "conv12"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu12"
type: "ReLU"
bottom: "conv12"
top: "conv12"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv12"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
#second layers
layer {
name: "conv21"
type: "Convolution"
bottom: "pool1"
top: "conv21"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu21"
type: "ReLU"
bottom: "conv21"
top: "conv21"
}
layer {
name: "conv22"
type: "Convolution"
bottom: "conv21"
top: "conv22"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu22"
type: "ReLU"
bottom: "conv22"
top: "conv22"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv22"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
#third layers
layer {
name: "conv31"
type: "Convolution"
bottom: "pool2"
top: "conv31"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad:1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu31"
type: "ReLU"
bottom: "conv31"
top: "conv31"
}
layer {
name: "conv32"
type: "Convolution"
bottom: "conv31"
top: "conv32"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad:1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu32"
type: "ReLU"
bottom: "conv32"
top: "conv32"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv32"
top: "pool3"
pooling_param {
pool: MAX
pad:1
kernel_size: 2
stride: 2
}
}
#fourth layer
layer {
name: "conv41"
type: "Convolution"
bottom: "pool3"
top: "conv41"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad:1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu41"
type: "ReLU"
bottom: "conv41"
top: "conv41"
}
layer {
name: "conv42"
type: "Convolution"
bottom: "conv41"
top: "conv42"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad:1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu42"
type: "ReLU"
bottom: "conv42"
top: "conv42"
}
layer {
name: "conv43"
type: "Convolution"
bottom: "conv42"
top: "conv43"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad:1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu43"
type: "ReLU"
bottom: "conv43"
top: "conv43"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "conv43"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
#fifth layer
layer {
name: "conv51"
type: "Convolution"
bottom: "pool4"
top: "conv51"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad:1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu51"
type: "ReLU"
bottom: "conv51"
top: "conv51"
}
layer {
name: "conv52"
type: "Convolution"
bottom: "conv51"
top: "conv52"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad:1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu52"
type: "ReLU"
bottom: "conv52"
top: "conv52"
}
layer {
name: "conv53"
type: "Convolution"
bottom: "conv52"
top: "conv53"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad:1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv53"
top: "pool5"
pooling_param {
pool: AVE
pad:1
kernel_size: 2
stride: 2
}
}
#drop_Fc
layer {
name: "dropout"
type: "Dropout"
bottom: "pool5"
top: "pool5"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output:1000
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output:10575
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc7"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "SoftMax"
type: "SoftmaxWithLoss"
bottom: "fc7"
bottom: "label"
top: "SoftMax"
}
The following is my solver. I have changed base_lr to 0.001.
net: "train_val.prototxt"
test_iter: 10000
test_interval: 1000
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "/home/jiayi-wei/caffe/examples/NN2"
solver_mode: GPU
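For reference, Caffe's "step" policy computes the learning rate as base_lr * gamma ^ floor(iter / stepsize). The sketch below plugs in the solver values above; the helper name `step_lr` is made up for illustration. It shows why every `lr =` line in the log reads 0.001: the first drop is not scheduled until iteration 100000. It also works out how many images one test pass covers (test_iter times the TEST batch_size).

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=100000):
    """Learning rate under lr_policy "step": base_lr * gamma^(iter // stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)

# The rate stays at base_lr for the whole range covered by the log.
for it in (0, 98000, 100000, 200000):
    print(it, step_lr(it))

# One test pass runs test_iter forward batches of the TEST batch_size.
test_iter, test_batch_size = 10000, 11
images_per_test_pass = test_iter * test_batch_size
print("images per test pass:", images_per_test_pass)
```

Whether 110000 images per test pass matches the size of the validation LMDB is not stated in the post, so that is worth checking separately.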
I have tried changing some parameters, and I have already tried removing one "conv" layer from the blocks that have three "conv" layers. However, the result always stays the same as shown above.
Please tell me how I can track down the problem. Thanks.
Upvotes: 0
Views: 4200
Reputation: 1608
From your log, it seems that your model keeps predicting the same label throughout training; in other words, your training diverged. I advise you to make the following check.
By the way, you are training a classifier over 10575 classes with only about 40 training samples per class, so to some extent the training data is insufficient. As in the baseline work referenced below, it is better to add a Contrastive cost alongside the Softmax cost, to improve the model's ability to tell same-identity samples from different-identity ones.
Reference: Sun Y, Chen Y, Wang X, et al. Deep learning face representation by joint identification-verification. Advances in Neural Information Processing Systems, 2014: 1988-1996.
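As a rough illustration of what a contrastive (verification) cost computes, here is a minimal pure-Python sketch. It assumes the commonly used form: half the squared distance for a same-identity pair, and half the squared hinge on a margin for a different-identity pair. The function name, the margin value, and the toy feature vectors are placeholders, not values taken from the paper or from the network in the question.

```python
import math

def contrastive_loss(f_i, f_j, same_identity, margin=1.0):
    """Contrastive cost for one pair of feature vectors (assumed form).

    Same identity:      0.5 * d^2               (pull features together)
    Different identity: 0.5 * max(0, m - d)^2   (push apart up to margin m)
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(f_i, f_j)))
    if same_identity:
        return 0.5 * dist ** 2
    return 0.5 * max(0.0, margin - dist) ** 2

anchor = [0.0, 0.0]
near = [0.3, 0.4]   # distance 0.5 from anchor
far = [3.0, 4.0]    # distance 5.0 from anchor

print(contrastive_loss(anchor, near, same_identity=True))    # penalised for not being closer
print(contrastive_loss(anchor, near, same_identity=False))   # penalised for being inside the margin
print(contrastive_loss(anchor, far, same_identity=False))    # already beyond the margin
```

In training, this term would be added to the softmax loss with some weight, computed on features from pairs of images. In Caffe the corresponding layer type is `ContrastiveLoss`, as used in the siamese MNIST example.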
Upvotes: 0
Reputation: 2148
Your base_lr seems to be high. Start with a base_lr of 0.001 and go on reducing it by a factor of 10 whenever you stop seeing improvement in accuracy for several thousand iterations.
NOTE: This is just a rule of thumb; it may not work in all cases.
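The rule of thumb can be written down as a small helper. This is a sketch only: the function name `next_lr`, the `patience` of 3 evaluations (3000 iterations at test_interval 1000), and the `min_delta` threshold are assumed values, and this is a manual procedure rather than something the solver in the question does on its own.

```python
def next_lr(lr, accuracies, patience=3, min_delta=1e-4, factor=0.1):
    """Return lr * factor if the last `patience` test accuracies did not beat
    the best earlier accuracy by at least min_delta; otherwise return lr."""
    if len(accuracies) <= patience:
        return lr  # not enough history to judge yet
    best_before = max(accuracies[:-patience])
    recent_best = max(accuracies[-patience:])
    if recent_best < best_before + min_delta:
        return lr * factor
    return lr

# Test accuracies at iterations 1000..5000, taken from the log in the question.
history = [0.00186553, 0.00144886, 0.00140152, 0.00149621, 0.00118371]
print(next_lr(0.001, history))
```

With these numbers the accuracy has not improved over the last three evaluations, so the helper suggests dropping the rate by a factor of 10.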
Upvotes: 1