kcc__

Reputation: 1648

Training Siamese Network in Caffe

I am trying to build a Siamese network for comparing two image samples. I followed the MNIST Siamese example in Caffe.

Rather than using fully connected layers, I want to build a fully convolutional Siamese network. I am doing this purely to learn and understand deep learning.

I created my own custom network that takes a 32 x 32 RGB image patch and runs it through the layers defined in the prototxt below. To keep it short, I deleted the other half of the network, which is just a mirror. I am also trying to learn how to use padding in convolutional layers, so I experiment with that here: you will see that I put a padding of 1 on the conv3 layer.
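
If I did the arithmetic right, the spatial sizes work out as follows (Caffe computes a convolution's output size as floor((input + 2*pad - kernel)/stride) + 1, and pooling the same way but with ceil):

32x32 input
conv1 (5x5, pad 0, stride 1)  -> 28x28
pool1 (MAX 2x2, stride 2)     -> 14x14
conv2 (1x1)                   -> 14x14
conv3 (3x3, pad 1, stride 2)  -> floor((14 + 2 - 3)/2) + 1 = 7x7
conv4 (1x1)                   -> 7x7
pool2 (AVE 7x7, stride 1)     -> 1x1

So the pad of 1 on conv3 is what makes the final 7x7 average pool come out to a single 1x1 value per branch.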

label and label2 are the same, so I used a Silence layer to discard label2.
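
The Silence layer itself is trivial: it just consumes label2 and produces no top. It is omitted from the prototxt below along with the mirrored branch, but it looks like this (the layer name is only illustrative):

layer {
  name: "silence_label2"   # illustrative name
  type: "Silence"
  bottom: "label2"
}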

layer {
  name: "data1"
  type: "Data"
  top: "data1"
  top: "label"
  include {
    phase: TRAIN
  }
  data_param {
    source: "Desktop/training/lmdb/train_1"
    batch_size: 512
    backend: LMDB
  }
}

layer {
  name: "data2"
  type: "Data"
  top: "data2"
  top: "label2"
  include {
    phase: TRAIN
  }
  data_param {
    source: "/Desktop/training/lmdb/train_2"
    batch_size: 512
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data1"
  top: "conv1"
  param {
    name: "conv1_w"
    lr_mult: 1
  }
  param {
    name: "conv1_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 0
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
      std: 0.03
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param {
    name: "conv2_w"
    lr_mult: 1
  }
  param {
    name: "conv2_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 0
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "xavier"
      std: 0.03
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "conv2"
  top: "conv3"
  param {
    name: "conv3_w"
    lr_mult: 1
  }
  param {
    name: "conv3_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
    stride: 2
    weight_filler {
      type: "xavier"
      std: 0.03
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
# layer {
#   name: "dropout"
#   type: "Dropout"
#   bottom: "conv3"
#   top: "dropout"
#   dropout_param {
#     dropout_ratio: 0.5
#   }
# }
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    name: "conv4_w"
    lr_mult: 1
  }
  param {
    name: "conv4_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 1
    pad: 0
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "xavier"
      std: 0.03
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv4"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 7
    stride: 1
  }
}

################# mirrored branch (data2 -> ... -> pool2_p) omitted for brevity #################
layer {
  name: "loss"
  type: "ContrastiveLoss"
  bottom: "pool2"
  bottom: "pool2_p"
  bottom: "label"
  top: "loss"
  contrastive_loss_param {
    margin: 1
  }
  include { 
    phase: TRAIN 
    }
}

There are a few things I am confused about:

  1. Is it safe to add padding to a convolutional layer, or can it have a destructive effect?
  2. In some papers I read on Siamese networks, L2 normalization is applied after the fully connected layer. I did not find an L2 normalization layer in Caffe, but I suppose LRN can do the same thing by setting alpha = 1 and beta = 0.5.
  3. In my network, I just average-pool the conv4 layer and use that to compute the ContrastiveLoss. Can that work, do I need to normalize the output of conv4, or am I doing something completely wrong here? (The loss formula, as I understand it, is written out just below this list.)
  4. Can the outputs of a convolutional layer be fed directly into loss functions?
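
As I understand it, Caffe's ContrastiveLoss over a batch of N pairs computes roughly

  loss = (1/2N) * sum_n [ y_n * d_n^2 + (1 - y_n) * max(margin - d_n, 0)^2 ],  with  d_n = ||pool2_n - pool2_p_n||_2

where y_n = 1 for a similar pair, so dissimilar pairs are only pushed apart up to the margin. That is why I am unsure whether the scale of pool2 needs to be controlled relative to the margin.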

I would really appreciate your help in pointing me in the right direction. I am training on about 50K sample patches of cells that I cannot publish, as the data is classified. The patches are about 25x25, so I resize them to 32x32.

Upvotes: 1

Views: 928

Answers (1)

Bharat

Reputation: 2179

Yes, it is safe to add padding to conv layers. I think you can use LRN layers for L2 normalization in the way described in the documentation. And yes, outputs of CNN layers can be fed directly into loss functions; there is nothing wrong with that, a convolutional output is just a blob, and in fully convolutional networks that is always the case. At least in theory, your output does not need to be constrained for contrastive loss to work, since it is a margin-based loss. That said, recasting the pair comparison as a binary classification problem with a softmax loss generally works just as well and avoids the normalization issues.
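
If you want to try the softmax variant, a rough sketch of the head (reusing your pool2, pool2_p and label blobs; the layer names and the 1x1 convolution classifier are only illustrative, not the one true way to do it) could look like:

layer {
  name: "feat_concat"
  type: "Concat"
  bottom: "pool2"
  bottom: "pool2_p"
  top: "feat_concat"
  concat_param { axis: 1 }   # stack the two 1x1 descriptors along the channel axis
}
layer {
  name: "cls"
  type: "Convolution"        # a 1x1 conv keeps the head fully convolutional; an InnerProduct would also work
  bottom: "feat_concat"
  top: "cls"
  convolution_param {
    num_output: 2            # two classes: same pair / different pair
    kernel_size: 1
    weight_filler { type: "xavier" }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "cls"
  bottom: "label"            # your 0/1 pair label
  top: "loss"
}

The contrastive loss cares about the scale of the embedding relative to its margin, while a softmax head learns its own scaling, which is roughly why it tends to be less sensitive to normalization.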

Upvotes: 2
