I am working on a new loss layer in Caffe. I have some values in `diff_.cpu_data()`; let's call each element $D_i$.

Now I want to compute this function for each $D_i$:

$$f(D_i) = \frac{2}{N} D_i - \frac{1}{N^2} \sum_j D_j$$

and assign the result to the corresponding element of `bottom[0]->mutable_cpu_diff()` of the layer.
As you can see, the second term does not require looping over the input and output variables (`diff_.cpu_data()` and `bottom[0]->mutable_cpu_diff()`, respectively). For the first term, however, I need to read each element of the input and then write the result of the function to the corresponding element of the output. If they were 2-D arrays, I could obviously do something like this:

But, as you know, those variables are 4-D arrays, and it is not clear to me how to do that.

Should I use the `Offset()` function, or something like it, to loop over all the elements of those variables, similar to this?
Could someone please explain it to me or refer me to a useful reference?
Thanks,
Reputation: 114796
First, you should store the result of the second term ($\frac{1}{N^2} \sum_i D_i$) in a local variable. As you already know, this sum can be computed using `caffe_cpu_dot`.

So your code may look something like:
```cpp
// A vector of ones: dot(ones, Di) equals the sum of all Di.
vector<Dtype> mult_data(diff_.count(), Dtype(1));
const Dtype* Di = diff_.cpu_data();
Dtype const_sum = caffe_cpu_dot(diff_.count(), &mult_data[0], Di);
Dtype N = diff_.count();
const_sum /= N * N;  // divide by N^2
```
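To make the "sum as a dot product with ones" step concrete, here is a minimal standalone sketch that does not depend on Caffe. It assumes `float` data in a `std::vector`, and `std::inner_product` is only a stand-in for `caffe_cpu_dot`; the function name `second_term` is hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Computes (1/N^2) * sum_i D_i, taking the sum as dot(ones, D).
// std::inner_product is a stand-in here for caffe_cpu_dot (assumption).
float second_term(const std::vector<float>& d) {
  assert(!d.empty());  // N must be non-zero
  const std::vector<float> ones(d.size(), 1.0f);
  const float sum =
      std::inner_product(ones.begin(), ones.end(), d.begin(), 0.0f);
  const float N = static_cast<float>(d.size());
  return sum / (N * N);
}
```

For the input `{1, 2, 3, 4}` the sum is 10 and N is 4, so the function returns 10/16 = 0.625.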
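Now you can loop over all items (assuming `bottom[0]->count() == diff_.count()`). A blob stores its data as one contiguous array, so a single flat index covers every element of the 4-D blob and no per-dimension indexing is needed: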
```cpp
Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
for (int i = 0; i < diff_.count(); i++) {
  bottom_diff[i] = 2 * Di[i] / N - const_sum;
}
```
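To illustrate why the flat loop is enough for a 4-D blob, here is a rough standalone sketch. `Blob4D` is a hypothetical stand-in, not Caffe's `Blob` class; it only assumes the row-major layout (num, channels, height, width) that Caffe blobs use. The sketch computes the same gradient once with a single flat index and once with four nested loops plus an explicit offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 4-D blob stored as one contiguous
// row-major array of shape (num, channels, height, width).
struct Blob4D {
  int num, channels, height, width;
  std::vector<float> data;

  Blob4D(int n, int c, int h, int w)
      : num(n), channels(c), height(h), width(w),
        data(static_cast<std::size_t>(n) * c * h * w, 0.0f) {}

  int count() const { return num * channels * height * width; }

  // Flat index of element (n, c, h, w) in row-major order.
  int offset(int n, int c, int h, int w) const {
    return ((n * channels + c) * height + h) * width + w;
  }
};

// (1/N^2) * sum_j D_j, the term shared by all elements.
float shared_term(const Blob4D& diff) {
  const float N = static_cast<float>(diff.count());
  float sum = 0.0f;
  for (float v : diff.data) sum += v;
  return sum / (N * N);
}

// grad[i] = (2/N) * D_i - shared term, using one flat loop.
std::vector<float> gradient_flat(const Blob4D& diff) {
  const float N = static_cast<float>(diff.count());
  const float const_sum = shared_term(diff);
  std::vector<float> grad(diff.data.size());
  for (std::size_t i = 0; i < grad.size(); ++i) {
    grad[i] = 2.0f * diff.data[i] / N - const_sum;
  }
  return grad;
}

// Same computation with four nested loops and explicit offsets.
std::vector<float> gradient_nested(const Blob4D& diff) {
  const float N = static_cast<float>(diff.count());
  const float const_sum = shared_term(diff);
  std::vector<float> grad(diff.data.size());
  for (int n = 0; n < diff.num; ++n) {
    for (int c = 0; c < diff.channels; ++c) {
      for (int h = 0; h < diff.height; ++h) {
        for (int w = 0; w < diff.width; ++w) {
          const int idx = diff.offset(n, c, h, w);
          grad[idx] = 2.0f * diff.data[idx] / N - const_sum;
        }
      }
    }
  }
  return grad;
}
```

Both functions visit the same elements and produce the same values. The nested version is only worth the extra code when the formula actually depends on the position (n, c, h, w), which is not the case here.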
Alternatively, if you want something more "blas"-like:
```cpp
// Copy Di to bottom_diff.
caffe_copy(diff_.count(), Di, bottom_diff);
// bottom_diff is now (2/N)*Di.
caffe_scal(diff_.count(), Dtype(2.0 / N), bottom_diff);
// Subtract the second term from all the elements.
caffe_add_scalar(diff_.count(), -const_sum, bottom_diff);
```
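As a sanity check of the three-pass version, the following standalone sketch reproduces it without Caffe. The helpers `copy_stub`, `scal_stub` and `add_scalar_stub` are hypothetical plain-C++ stand-ins that only mimic the semantics assumed here for `caffe_copy`, `caffe_scal` and `caffe_add_scalar` (copy, scale in place, add a constant in place).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins mimicking the semantics of the Caffe helpers.
void copy_stub(int n, const float* x, float* y) {
  for (int i = 0; i < n; ++i) y[i] = x[i];  // y = x
}
void scal_stub(int n, float alpha, float* x) {
  for (int i = 0; i < n; ++i) x[i] *= alpha;  // x = alpha * x
}
void add_scalar_stub(int n, float alpha, float* x) {
  for (int i = 0; i < n; ++i) x[i] += alpha;  // x = x + alpha
}

// out[i] = (2/N) * d[i] - (1/N^2) * sum_j d[j], built in three passes.
std::vector<float> gradient_three_pass(const std::vector<float>& d) {
  assert(!d.empty());  // N must be non-zero
  const int n = static_cast<int>(d.size());
  const float N = static_cast<float>(n);
  float sum = 0.0f;
  for (float v : d) sum += v;
  const float const_sum = sum / (N * N);

  std::vector<float> out(d.size());
  copy_stub(n, d.data(), out.data());          // out = d
  scal_stub(n, 2.0f / N, out.data());          // out = (2/N) * d
  add_scalar_stub(n, -const_sum, out.data());  // out -= second term
  return out;
}
```

For the input `{1, 2, 3, 4}` the second term is 10/16 = 0.625 and the scale factor is 0.5, so the result is `{-0.125, 0.375, 0.875, 1.375}`, the same values the explicit loop gives.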