Reputation: 45
I'm trying to create a custom TensorFlow op. I have gotten some ops to work using https://www.tensorflow.org/versions/master/how_tos/adding_an_op/index.html and plain C++.
The problem arises when using the Eigen C++ library and its .sum reducer. It works on the CPU with the double, float and Eigen::half types, but on the GPU it fails at compile time when using Eigen::half.
I have reduced the problem to a copy of the l2loss_op from https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/kernels, where I have renamed L2Loss to CustomL2Loss (otherwise I get name conflicts). See: https://gist.github.com/AndreasMadsen/4335215cd4293daad3cad745bbeae82a
The error is quite long: https://gist.github.com/AndreasMadsen/5cd0579267f0bc3e5a1c21f2341d9ad6
Since it works for every case except <GPUDevice, Eigen::half> (confirmed by commenting that line out in l2loss_op.cu.cc), I was wondering whether this is a TensorFlow issue. But I can compile TensorFlow itself.
Upvotes: 1
Views: 507
Reputation: 1469
Support for half-precision floats requires a CUDA architecture (compute capability) of 3.5 or higher. You need to compile with nvcc's -arch compute_35 flag to enable the corresponding instructions.
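If you want to check whether a particular GPU meets that threshold, the comparison can be sketched as below. This is only an illustration: MeetsHalfThreshold is a hypothetical helper, not part of TensorFlow, Eigen, or CUDA, and it simply encodes the 3.5 figure given above. In a real program the major and minor values would presumably come from the major and minor fields of cudaDeviceProp, as filled in by cudaGetDeviceProperties.

```cpp
// Hypothetical helper (not a TensorFlow, Eigen, or CUDA API).
// Returns true when a compute capability major.minor is at least 3.5,
// the threshold named in this answer.
constexpr bool MeetsHalfThreshold(int major, int minor) {
  return major > 3 || (major == 3 && minor >= 5);
}

// Compile-time sanity checks on the comparison itself.
static_assert(MeetsHalfThreshold(3, 5), "3.5 meets the threshold");
static_assert(!MeetsHalfThreshold(3, 0), "3.0 is below the threshold");
static_assert(MeetsHalfThreshold(5, 2), "5.2 is above the threshold");
```

Note that this only tells you about the device you run on. The compile error in the question depends on the architecture passed to nvcc, so the flag still has to be set at build time.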
Upvotes: 2