Eigen sum in custom tensorflow C++ op with Eigen::half on GPUDevice

Question

I'm trying to create a custom TensorFlow op. I have gotten some ops to work using https://www.tensorflow.org/versions/master/how_tos/adding_an_op/index.html and plain C++.

The problem arises when using the Eigen C++ library and its .sum reducer. It works on the CPU with the double, float and Eigen::half types, but on the GPU it fails at compile time when using Eigen::half.

I have reduced the problem to a copy of the l2loss_op from https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/kernels, where I have renamed L2Loss to CustomL2Loss (otherwise I get name conflicts). See: https://gist.github.com/AndreasMadsen/4335215cd4293daad3cad745bbeae82a

The error is quite long: https://gist.github.com/AndreasMadsen/5cd0579267f0bc3e5a1c21f2341d9ad6

Since every case other than Eigen::half on the GPU works (confirmed by commenting the line out in l2loss_op.cu.cc), I wondered whether this was a TensorFlow issue. However, I can compile TensorFlow itself.

Benoit Steiner · Accepted Answer

Support for half floats requires a CUDA architecture greater than or equal to 3.5. You need to compile with the -arch compute_35 flag to enable the corresponding instructions.
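As a rough illustration of the threshold above, here is a minimal C++ sketch that checks whether a given compute capability meets the 3.5 requirement and builds the matching architecture name for the -arch flag. The names ComputeCapability, MeetsHalfThreshold and ArchName are hypothetical helpers invented for this example; they are not part of TensorFlow, Eigen, or the CUDA toolkit, and the sketch only encodes the rule as stated in this answer.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type for this sketch; not a TensorFlow, Eigen, or CUDA type.
struct ComputeCapability {
  int major;
  int minor;
};

// Returns true when the capability is at least 3.5, the threshold
// stated in the answer for Eigen::half support on the GPU.
inline bool MeetsHalfThreshold(ComputeCapability cc) {
  return cc.major > 3 || (cc.major == 3 && cc.minor >= 5);
}

// Builds the architecture name passed to -arch, e.g. {3, 5} -> "compute_35".
// Assumes a single-digit minor version.
inline std::string ArchName(ComputeCapability cc) {
  assert(cc.major >= 1 && cc.minor >= 0 && cc.minor <= 9);
  return "compute_" + std::to_string(cc.major) + std::to_string(cc.minor);
}
```

For example, ArchName({3, 5}) yields the string "compute_35" used in the flag above, while MeetsHalfThreshold({3, 0}) is false, which under this rule would be a target where the Eigen::half kernel should not be compiled.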
