Reputation: 41
I just updated TensorFlow with pip3 (now at version 1.4.1), and since then I have been having problems.
I have a custom op library that I compile with -D _GLIBCXX_USE_CXX11_ABI=0. The library compiles and links fine, but importing it into TensorFlow gives:
Traceback (most recent call last):
...
File "../x.py", line 29, in <module>
lib = tf.load_op_library(_lib_path)
File "/home/ilge/.local/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/home/ilge/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /path/to/mylib.so: undefined symbol: _ZTIN10tensorflow8OpKernelE
It seems it cannot load general TensorFlow symbols. Any hints on how to debug this would be much appreciated. Note that everything was working before the update and before recompiling.
Upvotes: 4
Views: 12888
Reputation: 3094
I was compiling and linking in two separate steps in my Makefile, and just using the proper link flags when linking wasn't enough. I also had to pass the argument -Wl,--no-as-needed to the linker, because for some reason gcc was discarding the library in the final module (as shown by ldd).
So my Makefile looks like this:
TF_CFLAGS:=$(shell python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))')
TF_LFLAGS:=$(shell python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))')

all: myop_ops.so

%.o: %.cc
	g++ -fPIC $(TF_CFLAGS) -O2 -std=c++11 -I/usr/local/include -c $< -o $@

myop_ops.so: myfile1.o myfile2.o myop_kernel.o myop_ops.o
	g++ -shared -Wl,--no-as-needed $(TF_LFLAGS) -o $@ $^
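To confirm whether the linker kept the dependency, the ldd check mentioned above can be scripted. This is a minimal sketch: the helper names (`linked_libraries`, `links_tensorflow_framework`, `run_ldd`) are made up for illustration, and it assumes the usual `ldd` layout where each line starts with the library name.

```python
import subprocess


def linked_libraries(ldd_text):
    """Return the first field of every non-empty line of `ldd` output."""
    names = []
    for line in ldd_text.splitlines():
        fields = line.split()
        if fields:
            names.append(fields[0])
    return names


def links_tensorflow_framework(ldd_text):
    """True if libtensorflow_framework.so appears among the dependencies."""
    return any(
        name.startswith("libtensorflow_framework.so")
        for name in linked_libraries(ldd_text)
    )


def run_ldd(path):
    """Run `ldd` on a shared object and return its output as text."""
    result = subprocess.run(
        ["ldd", path],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

For example, `links_tensorflow_framework(run_ldd("myop_ops.so"))` should return False for a module where the library was discarded and True once -Wl,--no-as-needed is in effect.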
Upvotes: 2
Reputation: 5808
See the updated custom op instructions: https://www.tensorflow.org/extend/adding_an_op#compile_the_op_using_your_system_compiler_tensorflow_binary_installation
In particular:
>>> tf.sysconfig.get_link_flags()
['-L/usr/local/lib/python3.6/dist-packages/tensorflow', '-ltensorflow_framework']
Custom ops are now (in TensorFlow 1.4+) registered by linking against libtensorflow_framework.so. Previously TensorFlow loaded the necessary symbols into the global symbol table for the Python process (using RTLD_GLOBAL).
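As a rough sketch of how these flags fit into a single-step build, the snippet below assembles a g++ command in the same shape as the one in the linked instructions. `build_command` and `tensorflow_flags` are hypothetical helper names, and the source and output file names are placeholders.

```python
def build_command(sources, output, compile_flags, link_flags):
    """Assemble a one-step g++ invocation for a custom op library.

    The sources come before the link flags, so the linker sees the
    undefined TensorFlow symbols before it reaches -ltensorflow_framework.
    """
    return (
        ["g++", "-std=c++11", "-shared"]
        + list(sources)
        + ["-o", output, "-fPIC"]
        + list(compile_flags)
        + list(link_flags)
        + ["-O2"]
    )


def tensorflow_flags():
    """Query the installed TensorFlow (1.4+) for its compile and link flags."""
    import tensorflow as tf  # imported lazily so build_command works without it

    return tf.sysconfig.get_compile_flags(), tf.sysconfig.get_link_flags()
```

With TensorFlow installed, `cflags, lflags = tensorflow_flags()` followed by `subprocess.check_call(build_command(["myop.cc"], "myop.so", cflags, lflags))` would run the build; printing `" ".join(...)` of the command first is a cheap way to check that -ltensorflow_framework is actually on the link line.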
Upvotes: 2
Reputation: 122
There can be a compatibility issue between the TensorFlow and gcc versions. Check which version of gcc your TensorFlow build was compiled with, and use that version of gcc to compile your custom op library. For example, I installed TensorFlow 1.6.0 with Anaconda2, which was built with gcc 7.2, so I kept getting the same error as you when I compiled custom ops with gcc 4.8/4.9/5.3. Finally, I tried gcc 7.3 and it worked.
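One way to compare the two compilers is sketched below. It assumes the TensorFlow build exposes its compiler string as `tf.__compiler_version__` and that the local compiler answers `g++ -dumpversion`; `major_version` and `same_major` are hypothetical helpers, and matching on the major version only is a simplification rather than a guarantee of ABI compatibility.

```python
import re


def major_version(version_text):
    """Extract the leading integer from strings like '7.3.1 20180303' or '7'."""
    match = re.match(r"\s*(\d+)", version_text)
    return int(match.group(1)) if match else None


def same_major(tf_compiler_version, local_compiler_version):
    """True when both version strings parse and share a major version."""
    built_with = major_version(tf_compiler_version)
    local = major_version(local_compiler_version)
    return built_with is not None and built_with == local
```

In practice the first argument would come from `tf.__compiler_version__` and the second from `subprocess.check_output(["g++", "-dumpversion"]).decode()`; a False result suggests trying the compiler version the TensorFlow binary was built with.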
Upvotes: 0