Eddy

Reputation: 41

undefined symbol: _ZTIN10tensorflow8OpKernelE

I just updated TensorFlow with pip3 (now at version 1.4.1), and since then I have been having problems:

I have a custom op library that I compile with -D _GLIBCXX_USE_CXX11_ABI=0. The library compiles and links fine. Importing it into tensorflow gives:

Traceback (most recent call last):
  ...
  File "../x.py", line 29, in <module>
    lib = tf.load_op_library(_lib_path)
  File "/home/ilge/.local/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
  File "/home/ilge/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /path/to/mylib.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

It seems that it cannot resolve general TensorFlow symbols. Any hints on how to debug this would be much appreciated. Note that everything was working before the update and before I recompiled.

Upvotes: 4

Views: 12888

Answers (4)

Anis

Reputation: 3094

I was compiling and linking in two separate steps in my Makefile, and passing the proper link flags at link time wasn't enough. I also had to pass -Wl,--no-as-needed to the linker, because for some reason gcc was dropping the library from the final module (as shown by ldd).

So my Makefile looks like this:

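# Ask the installed TensorFlow for its own compile and link flags.
TF_CFLAGS:=$(shell python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))')
TF_LFLAGS:=$(shell python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))')

all: myop_ops.so

# Compile each source file to a position-independent object (recipe lines must start with a tab).
%.o: %.cc
	g++ -fPIC $(TF_CFLAGS) -O2 -std=c++11 -I/usr/local/include -c $< -o $@

# Link step: --no-as-needed keeps the TensorFlow library in the final module.
myop_ops.so: myfile1.o myfile2.o myop_kernel.o myop_ops.o
	g++ -shared -Wl,--no-as-needed $(TF_LFLAGS) -o $@ $^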
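
To confirm that the linker really kept the library, you can inspect the ldd output of the built module. Here is a rough sketch of that check, assuming a Linux system with ldd on the PATH; the helper names links_framework and ldd are mine, not part of TensorFlow:

```python
import subprocess


def links_framework(ldd_output, needle="libtensorflow_framework"):
    """Return True if any line of `ldd` output mentions the framework library."""
    return any(needle in line for line in ldd_output.splitlines())


def ldd(path):
    """Run `ldd` on a shared object and return its text output (Linux only)."""
    result = subprocess.run(["ldd", path], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout


if __name__ == "__main__":
    import sys
    # Usage: python check_link.py myop_ops.so  (script name is hypothetical)
    if len(sys.argv) > 1:
        ok = links_framework(ldd(sys.argv[1]))
        print("linked against libtensorflow_framework:", ok)
```

If this prints False for your .so, the library was probably dropped at link time, which is the situation --no-as-needed worked around for me.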

Upvotes: 2

aobai

Reputation: 7

pip uninstall -y horovod
pip install --no-cache-dir horovod

Upvotes: -1

Allen Lavoie

Reputation: 5808

See the updated custom op instructions: https://www.tensorflow.org/extend/adding_an_op#compile_the_op_using_your_system_compiler_tensorflow_binary_installation

In particular:

>>> tf.sysconfig.get_link_flags()
['-L/usr/local/lib/python3.6/dist-packages/tensorflow', '-ltensorflow_framework']

Custom ops are now (in TensorFlow 1.4+) registered by linking against libtensorflow_framework.so. Previously TensorFlow loaded the necessary symbols into the global symbol table for the Python process (using RTLD_GLOBAL).
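
As a rough illustration, the compile command from those instructions can be assembled from the two sysconfig calls. This is only a sketch: build_command is a hypothetical helper, and myop.cc / myop.so are placeholder file names, not anything from your project.

```python
def build_command(sources, output, compile_flags, link_flags):
    """Assemble a g++ command line for a custom op shared library.

    The flag order mirrors the single-step command in the custom op
    instructions: sources first, TensorFlow's flags after them.
    """
    return (["g++", "-std=c++11", "-shared"]
            + list(sources)
            + ["-o", output, "-fPIC"]
            + list(compile_flags)
            + list(link_flags)
            + ["-O2"])


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        tf = None
    if tf is not None:
        # File names below are placeholders for illustration only.
        cmd = build_command(["myop.cc"], "myop.so",
                            tf.sysconfig.get_compile_flags(),
                            tf.sysconfig.get_link_flags())
        print(" ".join(cmd))
```

The printed line can be pasted into a shell, or the list passed to subprocess. The important part is that the link flags, including -ltensorflow_framework, end up on the command that produces the .so.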

Upvotes: 2

titan

Reputation: 122

There can be a compatibility issue between the TensorFlow and gcc versions. Check which version of gcc your TensorFlow build was compiled with, and use that version to compile your custom op library. For example, I installed TensorFlow 1.6.0 with Anaconda2, which uses gcc 7.2, so I kept getting the same error as you when I compiled custom ops with gcc 4.8/4.9/5.3. Finally, I tried gcc 7.3 and it worked.
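
A minimal sketch of that check is below. Two things here are assumptions on my part: that comparing major versions is a good enough test (it matches what I saw with 7.2 vs 7.3, but it is not a documented rule), and that your TensorFlow build exposes the compiler string as tf.__compiler_version__, which is why the code falls back gracefully if the attribute is missing. The helper names are hypothetical.

```python
import re


def major_version(version_text):
    """Extract the major number from the first 'X.Y' pattern in the text."""
    match = re.search(r"(\d+)\.\d+", version_text)
    return int(match.group(1)) if match else None


def same_major(tf_compiler, local_compiler):
    """True when both strings contain a version and the major numbers agree."""
    a = major_version(tf_compiler)
    b = major_version(local_compiler)
    return a is not None and a == b


if __name__ == "__main__":
    import subprocess
    try:
        import tensorflow as tf
        # Attribute name is an assumption; default to "" if it is absent.
        built_with = getattr(tf, "__compiler_version__", "")
    except ImportError:
        built_with = ""
    try:
        local = subprocess.run(["g++", "--version"], stdout=subprocess.PIPE,
                               universal_newlines=True).stdout
    except OSError:
        local = ""
    print("TensorFlow compiler:", built_with or "unknown")
    print("same major version as local g++:", same_major(built_with, local))
```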

Upvotes: 0
