Reputation: 304
Code that uses OpenCV and Caffe has worked on every Linux device I have tested it on. However, running it on a Jetson TX2, where everything installed successfully, causes a segmentation fault with this stack trace:
nvidia@tegra-ubuntu:~/Desktop$ gdb ./main
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
This GDB was configured as "aarch64-linux-gnu".
Reading symbols from ./main...done.
(gdb) r
Starting program: /home/nvidia/Desktop/main
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x0000007fb5e5d14c in google::protobuf::Arena::AllocateAligned(std::type_info const*, unsigned long) () from /usr/local/lib/libopencv_dnn.so.3.3
(gdb) bt
#0 0x0000007fb5e5d14c in google::protobuf::Arena::AllocateAligned(std::type_info const*, unsigned long) () from /usr/local/lib/libopencv_dnn.so.3.3
#1 0x0000007fb5e5d248 in google::protobuf::Arena::AddListNode(void*, void (*)(void*)) ()
from /usr/local/lib/libopencv_dnn.so.3.3
#2 0x0000007fb5eaaf34 in google::protobuf::FileDescriptorProto::New(google::protobuf::Arena*) const [clone .localalias.409] () from /usr/local/lib/libopencv_dnn.so.3.3
#3 0x0000007fad71bfc4 in google::protobuf::MessageLite::ParseFromArray(void const*, int)
() from /usr/lib/aarch64-linux-gnu/libprotobuf.so.9
#4 0x0000007fad763e70 in google::protobuf::EncodedDescriptorDatabase::Add(void const*, int) () from /usr/lib/aarch64-linux-gnu/libprotobuf.so.9
#5 0x0000007fad726d30 in google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, int) () from /usr/lib/aarch64-linux-gnu/libprotobuf.so.9
#6 0x0000007fad7560bc in google::protobuf::protobuf_AddDesc_google_2fprotobuf_2fdescriptor_2eproto() () from /usr/lib/aarch64-linux-gnu/libprotobuf.so.9
#7 0x0000007fb7fdfb18 in call_init (l=<optimized out>, argc=argc@entry=1,
argv=argv@entry=0x7ffffff478, env=env@entry=0x7ffffff488) at dl-init.c:72
#8 0x0000007fb7fdfc60 in call_init (env=0x7ffffff488, argv=0x7ffffff478, argc=1,
l=<optimized out>) at dl-init.c:30
#9 _dl_init (main_map=0x7fb8000190, argc=1, argv=0x7ffffff478, env=0x7ffffff488)
at dl-init.c:120
#10 0x0000007fb7fd2d44 in _dl_start_user () from /lib/ld-linux-aarch64.so.1
Protobuf shows up a lot in the trace, so I installed protobuf 3.3 and recompiled everything, but that did not help either. Simple code examples using OpenCV do work, and Caffe's runtest passed. How do I search for a solution to this segfault?
Arvids
Upvotes: 0
Views: 535
Reputation: 213937
How do I search for a solution to this segfault?
You don't.
Instead of searching for a solution, you find it yourself, by debugging the problem.
The first step should likely be to install the debug info package for libopencv_dnn.so.3.3, or to build it from source with debug symbols, so you can understand where in the Arena allocator your code is crashing.
Just as with any crash in malloc, the problem is most likely in the user code, not in the Arena allocator itself. The problem could be a stray write (i.e. random corruption) or, more likely, API misuse (e.g. calling Arena::Deallocate on something that wasn't allocated from that Arena).
P.S. The bug likely exists on other architectures, but hasn't announced itself yet. Heap corruption bugs often do that.
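To make the "wrong arena" kind of misuse concrete, here is a minimal sketch of a toy bump allocator that checks ownership before accepting a pointer back. It is not protobuf's Arena and does not reproduce its API; ToyArena, Owns, Release and ForeignPointerIsRejected are hypothetical names chosen for illustration. Real allocators usually skip such a check for speed, which is why handing them a foreign pointer tends to corrupt state silently and crash much later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy bump allocator. NOT protobuf's Arena; every name here is hypothetical.
class ToyArena {
public:
    explicit ToyArena(std::size_t capacity) : buffer_(capacity), used_(0) {
        assert(capacity > 0);
    }

    // Hands out `size` bytes rounded up to a multiple of 8,
    // or nullptr when the request does not fit in the remaining space.
    void* Allocate(std::size_t size) {
        if (size > buffer_.size()) return nullptr;
        const std::size_t aligned = (size + 7) & ~static_cast<std::size_t>(7);
        if (aligned > buffer_.size() - used_) return nullptr;
        void* p = buffer_.data() + used_;
        used_ += aligned;
        return p;
    }

    // True only if `p` points into the part of the buffer already handed out.
    bool Owns(const void* p) const {
        const auto addr = reinterpret_cast<std::uintptr_t>(p);
        const auto begin = reinterpret_cast<std::uintptr_t>(buffer_.data());
        return addr >= begin && addr < begin + used_;
    }

    // Checked release: refuses pointers that did not come from this arena.
    // A real arena would update its bookkeeping here; this sketch only validates.
    bool Release(const void* p) const { return Owns(p); }

private:
    std::vector<unsigned char> buffer_;
    std::size_t used_;
};

// Returns true when arena `a` accepts its own pointer and rejects one from `b`.
bool ForeignPointerIsRejected() {
    ToyArena a(64), b(64);
    void* from_a = a.Allocate(16);
    void* from_b = b.Allocate(16);
    return a.Release(from_a) && !a.Release(from_b);
}
```

Calling ForeignPointerIsRejected() from a small main is enough to exercise the check. In a debug build, a similar ownership assertion placed at the boundary between your code and the allocator can turn a delayed, hard-to-place segfault into an immediate failure at the offending call.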
Upvotes: 1