Reputation: 23
In my application, an Erlang node polls a C-node once per second to gather periodic data such as alarm and performance data.
The C-node consists of two threads: the main thread receives requests from the Erlang node and passes each message to the worker thread, which serves the query and replies to the Erlang node. To reply, all the data collected in the C-node needs to be converted to Erlang format (in my case, a list of tuples) using the function erl_format.
The problem is that after running for approximately 45 minutes, I get a glibc error that complains about memory corruption.
What could be the probable cause for this?
I'm using version 3.9 of the erl_interface libraries, which are compiled with thread-safe options (such as _REENTRANT).
Below is the log in which glibc complains about possible memory corruption. When I ran addr2line on the address 0x101bb12c, it pointed to erl_format().
*** glibc detected *** /root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode: malloc(): memory corruption (fast): 0x1021fb08 ***
======= Backtrace: =========
/lib/libc.so.6[0xfd84610]
/lib/libc.so.6[0xfd864fc]
/lib/libc.so.6(__libc_malloc+0xb4)[0xfd887b8]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode(erl_eterm_alloc+0xac)[0x101ba1fc]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode(erl_alloc_eterm+0x2c)[0x101bb848]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode(erl_mk_tuple+0x94)[0x101b88c0]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode[0x101baf00]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode[0x101bb1bc]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode[0x101baf58]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode[0x101bb300]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode[0x101bb12c]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode(erl_format+0x7c)[0x101bb12c]
/root/rel-1.0.0/galaxy/lib/galaxy-1.6.0/priv/hyphy_cnode(query_handler+0x4264)[0x100235fc]
/lib/libpthread.so.0[0xff967f4]
/lib/libc.so.6(clone+0x8c)[0xfde226c]
======= Memory map: ========
00100000-00103000 r-xp 00100000 00:00 0 [vdso]
0fc31000-0fc41000 r-xp 00000000 fd:01 3213 /lib/libresolv-2.5.so
0fc41000-0fc50000 ---p 00010000 fd:01 3213 /lib/libresolv-2.5.so
0fc50000-0fc51000 r--p 0000f000 fd:01 3213 /lib/libresolv-2.5.so
0fc51000-0fc52000 rwxp 00010000 fd:01 3213 /lib/libresolv-2.5.so
0fc52000-0fc54000 rwxp 0fc52000 00:00 0
0fc64000-0fc68000 r-xp 00000000 fd:01 3214 /lib/libnss_dns-2.5.so
0fc68000-0fc77000 ---p 00004000 fd:01 3214 /lib/libnss_dns-2.5.so
0fc77000-0fc78000 r--p 00003000 fd:01 3214 /lib/libnss_dns-2.5.so
0fc78000-0fc79000 rwxp 00004000 fd:01 3214 /lib/libnss_dns-2.5.so
0fc89000-0fc93000 r-xp 00000000 fd:01 3223 /lib/libnss_nis-2.5.so
0fc93000-0fca2000 ---p 0000a000 fd:01 3223 /lib/libnss_nis-2.5.so
0fca2000-0fca3000 r--p 00009000 fd:01 3223 /lib/libnss_nis-2.5.so
0fca3000-0fca4000 rwxp 0000a000 fd:01 3223 /lib/libnss_nis-2.5.so
0fcb4000-0fcc0000 r-xp 00000000 fd:01 3243 /lib/libnss_nisplus-2.5.so
0fcc0000-0fccf000 ---p 0000c000 fd:01 3243 /lib/libnss_nisplus-2.5.so
0fccf000-0fcd0000 r--p 0000b000 fd:01 3243 /lib/libnss_nisplus-2.5.so
0fcd0000-0fcd1000 rwxp 0000c000 fd:01 3243 /lib/libnss_nisplus-2.5.so
0fce1000-0fceb000 r-xp 00000000 fd:01 3240 /lib/libnss_files-2.5.so
0fceb000-0fcfa000 ---p 0000a000 fd:01 3240 /lib/libnss_files-2.5.so
0fcfa000-0fcfb000 r--p 00009000 fd:01 3240 /lib/libnss_files-2.5.so
0fcfb000-0fcfc000 rwxp 0000a000 fd:01 3240 /lib/libnss_files-2.5.so
0fd0c000-0fe49000 r-xp 00000000 fd:01 3215 /lib/libc-2.5.so
0fe49000-0fe59000 ---p 0013d000 fd:01 3215 /lib/libc-2.5.so
0fe59000-0fe5b000 r--p 0013d000 fd:01 3215 /lib/libc-2.5.so
0fe5b000-0fe5e000 rwxp 0013f000 fd:01 3215 /lib/libc-2.5.so
0fe5e000-0fe61000 rwxp 0fe5e000 00:00 0
0fe71000-0fe7a000 r-xp 00000000 fd:01 3272 /lib/librt-2.5.so
0fe7a000-0fe89000 ---p 00009000 fd:01 3272 /lib/librt-2.5.so
0fe89000-0fe8a000 r--p 00008000 fd:01 3272 /lib/librt-2.5.so
0fe8a000-0fe8b000 rwxp 00009000 fd:01 3272 /lib/librt-2.5.so
0fe8b000-0fe96000 rwxp 0fe8b000 00:00 0
0fea6000-0ff49000 r-xp 00000000 fd:01 3211 /lib/libm-2.5.so
0ff49000-0ff58000 ---p 000a3000 fd:01 3211 /lib/libm-2.5.so
0ff58000-0ff59000 r--p 000a2000 fd:01 3211 /lib/libm-2.5.so
0ff59000-0ff5d000 rwxp 000a3000 fd:01 3211 /lib/libm-2.5.so
0ff6d000-0ff70000 r-xp 00000000 fd:01 3202 /lib/libdl-2.5.so
0ff70000-0ff7f000 ---p 00003000 fd:01 3202 /lib/libdl-2.5.so
0ff7f000-0ff80000 r--p 00002000 fd:01 3202 /lib/libdl-2.5.so
0ff80000-0ff81000 rwxp 00003000 fd:01 3202 /lib/libdl-2.5.so
0ff91000-0ffa6000 r-xp 00000000 fd:01 3246 /lib/libpthread-2.5.so
0ffa6000-0ffb5000 ---p 00015000 fd:01 3246 /lib/libpthread-2.5.so
0ffb5000-0ffb6000 r--p 00014000 fd:01 3246 /lib/libpthread-2.5.so
0ffb6000-0ffb7000 rwxp 00015000 fd:01 3246 /lib/libpthread-2.5.so
0ffb7000-0ffb9000 rwxp 0ffb7000 00:00 0
0ffc9000-
Upvotes: 0
Views: 140
Reputation: 213686
What could be the probable cause for this?
The most probable cause is heap corruption due to a bug somewhere in your program.
Some of the bugs that cause heap corruption: writing past the end of a malloced buffer, freeing unallocated memory, freeing memory twice, writing to a dangling buffer (allocated but already freed), etc.
The standard tools to debug such programs are valgrind (doesn't require rebuilding the program) and Address Sanitizer (usually much faster and more precise, but requires rebuilding everything).
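As a minimal sketch of two of the bug classes listed above (this is not taken from your code, and copy_checked and free_and_clear are hypothetical helper names introduced only for illustration): the first helper refuses to write past the end of a malloced buffer, and the second makes an accidental second free of the same pointer harmless.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy src into a freshly malloced buffer of exactly
 * cap bytes. Returns NULL instead of overflowing when src plus its
 * terminating '\0' does not fit.
 * The buggy equivalent would be: strcpy(malloc(5), "alarm");
 * which writes 6 bytes into a 5-byte buffer and corrupts the heap. */
char *copy_checked(const char *src, size_t cap)
{
    size_t need = strlen(src) + 1;   /* bytes required, including '\0' */
    if (need > cap)
        return NULL;                 /* would write past the end */

    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;                 /* allocation failed */

    memcpy(buf, src, need);
    return buf;
}

/* Hypothetical helper: free *p and reset it to NULL. Calling it again on
 * the same variable ends up as free(NULL), which is a no-op, and a later
 * dereference hits NULL instead of silently reusing freed memory. */
void free_and_clear(char **p)
{
    assert(p != NULL);
    free(*p);
    *p = NULL;
}
```

Such helpers only cover the call sites that use them. To find where the heap actually gets damaged, the tools above still apply: run the unmodified binary under valgrind, or, if your compiler supports it, rebuild with -g -fsanitize=address so the first bad write or free is reported at the point where it happens rather than at a later malloc().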
Upvotes: 0