Reputation: 95
I am working on an R package that contains a binary compiled from C/C++ code. Something in that compiled code causes random crashes on Windows (7, 64-bit), but not on Linux (various systems and configurations). My R version is 2.15.0.
I am not sure how to debug it, since I can't figure out the information given when R crashes:
Problem signature:
Problem Event Name: BEX64
Application Name: Rterm.exe
Application Version: 2.150.58871.0
Application Timestamp: 4f75a75a
Fault Module Name: StackHash_2264
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 00000000
Exception Offset: 0000000000000000
Exception Code: c0000005
Exception Data: 0000000000000008
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 1037
Additional Information 1: 2264
Additional Information 2: 2264db07e74365624c50317d7b856ae9
Additional Information 3: 875f
Additional Information 4: 875fa2ef9d2bdca96466e8af55d1ae6e
Can I learn anything from the fact that the fault module is reported as StackHash?
Some additional information:
As per the R documentation, I ran Valgrind on Linux, and it reported no issues. I tried the "gctorture" feature, but it did not appear to affect the behavior of the bug in any way.
I use pthreads in my code to utilize multicore CPUs. When I disable the use of multithreading (using a preprocessor define that I have), the problem seems to go away, but I can't be sure if this really eliminates the problem, or simply makes it less likely to occur.
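For illustration, a minimal sketch of this kind of compile-time switch. `USE_THREADS`, `work_t`, `sum_range` and `sum_all` are hypothetical names, not the ones from the package. The same worker function runs either on pthreads or serially, so both builds can be compared on identical input:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* USE_THREADS is a hypothetical macro: build with -DUSE_THREADS for threads. */
#define N_WORKERS 4

typedef struct {
    const double *data;  /* shared input, only read by the workers */
    int begin, end;      /* half-open range [begin, end) for this worker */
    double partial;      /* written only by the worker that owns this slot */
} work_t;

static void *sum_range(void *arg)
{
    work_t *w = (work_t *)arg;
    double s = 0.0;
    for (int i = w->begin; i < w->end; ++i)
        s += w->data[i];
    w->partial = s;
    return NULL;
}

double sum_all(const double *data, int n)
{
    work_t work[N_WORKERS];
    double total = 0.0;
    assert(data != NULL && n >= 0);
    int chunk = (n + N_WORKERS - 1) / N_WORKERS;

    /* Split the input into disjoint ranges, clamped to n. */
    for (int t = 0; t < N_WORKERS; ++t) {
        int b = t * chunk, e = (t + 1) * chunk;
        work[t].data = data;
        work[t].begin = b < n ? b : n;
        work[t].end = e < n ? e : n;
        work[t].partial = 0.0;
    }

#ifdef USE_THREADS
    pthread_t tid[N_WORKERS];
    int started[N_WORKERS];
    for (int t = 0; t < N_WORKERS; ++t)
        started[t] = (pthread_create(&tid[t], NULL, sum_range, &work[t]) == 0);
    for (int t = 0; t < N_WORKERS; ++t) {
        if (started[t])
            pthread_join(tid[t], NULL);
        else
            sum_range(&work[t]);  /* fall back to serial if creation failed */
    }
#else
    for (int t = 0; t < N_WORKERS; ++t)
        sum_range(&work[t]);
#endif

    /* Results are combined on the calling thread only, after all joins. */
    for (int t = 0; t < N_WORKERS; ++t)
        total += work[t].partial;
    return total;
}
```

In this sketch the workers touch only plain C memory and each writes to its own slot, so the threaded and serial builds should give the same result. If a real package behaves differently between the two builds, shared state written by more than one thread is the first thing to look for.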
I am not using enough memory to cause trouble on the machine I'm using. I also have some recursive calls, but the recursion seems far too shallow to overflow the stack, unless threads get very limited stacks on Windows?
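One way to take the stack-size question out of the picture is to request an explicit stack size when creating each thread, since the default differs between platforms and pthreads implementations. A minimal sketch, assuming the POSIX `pthread_attr_setstacksize` call is available in the pthreads library in use; `recurse`, `worker` and `run_with_stack` are hypothetical names for illustration:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical recursive routine: returns the depth it was asked to reach. */
static int recurse(int depth)
{
    volatile char pad[256];  /* forces each level to use some stack */
    pad[0] = 1;
    if (depth == 0)
        return 0;
    return recurse(depth - 1) + pad[0];
}

static void *worker(void *arg)
{
    int *depth = (int *)arg;
    *depth = recurse(*depth);
    return NULL;
}

/* Runs worker() on a thread with an explicit stack size.
   Returns 0 on success, or the pthread error code otherwise. */
int run_with_stack(size_t stack_bytes, int *depth)
{
    pthread_attr_t attr;
    pthread_t tid;
    assert(depth != NULL);

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, depth);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, NULL);
}
```

If the crash frequency does not change when the requested size is raised substantially (say, to 8 MiB per thread), a plain stack overflow becomes a less likely explanation.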
Due to the multithreading randomness, and the low probability of the bug, I am having a very hard time isolating it using prints to the console or log files.
Any pointers would be much appreciated.
Upvotes: 2
Views: 624
Reputation: 129374
I would load up R in a debugger, just run it until it crashes, and then see where it stopped.
The error looks like a null-pointer access: exception code c0000005 is an access violation, and 0000000000000008 is a tiny address of the kind you get by offsetting from NULL. I'm pretty sure it's not a stack problem.
You should be able to see where it crashes.
Upvotes: 2