Reputation: 1099
I'm asking this question because we are really stuck on finding the cause of a software crash. I know that questions like "Why does the software crash?" are not appreciated, but we really don't know how to find the problem.
We are currently running a long-term test of our software. To find potential memory leaks, we used the Windows tool Performance Monitor to track several memory metrics, such as Private Bytes, Working Set and Virtual Bytes.
The software ran for quite a long time (about 30 hours) without any problems. It does the same thing all the time: it reads an image from the hard drive, does some inspection and shows some results.
Then it suddenly crashed. Inspecting the memory metrics in Performance Monitor, we saw a strange, steep rise in the Working Set graph at 10:17 AM. We have encountered this several times, and according to the dump files the exception code is always 0xc0000005 ("the thread tried to read from or write to a virtual address for which it does not have the appropriate access"), but it appears at different positions where no pointers are used.
Does anyone know what could cause such a steep rise of the working set, and why this could cause a software crash? How can we find out whether our software has a bug when the crash occurs at a different position every time?
The application is written in C++ and runs on a Windows 7 32-bit PC.
Upvotes: 0
Views: 264
Reputation: 1638
Given the information you have now, there is little chance of getting an answer. You need more information, more specifically:
Get more intelligence (is there anything specific about the files that cause the crash? What about the last-but-one file?)
Insert more tracing and logging (as much as you can without making it 2x slower). At least you'll see where it crashes, and then you will be able to insert more tracing/logging around that place.
As you're on Windows, consider handling c0000005 via _set_se_translator (which requires compiling with /EHa), converting it into a C++ exception, and adding even more logging along the path on which this exception is unwound.
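To make the _set_se_translator suggestion concrete, here is a minimal sketch. The names SehException, translateSeh and installSehTranslator are made up for illustration; only _set_se_translator itself is the real CRT function. It assumes MSVC with /EHa, and the translator has to be installed separately on every thread that should get the translation.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical exception type that carries the structured-exception code.
class SehException : public std::runtime_error {
public:
    explicit SehException(unsigned int code)
        : std::runtime_error(describe(code)), code_(code) {}

    unsigned int code() const { return code_; }

    // Formats the code as hex, e.g. 0xC0000005 for an access violation.
    static std::string describe(unsigned int code) {
        char buf[48];
        std::snprintf(buf, sizeof buf, "structured exception 0x%08X", code);
        return buf;
    }

private:
    unsigned int code_;
};

#ifdef _MSC_VER
#include <eh.h>       // _set_se_translator
#include <windows.h>  // EXCEPTION_POINTERS

// Called by the runtime when a structured exception is raised on this thread.
void translateSeh(unsigned int code, EXCEPTION_POINTERS* /*info*/) {
    throw SehException(code);
}

// Call once at the start of every thread that should get the translation.
void installSehTranslator() {
    _set_se_translator(translateSeh);
}
#endif
```

With this in place you could wrap the inspection loop in try/catch, log e.what() together with the current file name in the handler, and then terminate. After an access violation caused by memory corruption the process state is suspect, so I would log and exit rather than try to continue.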
There is no silver bullet for this kind of problem; the only way is to gather more information and figure it out.
P.S. As a long shot: I've seen similar things caused by a bug in the MS heap; if you're not using LFH yet (not sure, it might be the default now), there is a 1% chance that changing your default heap to LFH will help.
Upvotes: 1
Reputation: 3731
It's actually impossible to know from the information you have provided, but I would suggest that you have some memory corruption (hence the access violation). It could be a buffer-overflow issue; for example, a string is missing its null terminator, so something keeps being appended indefinitely.
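To show the kind of bug I mean: strncpy does not write a terminator when the source is at least as long as the count you pass, so a later strlen or strcat runs past the end of the buffer. A minimal sketch of a copy that always terminates could look like this (copyTerminated is a made-up helper name, and I am only guessing that your code has a pattern like this somewhere):

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst (capacity dstSize bytes) and
// always null-terminates dst. Returns false if src had to be truncated
// or if dst has no room at all.
bool copyTerminated(char* dst, std::size_t dstSize, const char* src) {
    if (dstSize == 0) {
        return false;  // no room even for the terminator
    }
    const std::size_t len = std::strlen(src);
    const bool fits = len < dstSize;
    const std::size_t n = fits ? len : dstSize - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';  // the step that a bare strncpy can skip
    return fits;
}
```

Checking the return value lets you log truncated names instead of silently working with a cut-off string. Where possible, using std::string avoids the whole class of problem.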
The recommended next step is to download the Debugging Tools for Windows suite. Set up WinDbg with your correct symbol files and analyse the stack trace to find the general area of the crash. Depending on the cause of the memory corruption, this will be more or less useful: you could have corrupted the memory a long time before the crash occurs.
Ideally also run a static analysis tool on the code.
Upvotes: 1