bodgesoc

Reputation: 321

What can cause C code to crash when an array is initialised at declaration, but not crash if zeroed by a loop?

A bug has recently been "fixed" in a project I work on, but so far nobody has been able to explain why the fix works. (So is it really a fix?) The code is running in kernel space under a realtime system, so the problem causes a complete system lockup. This makes debugging harder than normal, too.

This version crashes the system:

int  dups[EMCMOT_MAX_AXIS] = {0};
char *coords = coordinates;
char coord_letter[] = {'X','Y','Z','A','B','C','U','V','W'};

This version does not crash:

int  dups[EMCMOT_MAX_AXIS];
char *coords = coordinates;
char coord_letter[] = {'X','Y','Z','A','B','C','U','V','W'};
int  i;
for (i=0; i<EMCMOT_MAX_AXIS; i++) {dups[i] = 0;}

To really confuse matters, this experimental version also crashes:

int  dups[EMCMOT_MAX_AXIS] = {0};
char *coords = coordinates;
char coord_letter[] = {'X','Y','Z','A','B','C','U','V','W'};
int  i;
for (i=0; i<EMCMOT_MAX_AXIS; i++) {dups[i] = 0;}

You can see the commit and the surrounding code here: https://github.com/LinuxCNC/linuxcnc/commit/ef6f36a16c7789af258d34adf4840d965f4c0b10

Upvotes: 2

Views: 184

Answers (1)

bodgesoc

Reputation: 321

Thanks to Nate Eldredge for setting up the Compiler Explorer example and to 0andriy for the %xmm0 pointer. This does look like the compiler using a register that is unsafe in kernel code (or some closely related issue): the crashing build uses the SSE register %xmm0 in that code, and kernel code cannot normally rely on SSE registers because their state is not saved and restored for it.

Experimenting on the Godbolt site, I found that the -mno-sse2 compiler flag has a similar effect to switching to gcc-6: both remove the use of %xmm0 in that code. When added to the compiler flags for the actual application build, it appears to solve the issue. More work is likely needed to get to the bottom of the right solution, but we seem to have some good pointers now.
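To compare the two initialisation styles in isolation, something like the following can be pasted into Compiler Explorer, or compiled with gcc -O2 -S, with and without -mno-sse2. It is only a sketch: the function names and the duplicate-counting logic are hypothetical stand-ins, not the LinuxCNC code, and EMCMOT_MAX_AXIS is assumed to be 9 to match the nine coordinate letters. Whether %xmm0 shows up in the output depends on the compiler version and flags.

```c
/* Assumption: 9 axes, matching the nine letters in coord_letter. */
#define EMCMOT_MAX_AXIS 9

static const char coord_letter[] = {'X','Y','Z','A','B','C','U','V','W'};

/* Hypothetical helper: counts how many coordinate letters are repeats. */
static int count_repeats(int *dups, const char *coordinates)
{
    const char *c;
    int i;
    int repeats = 0;

    for (c = coordinates; *c != '\0'; c++) {
        for (i = 0; i < EMCMOT_MAX_AXIS; i++) {
            if (*c == coord_letter[i]) {
                /* dups[i] holds how often this letter was seen before. */
                if (dups[i]++ > 0) {
                    repeats++;
                }
            }
        }
    }
    return repeats;
}

/* Variant 1: array zeroed by an initialiser at declaration.
 * The compiler is free to implement this with wide (SSE) stores. */
int count_dups_decl_init(const char *coordinates)
{
    int dups[EMCMOT_MAX_AXIS] = {0};

    return count_repeats(dups, coordinates);
}

/* Variant 2: array zeroed by an explicit loop. */
int count_dups_loop_init(const char *coordinates)
{
    int dups[EMCMOT_MAX_AXIS];
    int i;

    for (i = 0; i < EMCMOT_MAX_AXIS; i++) {
        dups[i] = 0;
    }
    return count_repeats(dups, coordinates);
}
```

In user space both variants behave identically; the point of the sketch is only to look at the generated assembly and see which variant and which flags produce references to %xmm0.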

Upvotes: 2
