Reputation: 83
I have this code to parse JSON. The structure has a key, a val, and a pointer to the next structure. Because of nesting, the val pointer sometimes points to another jss structure.
The code below
struct jss {
    uint8_t type;
    char *key;
    char *val;
    struct jss *next;
};
void my_f() {
    ...
    struct jss *js = (struct jss *)malloc(sizeof(struct jss));
    ...
    while(js) {
        struct jss *js1 = (struct jss *)js->val;
        ...
    }
}
compiles and runs fine and has this assembly:
struct jss *js = (struct jss *)malloc(sizeof(struct jss));
4ea: bf 20 00 00 00 mov $0x20,%edi
4ef: e8 00 00 00 00 callq 4f4 <Init+0x407>
4f4: 48 89 45 e8 mov %rax,-0x18(%rbp)
...
char *t, *f, *h;
struct jss *js1 = ((struct jss *)(js->val));
522: 48 8b 45 e8 mov -0x18(%rbp),%rax
526: 48 8b 40 10 mov 0x10(%rax),%rax
52a: 48 89 45 b0 mov %rax,-0x50(%rbp)
We see that the value at rbp-0x18, which holds the address of the js structure, is loaded into rax. The next instruction loads the pointer stored at offset 0x10 from that address (js->val) into rax, and the result is stored at rbp-0x50, which holds js1. So far, so good!
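The constants 0x20 and 0x10 in the listing follow from the structure layout. A minimal sketch, assuming the x86-64 System V ABI with 8-byte pointers (other ABIs may give different numbers); print_jss_layout is a hypothetical helper name, not part of the original code:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct jss {
    uint8_t type;       /* 1 byte, typically followed by padding */
    char *key;
    char *val;
    struct jss *next;
};

/* Print the size of struct jss and the offset of each member. */
void print_jss_layout(void) {
    printf("sizeof(struct jss) = 0x%zx\n", sizeof(struct jss));
    printf("offsetof(key)      = 0x%zx\n", offsetof(struct jss, key));
    printf("offsetof(val)      = 0x%zx\n", offsetof(struct jss, val));
    printf("offsetof(next)     = 0x%zx\n", offsetof(struct jss, next));
}
```

Under that assumption the size is 0x20 (the argument passed to malloc) and val sits at offset 0x10 (the displacement in mov 0x10(%rax),%rax).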
But if I change the code to this (js1 is replaced by js):
struct jss {
    uint8_t type;
    char *key;
    char *val;
    struct jss *next;
};
void my_f() {
    ...
    struct jss *js = (struct jss *)malloc(sizeof(struct jss));
    ...
    while(js) {
        char *t, *f, *h;
        struct jss *js = (struct jss *)js->val;
        ...
    }
}
I have this assembly:
struct jss *js = (struct jss *)malloc(sizeof(struct jss));
4ea: bf 20 00 00 00 mov $0x20,%edi
4ef: e8 00 00 00 00 callq 4f4 <Init+0x407>
4f4: 48 89 45 e8 mov %rax,-0x18(%rbp)
...
char *t, *f, *h;
struct jss *js = ((struct jss *)(js->val));
522: 48 8b 45 c8 mov -0x38(%rbp),%rax
526: 48 8b 40 10 mov 0x10(%rax),%rax
52a: 48 89 45 c8 mov %rax,-0x38(%rbp)
This compiles fine but segfaults: instead of loading the address of the outer js structure (rbp-0x18) into rax, it loads the value of the new js pointer that I am declaring (rbp-0x38), so it is no surprise that it segfaults.
The question is: what is illegal about the second version? I know about variable shadowing, and shadowing is indeed my intention. Why does the compiler get confused (I use gcc)?
Upvotes: 0
Views: 254
Reputation: 310990
According to the C Standard (6.2.1 Scopes of identifiers)
7 Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. Each enumeration constant has scope that begins just after the appearance of its defining enumerator in an enumerator list. Any other identifier has scope that begins just after the completion of its declarator.
So within this while statement
while(js) {
    char *t, *f, *h;
    struct jss *js = (struct jss *)js->val;
    ...
}
the declared identifier js refers to itself in its own initializer. Its scope begins just after its declarator, that is, before the initializer is evaluated, so the initializer expression uses the indeterminate value of the newly declared js, which hides the object with the same name declared in the outer scope before the while statement.
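If shadowing is still wanted, one possible workaround is to copy the outer pointer into a differently named variable before the inner js is declared. A minimal sketch: the names nested_of and parent are hypothetical, and it assumes val really holds a pointer to a nested struct jss (or NULL):

```c
#include <stddef.h>
#include <stdint.h>

struct jss {
    uint8_t type;
    char *key;
    char *val;
    struct jss *next;
};

/* Return the nested structure stored in js->val, or NULL if js is NULL. */
struct jss *nested_of(struct jss *js) {
    struct jss *result = NULL;
    if (js) {
        /* The inner js is not declared yet, so this js is the parameter. */
        struct jss *parent = js;
        /* From the end of this declarator onward, js names the new variable. */
        struct jss *js = (struct jss *)parent->val;
        result = js;
    }
    return result;
}
```

Because parent is initialized before the inner declaration, the initializer of the inner js never reads its own indeterminate value.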
Upvotes: 0
Reputation: 50774
Consider this line of your code:
struct jss *js = (struct jss *)js->val;
//          ^                  ^
//          |                  |
//      this js      and this js are the same
You declare js and then you dereference js. The second js is the same variable as the one being declared, and it is of course not initialized, hence the segfault.
If you have
struct jss *js1 = (struct jss *)js->val;
then js refers to the js declared in the outer scope, which is what you want.
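If a second variable is not actually needed, an alternative is to assign to the existing js instead of declaring a new one: in an assignment, the js on the right-hand side is the variable that already exists. A minimal sketch, assuming every non-NULL val points to a nested struct jss (the real parser would presumably check type first); nesting_depth is a hypothetical helper:

```c
#include <stddef.h>
#include <stdint.h>

struct jss {
    uint8_t type;
    char *key;
    char *val;
    struct jss *next;
};

/* Count how many structures are reachable by following val pointers. */
size_t nesting_depth(struct jss *js) {
    size_t depth = 0;
    while (js) {
        depth++;
        /* Assignment, not a declaration: no new js is introduced here. */
        js = (struct jss *)js->val;
    }
    return depth;
}
```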
It is exactly the same as this simpler case:
int foo = 3;
...
{
    int foo = foo;
    ... // you expect foo to be three here, but actually
        // you're just initializing the new foo with its own uninitialized value
}
BTW clang issues a very explicit warning in this situation but gcc does not.
Upvotes: 4