Reputation: 119
An example:
main()
{
int x=123;
}
So x is an int variable, and 2 bytes are allocated for an int (my current assumption; this may differ from machine to machine). Let's say addresses 2000 and 2001 are allocated for x. So how is the value 123 stored at these addresses?
I am a beginner in C, so simple language would be helpful.
I am referring to "Computing Fundamentals & C Programming" by E Balagurusamy [McGraw Hill Education].
Upvotes: 1
Views: 2206
Reputation: 47933
When I think about storage of variables in C, mostly I think about machine-independent boxes. So given
int x = 123;
my first thought is just that it looks like this:
+-----------+
x: | 123 |
+-----------+
And since this is a local variable, I know that this little box is on the stack. (More on this below.)
Now, you asked about the individual bytes, and you imagined a two-byte int starting at address 2000. So here's how that would look, in more detail. To emphasize the contents of the individual bytes, I'm going to switch to hexadecimal:
+------+
x: 2000: | 0x7b |
+------+
2001: | 0x00 |
+------+
You've probably figured this out, but that number 0x7b is part of the hexadecimal representation of 123. The decimal number 123 has the hexadecimal representation 0x007b. This does assume a two-byte integer, although it's worth noting that these days, most machines you're likely to use will use four bytes for an int. I've also shown the situation for "little-endian" storage, which is the convention that most machines use today: the lower-numbered byte is the least-significant byte.
Since 123 is actually just a 7-bit number, it occupies only one of the two bytes, and the other one is zero. To be really sure we understand how the two bytes are laid out, suppose we assign a new value to x:
x = 12345;
The hexadecimal representation of 12345 is 0x3039, so the picture in memory changes to this:
+------+
x: 2000: | 0x39 |
+------+
2001: | 0x30 |
+------+
Finally, for comparison, suppose I said
long int y = 305419896;
And suppose this was on a machine with the opposite, big-endian byte order. Suppose that long ints are four bytes, and suppose the compiler chose to put y at address 3000. That random-looking number 305419896 has the hexadecimal representation 0x12345678, so the situation in memory (again, assuming big-endian byte order) would look like this:
+------+
y: 3000: | 0x12 |
+------+
3001: | 0x34 |
+------+
3002: | 0x56 |
+------+
3003: | 0x78 |
+------+
With big-endian storage, the lowest address contains the most-significant byte, meaning that the number reads left-to-right (here top-to-bottom) in memory. But as I said, most machines you're likely to use today are little-endian.
As I mentioned, since x in your example was a local variable, it will typically be stored on the function call stack. (As a commenter has pointed out, there's no guarantee that a C implementation even has a stack, but most do, so let's go with that assumption.) To see the difference between local and global variables, and also show how a couple of other datatypes are stored, let's look at a slightly bigger example, and let's expand our scope to imagine all of memory. Suppose I write
int g = 456;
char s[] = "hello";
int main() {
int x = 123;
char s2[] = "world";
char *p = s;
}
Typically, global variables are stored in the lower part of memory, while the stack is stored "at the top", and grows down. So we can imagine our computer's memory looking like this. As you'll see, I've reversed the convention I was using in the earlier picture. In this picture, memory addresses run up the page. (Also the memory addresses are in hexadecimal now, too, and I'm dropping the 0x notation. Also I'm going back to little-endian byte order, but retaining the notion of a 16-bit machine. Also I'm going to show the character values as themselves, not in hex. Also I'm showing unknown bytes as ??.)
+------+
ffec: | ?? |
+------+
ffeb: | 00 |
+------+
x: ffea: | 7b |
+------+
ffe9: | 00 |
+------+
ffe8: | 'd' |
+------+
ffe7: | 'l' |
+------+
ffe6: | 'r' |
+------+
ffe5: | 'o' |
+------+
s2: ffe4: | 'w' |
+------+
ffe3: | 02 |
+------+
p: ffe2: | 84 |
+------+
ffe1: | ?? |
+------+
.
.
.
+------+
028a: | ?? |
+------+
0289: | 00 |
+------+
0288: | 'o' |
+------+
0287: | 'l' |
+------+
0286: | 'l' |
+------+
0285: | 'e' |
+------+
s: 0284: | 'h' |
+------+
+------+
0283: | 01 |
+------+
g: 0282: | c8 |
+------+
0281: | ?? |
+------+
This is a huge simplification, of course, but it should give you the basic idea, and I still find it useful to think about it this way, despite all of the esoteric possible complications which the commenters are busy discussing. The next step might be to show how malloc'ed memory looks, and how things look when we've got several function calls active on the stack, but this answer is getting too long, so we'll save that for another day.
Upvotes: 10
Reputation: 1
You should not care about how the data is stored; instead, think about how your program behaves (that is, think more about semantics).
Your code is wrong. You need at least int main(void).
Be aware of possible optimizations by the compiler, and of the "as-if" rule.
Your compiler could:
forget the variable entirely. In your example, x has no observable effect and a compiler could remove int x = 123;
store the variable only in some processor register (so the variable does not sit in memory and has no memory address)
put the variable in some slot of the current stack frame of your call stack (or perhaps elsewhere in memory).
etc... (including some mix of previous cases)
Of course, if you add some observable side-effect (perhaps a printf("x=%d\n", x); as a statement after your int x = 123; definition of an automatic variable), the compiler would handle that variable very differently.
The C standard (read n1570) specifies (in English) not only the syntax, but also the semantics of the C programming language, that is, the observable behavior of programs. An important and tricky notion is that of undefined behavior (UB; the toy program in your question doesn't have any UB, but you'll soon code buggy programs which have some UB, and you need to learn how to avoid UB). Be scared of UB.
Some side-effects are not relevant. For example, heating your processor.
Current implementations (compilers and systems) can help you in understanding the behavior of your program. I recommend compiling with all warnings and debug info (so gcc -Wall -Wextra -g with GCC) and using the gdb debugger.
Upvotes: 6