Joe

Reputation: 41

Using GDB to check the memory layout of data

Assume we have a simple C++ program like the following:

#include <iostream>

int main(){
  int a = 5;
}

Since each memory location is 8 bits and an integer is 32 bits, I assume the memory layout for a would be like this:

0xa      0xb      0xc      0xd 
00000000 00000000 00000000 00000101

where 0xa,0xb,0xc,0xd are sample memory addresses.

1) Is &a pointing to 0xa or 0xd?

2) If I use GDB's x command to examine the actual memory addresses, I get the following:

(gdb) p a
$7 = 5
(gdb) p &a
$8 = (int *) 0x7ffeefbffac8
(gdb) x/bt 0x7ffeefbffac8
0x7ffeefbffac8: 00000101
(gdb) x/bt 0x7ffeefbffac8-1
0x7ffeefbffac7: 00000000
(gdb) x/bt 0x7ffeefbffac8-2
0x7ffeefbffac6: 00000000
(gdb) x/bt 0x7ffeefbffac8-3
0x7ffeefbffac5: 01111111
(gdb) 

Why is 0x7ffeefbffac8-3 populated with 01111111 and not 00000000? Isn't this address equal to 0xa in the sample memory layout above?

Upvotes: 4

Views: 3795

Answers (2)

2785528

Reputation: 5566

2) If I use GDB's x command to examine the actual memory addresses, I get the following:

On most desktop systems, and Linux in particular, the address GDB shows is a virtual address, not a 'real' (physical) one.

In embedded tool suites (such as VxWorks), even with virtual memory, the debugger can show hardware addresses and values.

Note: I have not yet used any form of Linux on a system where the actual hardware addresses are accessible, but I have used g++ and gdb on embedded software.


1) Is &a pointing to 0xa or 0xd?

A C++ code snippet can show both int and byte addresses and values, in hex or dec.

     int a = 0x0d0c0b0a;
     //  msB---^^    ^^---lsB

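     // unsigned char avoids sign extension when a byte value is >= 0x80
     unsigned char* a0 = reinterpret_cast<unsigned char*>(&a);
     unsigned char* a1 = a0+1;
     unsigned char* a2 = a0+2;
     unsigned char* a3 = a0+3;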

     cout  //              Note: vvvvvvvvvvvvv---improves readability 
        << "\n  value of a: " << sop.digiComma(to_string(a))
        << "\n  sizeof(a):  " << sizeof(a) << " bytes   "
        << "\n  address:    " << &a << '\n'
        << "\n  hex value:  " << "0x" << hex << setfill('0') << setw(8) << a << hex
        //
        << "\n              " <<                                   "   | | | |"
        << "\n         a0:  " << setw(2) << static_cast<int>(*a0) << " | | |-^ lsB  " << static_cast<void*>(a0)
        << "\n         a1:  " << setw(2) << static_cast<int>(*a1) << " | |-^        " << static_cast<void*>(a1)
        << "\n         a2:  " << setw(2) << static_cast<int>(*a2) << " |-^          " << static_cast<void*>(a2)
        << "\n         a3:  " << setw(2) << static_cast<int>(*a3) << "-^       msB  " << static_cast<void*>(a3)
        << endl;

Typical Output: (location of a can change)

  value of a: 218,893,066
  sizeof(a):  4 bytes   
  address:    0x7ffee713c1dc

  hex value:  0x0d0c0b0a
                 | | | |
         a0:  0a | | |-^ lsB  0x7ffee713c1dc
         a1:  0b | |-^        0x7ffee713c1dd
         a2:  0c |-^          0x7ffee713c1de
         a3:  0d-^       msB  0x7ffee713c1df

Why is 0x7ffeefbffac8-3 populated with 01111111 and not 00000000? Isn't this address equal to 0xa in the sample memory layout above?

Another answer (referring to the -3) says, "you are going in the wrong direction", and I agree. To me, this is simply your misunderstanding of how an object is 'laid out' in memory.

And this illustrates an issue with all debuggers: the successful user must know how the compiler did things, that is, how it 'laid out' simple objects in memory. The code snippet I have written shows, using simple C++ code, one way of getting the compiler to illustrate its choices for the layout.

Summary:

You can easily add diagnostic routines that show memory layout and content for inspection, each using the comfortable features of C++ (or C-style if you must).

You can easily get the debugger to report the current address of an object.

Thus, you might consider combining these two ideas:

a) I have often created illustrative code snippets, similar to the one above, to show in simple text my compiler's memory layout of objects I want to confirm or review. Note that changes to compiler options can change layout choices.

b) With the above, I also create a short-named access function, to be invoked on the debugger command line. The access function invokes the illustration code.

c) There can be challenges on how to get the function to invoke the illustrative code, but software is very flexible, and I have had no problems.

d) Sometimes, I found it easier to pass the object address into the function (as part of the command line). Other times, the single address was implied.

e) Typically, the access function is the only code that invokes the illustrative code, and thus both are kept out of operational code, i.e. they have no impact on normal operation (and are thus easily removed).

Upvotes: 0

Employed Russian

Reputation: 213764

On a little-endian machine, &a points to the least significant byte of a, which is stored at the lowest address. That is, if &a == 0x7ffeefbffac8, then a resides in bytes

0x7ffeefbffac8:  00000101   << least significant byte
0x7ffeefbffac9:  00000000
0x7ffeefbffaca:  00000000
0x7ffeefbffacb:  00000000   << most significant byte.

This is best observed by assigning e.g. 0x0a090705 to a, and then:

Temporary breakpoint 1, main (argc=3, argv=0x7fffffffdc68) at t.c:2
2     int a = 0x0a090705;
(gdb) n
3     return 0;
(gdb) p &a
$1 = (int *) 0x7fffffffdb7c

Examine 4 bytes starting from &a:

(gdb) x/4bt 0x7fffffffdb7c
0x7fffffffdb7c: 00000101    00000111    00001001    00001010

Or, equivalently, do so one byte at a time:

(gdb) x/bt 0x7fffffffdb7c
0x7fffffffdb7c: 00000101
(gdb) x/bt 0x7fffffffdb7c+1
0x7fffffffdb7d: 00000111
(gdb) x/bt 0x7fffffffdb7c+2
0x7fffffffdb7e: 00001001
(gdb) x/bt 0x7fffffffdb7c+3
0x7fffffffdb7f: 00001010

Why is 0x7ffeefbffac8-3 populated with 01111111 and not 00000000?

Because you are going in the wrong direction: a occupies the bytes from &a upward, so the byte at 0x7ffeefbffac8-3 isn't part of a at all (it's part of something else, or possibly uninitialized random garbage).

Upvotes: 4
