Jignesh Parmar

Reputation: 13

Why is a string constant stored in .rodata while its address lies in the code section?

Here is my simple code:

#include <stdio.h>

int main()
{
    const char *str="jigneshparmar";
    printf("address of str data:%p , address of str variable:%p\n",(void*)str,(void*)&str );
    getchar();

    return 0;
}

Here the string constant "jigneshparmar" is stored in the read-only data section.

The output of the size command is:

gcc datasec.c -o datasec
size -A datasec

datasec  :
section              size    addr
.interp                28     792
.note.gnu.property     32     824
.note.gnu.build-id     36     856
.note.ABI-tag          32     892
.gnu.hash              36     928
.dynsym               168     968
.dynstr               133    1136
.gnu.version           14    1270
.gnu.version_r         32    1288
.rela.dyn             192    1320
.rela.plt              24    1512
.init                  27    4096
.plt                   32    4128
.plt.got               16    4160
.plt.sec               16    4176
.text                 405    4192
.fini                  13    4600
.rodata                18    8192
.eh_frame_hdr          68    8212
.eh_frame             264    8280
.init_array             8   15800
.fini_array             8   15808
.dynamic              496   15816
.got                   72   16312
.data                  16   16384
.bss                    8   16400
.comment               42       0
Total                2236

The .rodata size increases when I make the string constant longer.

But the address of str that I printed appears to belong to the code section:

./datasec 
address of str data:0x55f5301f3004 ,address of str variable:0x7ffd0a2b1940

The address 0x55f5301f3004 lies in the code section.

cat /proc/4018/maps
555da289d000-555da289e000 r--p 00000000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da289e000-555da289f000 r-xp 00001000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da289f000-555da28a0000 r--p 00002000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da28a0000-555da28a1000 r--p 00002000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da28a1000-555da28a2000 rw-p 00003000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da416c000-555da418d000 rw-p 00000000 00:00 0                          [heap]
7f5485c38000-7f5485c5d000 r--p 00000000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485c5d000-7f5485dd5000 r-xp 00025000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485dd5000-7f5485e1f000 r--p 0019d000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485e1f000-7f5485e20000 ---p 001e7000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485e20000-7f5485e23000 r--p 001e7000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485e23000-7f5485e26000 rw-p 001ea000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485e26000-7f5485e2c000 rw-p 00000000 00:00 0 
7f5485e41000-7f5485e42000 r--p 00000000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e42000-7f5485e65000 r-xp 00001000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e65000-7f5485e6d000 r--p 00024000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e6e000-7f5485e6f000 r--p 0002c000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e6f000-7f5485e70000 rw-p 0002d000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e70000-7f5485e71000 rw-p 00000000 00:00 0 
7ffda6e9c000-7ffda6ebd000 rw-p 00000000 00:00 0                          [stack]
7ffda6ed9000-7ffda6edd000 r--p 00000000 00:00 0                          [vvar]
7ffda6edd000-7ffda6edf000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]


How is this possible?

Thanks in advance.

Upvotes: 0

Views: 1782

Answers (3)

Dražen Grašovec

Reputation: 802

You probably have the wrong addresses somehow. I tried your code and found that the string constant's address (the value of str) is in the .rodata section, and the address of the local variable str is on the stack.

This is what the assembly code leads us to expect. The address of the string constant in the .rodata section is computed as a relative address (to enable position-independent code): the address of the next instruction plus the offset 0xe5d, stored in the %rax register. The next instruction stores this address, 0x555555556008, on the stack:

   0x00005555555551a4 <+27>:    lea    0xe5d(%rip),%rax        #0x555555556008
   0x00005555555551ab <+34>:    mov    %rax,-0x10(%rbp)

We now have str (the address of the string literal in .rodata) placed on the stack. Check with gdb that 0x555555556008, which is the value of str, is on the stack:

(gdb) x/32gx $sp
0x7fffffffdd00: 0x0000000000000000  0x0000000100000000
0x7fffffffdd10: 0x0000000000000000  0x7af4b6fe7bb81800
0x7fffffffdd20: 0x0000000400000003  0x0000555555556008
0x7fffffffdd30: 0x00007fffffffdde0  0x0000555555557db0

We check that this is indeed the address of the string:

(gdb) x/s 0x0000555555556008
0x555555556008: "jigneshparmar"

You can see in this screenshot that the address of str is on the stack:

[screenshot]

So this is the real output of the program. We expect that the addresses will not be the same as when we run the program under gdb:

drazen@HP-ProBook-640G1:~/proba$ ./main
address of str data:0x564ff8d34008 , address of str variable:0x7ffffac9fc80

And the process map is:

[screenshot]

So we can confirm that str points into .rodata and &str points into the stack.

Upvotes: 0

Glärbo

Reputation: 145

The Linux kernel "loads" ELF executables by mapping them into memory. This occurs at page granularity, and the offset field (the field after permissions) in /proc/PID/maps describes the offset into the file for each mapped region.

String literals are stored in ELF files in the .rodata section, and executable code in the .text section.

The Linux kernel does not use the section headers to determine what to map. The ELF file format has a set of program headers that you can see with e.g. readelf -l binary. The relevant ones here are the LOAD ones; these specify what the Linux kernel maps into memory.

For example, here are the two LOAD program headers from GNU Coreutils 8.28 cat on x86-64 (readelf -l /bin/cat):

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000079d0 0x00000000000079d0  R E    0x200000
  LOAD           0x0000000000007a70 0x0000000000207a70 0x0000000000207a70
                 0x0000000000000650 0x00000000000007f0  RW     0x200000

When the Linux kernel executes an ELF file, it maps the LOAD program headers into memory. See how there is just read-and-execute for file offsets 0x0-0x79d0, and read-write for file offsets 0x7a70-0x80c0 (memory addresses 0x207a70-0x208260), and no "read-only" mapping at all?

Using objdump -d -s /bin/cat, we see (relevant snippets only):

0000000000001ad0 <.text>:
    1ad0:       53                      push   %rbx
    1ad1:       48 8d 35 6c 41 00 00    lea    0x416c(%rip),%rsi        # 5c44 <_IO_stdin_used@@Base+0x4>
    1ad8:       ba 05 00 00 00          mov    $0x5,%edx
    1add:       31 ff                   xor    %edi,%edi
    1adf:       e8 3c fd ff ff          callq  1820 <dcgettext@plt>
    1ae4:       48 89 c3                mov    %rax,%rbx
    1ae7:       e8 a4 fc ff ff          callq  1790 <__errno_location@plt>
[snipped lots of disassembly]
    5c16:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    5c1d:       00 00 00 
    5c20:       31 d2                   xor    %edx,%edx
    5c22:       31 f6                   xor    %esi,%esi
    5c24:       e9 17 be ff ff          jmpq   1a40 <__cxa_atexit@plt>

and

Contents of section .rodata:
 5c40 01000200 77726974 65206572 726f7200  ....write error.
 5c50 63617400 5b007465 73742069 6e766f63  cat.[.test invoc
 5c60 6174696f 6e004d75 6c74692d 63616c6c  ation.Multi-call
 5c70 20696e76 6f636174 696f6e00 73686132   invocation.sha2
 5c80 32347375 6d007368 61322075 74696c69  24sum.sha2 utili
 5c90 74696573 00736861 32353673 756d0073  ties.sha256sum.s
 5ca0 68613338 3473756d 00736861 35313273  ha384sum.sha512s
 5cb0 756d000a 2573206f 6e6c696e 65206865  um..%s online he
 5cc0 6c703a20 3c25733e 0a00474e 5520636f  lp: <%s>..GNU co
 5cd0 72657574 696c7300 656e5f00 2f757372  reutils.en_./usr
 5ce0 2f736861 72652f6c 6f63616c 65005269  /share/locale.Ri
 5cf0 63686172 64204d2e 20537461 6c6c6d61  chard M. Stallma
 5d00 6e00546f 72626a6f 726e2047 72616e6c  n.Torbjorn Granl
 5d10 756e6400 62656e73 74757641 45540073  und.benstuvAET.s
 5d20 74616e64 61726420 6f757470 75740025  tandard output.%

and

Contents of section .init_array:
 207a70 10280000 00000000                    .(......        
Contents of section .fini_array:
 207a78 d0270000 00000000                    .'......        
Contents of section .data.rel.ro:
 207a80 7a5d0000 00000000 00000000 00000000  z]..............
 207a90 00000000 00000000 62000000 00000000  ........b.......
 207aa0 8a5d0000 00000000 00000000 00000000  .]..............
[...]
 207c00 e75c0000 00000000 27630000 00000000  .\......'c......
 207c10 00000000 00000000                    ........        
Contents of section .dynamic:
 207c18 01000000 00000000 01000000 00000000  ................
 207c28 0c000000 00000000 20170000 00000000  ........ .......
 207c38 0d000000 00000000 2c5c0000 00000000  ........,\......
[...]
 207de8 00000000 00000000 00000000 00000000  ................
 207df8 00000000 00000000 00000000 00000000  ................
Contents of section .got:
 207e08 187c2000 00000000 00000000 00000000  .| .............
 207e18 00000000 00000000 56170000 00000000  ........V.......
[...]
 207fe8 00000000 00000000 00000000 00000000  ................
 207ff8 00000000 00000000                    ........        
Contents of section .data:
 208000 00000000 00000000 08802000 00000000  .......... .....
 208010 20202020 20202020 20202020 20202020                  
 208020 20300900 00000000 21802000 00000000   0......!. .....
 208030 1c802000 00000000 7b620000 00000000  .. .....{b......

See how both .text and .rodata belong to the same LOAD program header? That's why they're mapped the same way. The .data section is a separate LOAD program header, and is therefore mapped separately, with different permissions.

You see, the linker script used on x86-64 on most Linux systems (including Ubuntu 18.04.5, where the above is from) combines the .text and .rodata sections into a single LOAD program header; and since this is what controls how ELF executables are loaded (mapped) into memory by the Linux kernel, they get mapped into the same memory region, with the same permissions (r-xp).


Consider the following example program, maps.c:

// SPDX-License-Identifier: CC0-1.0
#define  _POSIX_C_SOURCE  200809L
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

struct map_entry {
    struct map_entry   *next;
    uintptr_t           addr;
    uintptr_t           ends;
    char                line[];
};

struct map_entry *map = NULL;

static struct map_entry *map_find(const void *const ptr)
{
    const uintptr_t   addr = (uintptr_t)ptr;
    struct map_entry *curr = map;

    while (curr)
        if (addr >= curr->addr && addr <= curr->ends)
            return curr;
        else
            curr = curr->next;

    return NULL;
}

static void map_init(void)
{
    char   *line = NULL;
    size_t  size = 0;
    ssize_t len;
    FILE   *in;

    /* Already mapped? */
    if (map)
        return;

    struct map_entry *root = NULL;

    in = fopen("/proc/self/maps", "r");
    if (!in) {
        fprintf(stderr, "Cannot read /proc/self/maps: %s.\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    while (1) {
        len = getline(&line, &size, in);
        if (len < 0)
            break;

        /* Remove newline at end. */
        while (len > 0 && line[len-1] == '\n')
            line[--len] = '\0';

        char               *ptr = line;
        char               *end = line;
        unsigned long long  val;

        /* Parse start address. */
        errno = 0;
        val = strtoull(ptr, &end, 16);
        if (errno) {
            fprintf(stderr, "/proc/self/maps: %s: %s.\n", line, strerror(errno));
            exit(EXIT_FAILURE);
        }
        if (end == ptr || *end != '-') {
            fprintf(stderr, "/proc/self/maps: %s: Error parsing line.\n", line);
            exit(EXIT_FAILURE);
        }
        ptr = ++end;
        const uintptr_t  addr = val;

        /* Parse end address (actually one plus end address). */
        errno = 0;
        val = strtoull(ptr, &end, 16);
        if (errno) {
            fprintf(stderr, "/proc/self/maps: %s: %s.\n", line, strerror(errno));
            exit(EXIT_FAILURE);
        }
        if (end == ptr || *end != ' ') {
            fprintf(stderr, "/proc/self/maps: %s: Error parsing line.\n", line);
            exit(EXIT_FAILURE);
        }
        const uintptr_t  ends = val;

        /* Allocate a new map entry for this one. */
        struct map_entry *ent = malloc(sizeof (struct map_entry) + len + 1);
        if (!ent) {
            fprintf(stderr, "/proc/self/maps: Out of memory.\n");
            exit(EXIT_FAILURE);
        }

        /* Copy line, including the end-of-string '\0'. */
        memcpy(ent->line, line, len + 1);

        ent->addr = addr;
        ent->ends = ends - 1;

        /* Prepend to root list. */
        ent->next = root;
        root      = ent;
    }

    /* Discard line buffer, since it is no longer needed. */
    free(line);  /* Note: free(NULL) is safe, and does nothing. */

    if (ferror(in) || !feof(in)) {
        fprintf(stderr, "/proc/self/maps: Read error.\n");
        exit(EXIT_FAILURE);
    }
    if (fclose(in)) {
        fprintf(stderr, "/proc/self/maps: Error closing file.\n");
        exit(EXIT_FAILURE);
    }

    /* Reverse the list.  Since we prepended each entry, it is in reverse order. */
    while (root) {
        struct map_entry *curr = root;

        root = root->next;

        /* Prepend to map list. */
        curr->next = map;
        map        = curr;
    }
}

const char *const literal1 = "String literal 1";
const char        array1[] = "String array 1";

int main(void)
{
    const char *const literal2 = "String literal 2";
    const char        array2[] = "String array 2";

    struct map_entry *ent;
    map_init();

    ent = map_find(&literal1);
    if (ent)
        printf("Variable 'literal1' has address %p:\n\t%s\n", (void *)&literal1, ent->line);

    ent = map_find(literal1);
    if (ent)
        printf("Variable 'literal1' points to address %p:\n\t%s\n", (void *)literal1, ent->line);

    ent = map_find(&array1);
    if (ent)
        printf("Variable 'array1' has address %p:\n\t%s\n", (void *)&array1, ent->line);

    ent = map_find(array1);
    if (ent)
        printf("Variable 'array1' points to address %p:\n\t%s\n", (void *)array1, ent->line);

    ent = map_find(&literal2);
    if (ent)
        printf("Variable 'literal2' has address %p:\n\t%s\n", (void *)&literal2, ent->line);

    ent = map_find(literal2);
    if (ent)
        printf("Variable 'literal2' points to address %p:\n\t%s\n", (void *)literal2, ent->line);

    ent = map_find(&array2);
    if (ent)
        printf("Variable 'array2' has address %p:\n\t%s\n", (void *)&array2, ent->line);

    ent = map_find(array2);
    if (ent)
        printf("Variable 'array2' points to address %p:\n\t%s\n", (void *)array2, ent->line);

    return EXIT_SUCCESS;
}

Compile it using gcc -Wall -Wextra -O2 maps.c -o maps, and run ./maps. Its output is

Variable 'literal1' has address 0x5651567add48:
    5651567ad000-5651567ae000 r--p 00001000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'literal1' points to address 0x5651565ad21f:
    5651565ac000-5651565ae000 r-xp 00000000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'array1' has address 0x5651565ad448:
    5651565ac000-5651565ae000 r-xp 00000000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'array1' points to address 0x5651565ad448:
    5651565ac000-5651565ae000 r-xp 00000000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'literal2' has address 0x7fff34c15dd8:
    7fff34bf7000-7fff34c18000 rw-p 00000000 00:00 0                          [stack]
Variable 'literal2' points to address 0x5651565ad1c4:
    5651565ac000-5651565ae000 r-xp 00000000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'array2' has address 0x7fff34c15df9:
    7fff34bf7000-7fff34c18000 rw-p 00000000 00:00 0                          [stack]
Variable 'array2' points to address 0x7fff34c15df9:
    7fff34bf7000-7fff34c18000 rw-p 00000000 00:00 0                          [stack]

which shows how the variable literal1 (const char *const literal1 = "...";) belongs to a memory region that is mapped r--p, but points to memory that is mapped r-xp.

("Hey, I thought you said the kernel only maps the LOAD program headers?" Yes; that particular mapping was not created by the kernel, but by the dynamic linker. I did not say that only the kernel maps ELF executables into memory; I explained how the kernel maps the minimum necessary LOAD program headers into memory and hands off the execution to that code. For dynamically linked C programs using standard C libraries, that code maps the rest of the program sections and any prerequisite dynamic libraries.)

Note, however, that the immutable char array array1 is completely in the r-xp mapped memory region, as is the string literal that literal2 refers to.

Because array2 and literal2 are declared inside the main() function, the variables themselves reside in the [stack] memory region; the string that literal2 points to is still in the r-xp region.

Upvotes: 2

John Bode

Reputation: 123458

One of two things is happening:

  1. You're not printing the address of the string literal correctly;
  2. You're not looking at your mapping correctly.

As I mentioned in my comment, %p expects its corresponding argument to have type void *, and a call to printf is one of the few (perhaps the only) places in C where you have to explicitly cast a pointer to void *, so it's possible the address for the string literal isn't being formatted correctly.

Otherwise, you're not looking at your mapping correctly.

I took your code and built it on my system. When I run it I get the output

address of str data:0x400580 , address of str variable:0x7fff22b9e938

You can use the objdump utility to look at the sections of your executable file - to look at the contents of .rodata, do the following:

objdump -s -j .rodata file

When I do that on the code I built, I get

Contents of section .rodata:
 400570 01000200 00000000 00000000 00000000  ................
 400580 6a69676e 65736870 61726d61 72000000  jigneshparmar...
 400590 61646472 65737320 6f662073 74722064  address of str d
 4005a0 6174613a 2570202c 20616464 72657373  ata:%p , address
 4005b0 206f6620 73747220 76617269 61626c65   of str variable
 4005c0 3a25700a 00                          :%p..

which matches the output from the program.

Upvotes: 1
