Denis Beurive

Reputation: 435

Where are "str" values allocated? Is it in the heap?

I am new to Rust and I try to figure out where things are allocated.

I use Rust 1.64.0 on Ubuntu 22.04.1 (jammy):

$ rustc --version
rustc 1.64.0 (a55dd71d5 2022-09-19)
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:    22.04
Codename:   jammy

I am trying to explore the executable produced by compiling this very basic Rust program:

fn tester() {
    let borrowed_string: &str = "world";
    println!("{}", borrowed_string);
}

fn main() { tester(); }

Compilation (OK, this is overkill... but I want to be sure that I have all I need for using GDB):

$ cargo build --config "profile.dev.debug=true" --config "profile.dev.opt-level=0" --profile=dev
  Compiling variables v0.1.0 (/home/denis/Documents/github/rust-playground/variables)
   Finished dev [unoptimized + debuginfo] target(s) in 0.71s

Run rust-gdb, set the breakpoint and run the program:

$ rust-gdb target/debug/variables 
GNU gdb (Ubuntu 12.0.90-0ubuntu1) 12.0.90
...
Reading symbols from target/debug/variables...
(gdb) b 3
Breakpoint 1 at 0x7959: file src/main.rs, line 3.
(gdb) r
Starting program: /home/denis/Documents/github/rust-playground/variables/target/debug/variables 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, variables::tester () at src/main.rs:3
3       println!("{}", borrowed_string);
(gdb) 
s

Print the memory mapping:

(gdb) info proc map
process 6985
Mapped address spaces:

          Start Addr           End Addr       Size     Offset  Perms  objfile
      0x555555554000     0x55555555a000     0x6000        0x0  r--p   /home/denis/Documents/github/rust-playground/variables/target/debug/variables
      0x55555555a000     0x555555591000    0x37000     0x6000  r-xp   /home/denis/Documents/github/rust-playground/variables/target/debug/variables
      0x555555591000     0x55555559f000     0xe000    0x3d000  r--p   /home/denis/Documents/github/rust-playground/variables/target/debug/variables
      0x5555555a0000     0x5555555a3000     0x3000    0x4b000  r--p   /home/denis/Documents/github/rust-playground/variables/target/debug/variables
      0x5555555a3000     0x5555555a4000     0x1000    0x4e000  rw-p   /home/denis/Documents/github/rust-playground/variables/target/debug/variables
      0x5555555a4000     0x5555555c5000    0x21000        0x0  rw-p   [heap]
      0x7ffff7d5c000     0x7ffff7d5f000     0x3000        0x0  rw-p   
      0x7ffff7d5f000     0x7ffff7d87000    0x28000        0x0  r--p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7d87000     0x7ffff7f1c000   0x195000    0x28000  r-xp   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f1c000     0x7ffff7f74000    0x58000   0x1bd000  r--p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f74000     0x7ffff7f78000     0x4000   0x214000  r--p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f78000     0x7ffff7f7a000     0x2000   0x218000  rw-p   /usr/lib/x86_64-linux-gnu/libc.so.6
      0x7ffff7f7a000     0x7ffff7f87000     0xd000        0x0  rw-p   
      0x7ffff7f87000     0x7ffff7f8a000     0x3000        0x0  r--p   /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
      0x7ffff7f8a000     0x7ffff7fa1000    0x17000     0x3000  r-xp   /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
      0x7ffff7fa1000     0x7ffff7fa5000     0x4000    0x1a000  r--p   /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
      0x7ffff7fa5000     0x7ffff7fa6000     0x1000    0x1d000  r--p   /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
      0x7ffff7fa6000     0x7ffff7fa7000     0x1000    0x1e000  rw-p   /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
      0x7ffff7fb8000     0x7ffff7fb9000     0x1000        0x0  ---p   
      0x7ffff7fb9000     0x7ffff7fbd000     0x4000        0x0  rw-p   
      0x7ffff7fbd000     0x7ffff7fc1000     0x4000        0x0  r--p   [vvar]
      0x7ffff7fc1000     0x7ffff7fc3000     0x2000        0x0  r-xp   [vdso]
      0x7ffff7fc3000     0x7ffff7fc5000     0x2000        0x0  r--p   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7fc5000     0x7ffff7fef000    0x2a000     0x2000  r-xp   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7fef000     0x7ffff7ffa000     0xb000    0x2c000  r--p   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7ffb000     0x7ffff7ffd000     0x2000    0x37000  r--p   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7ffd000     0x7ffff7fff000     0x2000    0x39000  rw-p   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffffffde000     0x7ffffffff000    0x21000        0x0  rw-p   [stack]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0  --xp   [vsyscall]
(gdb) 

We have:

HEAP:  [0x5555555A4000, 0x5555555C4FFF]
STACK: [0x7FFFFFFDE000, 0x7FFFFFFFEFFF]

I assume that the data "world" is allocated in the heap. Thus, let's find out... but look at what I obtain!!!

(gdb) find 0x5555555A4000, 0x5555555C4FFF, "world"
0x5555555a4ba0
1 pattern found.
(gdb) find 0x5555555A4000, 0x5555555C4FFF, "world"
0x5555555a4be0
1 pattern found.
(gdb) find 0x5555555A4000, 0x5555555C4FFF, "world"
0x5555555a4c20
1 pattern found.
(gdb) find 0x5555555A4000, 0x5555555C4FFF, "world"
0x5555555a4c60
1 pattern found.
(gdb) find 0x5555555A4000, 0x5555555C4FFF, "world"
0x5555555a4ca0
1 pattern found.
(gdb) 

Please note that:

0x5555555A4BA0 -> 0x5555555A4BE0 => 64
0x5555555A4BE0 -> 0x5555555A4C20 => 64
0x5555555A4C20 -> 0x5555555A4C60 => 64

Question: why does the position of the data change from one scan to the next?

(This question may not be related specifically to Rust.)

(gdb) p borrowed_string
$1 = "world"
(gdb) ptype borrowed_string
type = struct &str {
  data_ptr: *mut u8,
  length: usize,
}
(gdb) p borrowed_string.data_ptr
$2 = (*mut u8) 0x555555591000
(gdb) x/5c 0x555555591000
0x555555591000: 119 'w' 111 'o' 114 'r' 108 'l' 100 'd'

We have:

Question: what is this "uncharted" memory territory between 0x5555555C5000 (included) and 0x7FFFF7D5F000 (excluded)?

EDIT:

As mentioned in the comments below, I made a mistake. The data "world" is "before" the heap... The comments make sense. Thank you.

Upvotes: 4

Views: 278

Answers (1)

Masklinn

Reputation: 42217

Question: what is this "uncharted" memory territory between 0x5555555C5000 (included) and 0x7FFFF7D5F000 (excluded)?

It's nothing. It's memory which is not mapped.

Where are "str" values allocated? Is it in the heap?

Your memory map answers the question:

0x555555591000     0x55555559f000     0xe000    0x3d000  r--p   /home/denis/Documents/github/rust-playground/variables/target/debug/variables

This is one of the areas where the binary itself is mapped, most likely the read-only data (".rodata") of the executable. String literals are generally "allocated" in the binary itself, then a static reference to them is loaded.
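As a minimal sketch of what "a static reference" means here: a string literal has type &'static str, and the reference itself is just a pointer plus a length, matching the data_ptr and length fields gdb printed. The names GREETING and parts are made up for this illustration, and the address printed will differ from run to run and from system to system.

```rust
use std::mem::size_of;

// A literal has type &'static str: its bytes are part of the executable image.
// GREETING is a hypothetical name used only for this sketch.
static GREETING: &str = "world";

/// Splits a string slice into the (pointer, length) pair it is made of.
fn parts(s: &str) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    let (ptr, len) = parts(GREETING);
    // The pointer should fall inside one of the mappings of the binary,
    // not inside [heap] or [stack].
    println!("data_ptr = {:p}, length = {}", ptr, len);
    // The reference is two machine words wide: pointer + length.
    println!("size_of::<&str>() = {}", size_of::<&str>());
}
```

Comparing the printed pointer against /proc/self/maps (or gdb's info proc map) is one way to check which mapping the bytes belong to.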

More generally, str values are not allocated: a &str is a reference to data living somewhere else, which could be on the heap (if the reference was obtained from a String, a Box<str>, or another owning pointer), in the binary (literals, static includes), or even on the stack (e.g. you can create an array on the stack, pass it through std::str::from_utf8, and get a &str out of it).
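The three cases can be sketched as follows. This is only an illustration under the assumption of a typical desktop target; the helper names from_binary, from_heap and from_stack are invented for the example, and the printed addresses are not predictable.

```rust
use std::str;

/// Returns a string slice whose bytes live in the binary itself.
fn from_binary() -> &'static str {
    "world"
}

/// Returns an owned String; its bytes live in a heap allocation.
fn from_heap() -> String {
    String::from("world")
}

/// Views a byte buffer (here, one living on the caller's stack) as a &str.
fn from_stack(buf: &[u8; 5]) -> &str {
    str::from_utf8(buf).expect("buffer is valid UTF-8")
}

fn main() {
    let literal: &str = from_binary();

    let owned: String = from_heap();
    let heap_slice: &str = owned.as_str();

    let stack_buf: [u8; 5] = *b"world";
    let stack_slice: &str = from_stack(&stack_buf);

    // Same contents, same type, three different storage locations.
    println!("literal: {:p}", literal.as_ptr());
    println!("heap:    {:p}", heap_slice.as_ptr());
    println!("stack:   {:p}", stack_slice.as_ptr());
}
```

All three values have the same type, &str; only the pointer tells you where the bytes actually are.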

We have: HEAP: [0x5555555A4000, 0x5555555C4FFF]

That's... not really helpful or useful or true, frankly: modern systems have largely moved away from a single linear heap and towards mmap-based heaps. While Linux software still makes some limited use of (s)brk, the BSDs describe these calls as "historical curiosities". If you assume the heap is contained within these bounds, you're going to face lots of surprises.
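A rough way to observe this, assuming a Linux system where Rust's default allocator forwards to glibc malloc: glibc typically serves requests above its mmap threshold (128 KiB by default) with a fresh anonymous mapping rather than from the brk area, so a large buffer will often sit far away from the [heap] range. The helper name allocate is made up for this sketch, and the behaviour depends on the allocator and its tuning, so no particular output is promised.

```rust
/// Allocates `len` zeroed bytes on the heap and returns the buffer.
fn allocate(len: usize) -> Vec<u8> {
    vec![0u8; len]
}

fn main() {
    let small = allocate(64);               // small request
    let large = allocate(16 * 1024 * 1024); // 16 MiB request

    // On glibc the two addresses are often in very different regions:
    // the small one near [heap], the large one in an anonymous mapping.
    println!("small: {:p}", small.as_ptr());
    println!("large: {:p}", large.as_ptr());
}
```

Both buffers are "heap allocations" as far as Rust is concerned, even if only one of them falls inside the range labelled [heap].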

Question: why does the position of the data change from one scan to the next?

Because the debugger is allocating stuff in the context of the program, I assume? So what you are finding might be the data gdb itself is storing, e.g. a fresh copy of the "world" search pattern on each call to find.

Upvotes: 7
