dcode

Reputation: 614

WebAssembly stack / stack pointer initialization and memory layout

I am currently toying around with WebAssembly compiled through LLVM but I haven't yet managed to understand the stack / stack pointer and how it relates to the overall memory layout.

I learned that I have to use s2wasm with --allocate-stack N to make my program run, and I figured out that this basically adds (data (i32.const 4) "8\00\00\00") (with N=8) to my generated wast. The string is obviously the binary encoding of a pointer into linear memory, and the i32 constant is the offset in linear memory at which that pointer is stored.

What I do not quite understand, though, is why the pointer's value is 56 (again with N=8) and how this value relates to the exact region of the stack in memory, which, in my case, currently looks like:

0-3: zero
4-7: 56
8-35: other data sections
36-55: zeroes
56-59: zero
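
One detail that may help with the 56: in the text format a data string is a sequence of raw bytes, so the leading 8 in "8\00\00\00" is the ASCII character '8' (byte 0x38), not the number 8. A minimal Python sketch, assuming the four bytes are read back as a little-endian 32-bit integer the way an i32.load at address 4 would read them:

```python
import struct

# Bytes of the data segment "8\00\00\00" as written in the wast text.
# The leading "8" is a literal ASCII character, not the number 8.
segment = b"8\x00\x00\x00"

# WebAssembly linear memory is little-endian, so a 32-bit load of these
# four bytes yields a single unsigned integer.
(stack_pointer,) = struct.unpack("<I", segment)

print(stack_pointer)  # 56
```

This matches the value seen at bytes 4-7 in the layout above; that the printed character happens to be the same digit as N=8 looks like a coincidence.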

I know that I am probably more a candidate for "just use emscripten", but I'd also like to understand this.

Upvotes: 2

Views: 3098

Answers (1)

JF Bastien

Reputation: 6843

I touched on this in another question. Values that would live on the stack in C++ can actually end up in one of 3 places in WebAssembly:

  1. On the execution stack (each opcode pushes and pops values, so add pops 2 and then pushes 1).
  2. As a local.
  3. In the Memory.
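
To make the first case concrete, here is a toy model of an execution stack in Python. It is only a sketch: the opcode names and the run function are made up for illustration, and it does not model real WebAssembly semantics such as 32-bit wrapping or validation.

```python
def run(program):
    """Evaluate a tiny stack program and return the final operand stack."""
    stack = []
    for op, *args in program:
        if op == "const":
            # Pushes one value.
            stack.append(args[0])
        elif op == "add":
            # Pops two values, then pushes one.
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


result = run([("const", 2), ("const", 3), ("add",)])
print(result)  # [5]
```

None of the intermediate values here has an address, which is why a value whose address is taken cannot stay in this form.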

Notice that you can't take the address of a value in 1. or 2. I'd expect a code generator to go with 3. only when the address of a value is actually needed. How this is done isn't dictated by WebAssembly; it's up to whatever ABI you choose. What Emscripten and other tools do is store the stack pointer at address 4, and then very early in the program they choose a spot where the stack should go. It doesn't always have to be 4, but it's simpler to stick to that ABI, especially if dynamic linking is involved.
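
A rough Python sketch of that convention, using a bytearray as a stand-in for linear memory. The helper names (load_i32, store_i32, alloca, free_frame) are hypothetical, the initial value 56 is taken from the question, and the sketch assumes the stack grows downward, which is a common choice but not something WebAssembly requires.

```python
import struct

STACK_POINTER_ADDR = 4  # ABI convention, not mandated by WebAssembly

memory = bytearray(64)  # stand-in for a very small linear memory


def load_i32(addr):
    """Read a little-endian unsigned 32-bit value from memory."""
    return struct.unpack_from("<I", memory, addr)[0]


def store_i32(addr, value):
    """Write a little-endian unsigned 32-bit value to memory."""
    struct.pack_into("<I", memory, addr, value)


def alloca(size):
    """Reserve size bytes by moving the stack pointer down (assumed direction)."""
    sp = load_i32(STACK_POINTER_ADDR) - size
    store_i32(STACK_POINTER_ADDR, sp)
    return sp


def free_frame(size):
    """Release size bytes by moving the stack pointer back up."""
    store_i32(STACK_POINTER_ADDR, load_i32(STACK_POINTER_ADDR) + size)


store_i32(STACK_POINTER_ADDR, 56)  # initial stack pointer, as in the question

slot = alloca(4)        # address of an addressable 4-byte local
store_i32(slot, 1234)   # the local now lives in memory, so it has an address
value = load_i32(slot)
free_frame(4)           # function epilogue restores the stack pointer
```

The point is that the stack pointer is just an ordinary word in memory that generated code reads and updates; nothing in the engine treats address 4 specially.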

On the initial value: the region it points to has to be big enough to hold the whole stack, and the implementation of malloc has to know about it because it can't allocate heap space over it. That's why some tooling allows you to specify a maximum stack size.

Anything can be stored before / after (though after you'd likely have prior stack values). WebAssembly doesn't currently have guard pages, so exhausting the in-memory stack will clobber heap values (unless the code generator also emits stack checks). That's all "memory safe" in that it still can't escape the WebAssembly.Memory, so the browser can't get owned but the developer's own code can totally be owned. A memory-safe language built on top of WebAssembly would have to enforce memory safety within the WebAssembly.Memory.
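
The clobbering can be illustrated with another small Python sketch. All addresses here (STACK_TOP, STACK_LIMIT, HEAP_ADDR) are hypothetical and chosen only to make the effect visible; the checked flag stands in for the stack checks a code generator could choose to emit.

```python
import struct

memory = bytearray(64)  # stand-in for linear memory

STACK_TOP = 56    # hypothetical initial stack pointer
STACK_LIMIT = 48  # hypothetical lowest address the stack may use (8 bytes)
HEAP_ADDR = 40    # hypothetical heap value located just below the stack


def push_i32(sp, value, checked):
    """Push a 32-bit value on a downward-growing stack; return the new sp."""
    new_sp = sp - 4
    if checked and new_sp < STACK_LIMIT:
        raise MemoryError("in-memory stack exhausted")
    struct.pack_into("<I", memory, new_sp, value)
    return new_sp


struct.pack_into("<I", memory, HEAP_ADDR, 0xCAFE)

# Without checks: four pushes need 16 bytes, but the stack only has 8.
sp = STACK_TOP
for _ in range(4):
    sp = push_i32(sp, 0xFFFFFFFF, checked=False)
heap_after = struct.unpack_from("<I", memory, HEAP_ADDR)[0]

# With checks: the third push is rejected before it leaves the stack region.
overflow_caught = False
sp = STACK_TOP
try:
    for _ in range(4):
        sp = push_i32(sp, 0, checked=True)
except MemoryError:
    overflow_caught = True
```

In the unchecked run the heap value at HEAP_ADDR is silently overwritten, yet every write still lands inside the bytearray, which mirrors the "can't escape the WebAssembly.Memory" point.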

Note that I haven't explained 1. and 2. in detail. Their existence means that most C++ programs compiled to WebAssembly will use less in-memory stack than the same program uses stack when compiled natively.

Upvotes: 6
