Reputation: 787
I am writing a simple multitasking OS for the ARM Cortex M3. My threads always run using the Process Stack Pointer. I have an application that I inherited and that uses global variables. I am trying to call the functions in that application from my threading code but it is not accessing memory correctly. Are the following statements correct:
Those global variables are accessed via some kind of relative addressing, and that relative address is placed on the Main stack (using MSP)?
My threading code, using PSP, will never be able to access them
I need to switch to MSP when calling these functions, then back to PSP when using my threads?
**EDIT: Clarified that this is for a Cortex M
Upvotes: 1
Views: 415
Reputation: 71566
Global variables have nothing to do with the stack, even static locals.
So you need to just look at the output of the compiler, it will tell you everything.
Your question is very vague you could be asking one of many different questions. I will show some basics and maybe I will get lucky.
Note that this should in general have nothing to do with the processor, mode, etc. arm, thumb, x86, whatever. Much more to do with the toolchain.
If this is too basic and you are asking some very advanced question it is not obvious to me I will delete or rewrite, no problem.
Throwaway code is always a good idea to figure things out.
flash.s
.thumb
.syntax unified
.word 0x20001000
.word reset
.thumb_func
reset:
bl notmain
b .
notmain.c
unsigned int x;
unsigned int y=5;
void notmain ( void )
{
unsigned int z=7;
x=++y;
z--;
}
flash.ld
MEMORY
{
rom : ORIGIN = 0x00080000, LENGTH = 0x00001000
ram : ORIGIN = 0x20000000, LENGTH = 0x00001000
}
SECTIONS
{
.text : { *(.text) } > rom
.bss : { *(.bss) } > ram
.data : { *(.data) } > ram
}
build
arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m0 flash.s -o flash.o
arm-none-eabi-gcc -Wall -O2 -ffreestanding -mcpu=cortex-m0 -c notmain.c -o notmain.o
arm-none-eabi-ld -nostdlib -nostartfiles -T flash.ld flash.o notmain.o -o flash.elf
arm-none-eabi-objdump -D flash.elf > flash.list
arm-none-eabi-objcopy -O binary flash.elf flash.bin
examine
Disassembly of section .text:
00080000 <reset-0x8>:
80000: 20001000 andcs r1, r0, r0
80004: 00080009 andeq r0, r8, r9
00080008 <reset>:
80008: f000 f802 bl 80010 <notmain>
8000c: e7fe b.n 8000c <reset+0x4>
...
00080010 <notmain>:
80010: 4b04 ldr r3, [pc, #16] ; (80024 <notmain+0x14>)
80012: 4905 ldr r1, [pc, #20] ; (80028 <notmain+0x18>)
80014: 681a ldr r2, [r3, #0]
80016: 3201 adds r2, #1
80018: 601a str r2, [r3, #0]
8001a: 600a str r2, [r1, #0]
8001c: 685a ldr r2, [r3, #4]
8001e: 3a01 subs r2, #1
80020: 605a str r2, [r3, #4]
80022: 4770 bx lr
80024: 20000004 andcs r0, r0, r4
80028: 20000000 andcs r0, r0, r0
Disassembly of section .bss:
20000000 <x>:
20000000: 00000000 andeq r0, r0, r0
Disassembly of section .data:
20000004 <y>:
20000004: 00000005 andeq r0, r0, r5
20000008 <z.3645>:
20000008: 00000007 andeq r0, r0, r7
This is basic not relocatable, etc.
80010: 4b04 ldr r3, [pc, #16] ; (80024 <notmain+0x14>)
80014: 681a ldr r2, [r3, #0]
80016: 3201 adds r2, #1
80018: 601a str r2, [r3, #0]
80024: 20000004 andcs r0, r0, r4
Disassembly of section .data:
20000004 <y>:
20000004: 00000005 andeq r0, r0, r5
We can see the y++. r3 gets the address to y, r2 gets the value of y r2 increments, and then is saved back to memory.
And you can see how x and z are handled as well.
Now this cannot work for an mcu for a couple of reasons. The 0x20000000 address information will not be there. Only what is in non-volatile storage will be there when the chip powers up and comes out of reset. The above is relevant depending on what your real question is.
MEMORY
{
rom : ORIGIN = 0x00080000, LENGTH = 0x00001000
ram : ORIGIN = 0x20000000, LENGTH = 0x00001000
}
SECTIONS
{
.text : { *(.text) } > rom
.bss : { *(.bss) } > ram AT > rom
.data : { *(.data) } > ram AT > rom
}
The program does not change, but the binary does
00000000 00 10 00 20 09 00 08 00 00 f0 02 f8 fe e7 00 00 |... ............|
00000010 04 4b 05 49 1a 68 01 32 1a 60 0a 60 5a 68 01 3a |.K.I.h.2.`.`Zh.:|
00000020 5a 60 70 47 04 00 00 20 00 00 00 20 05 00 00 00 |Z`pG... ... ....|
00000030 07 00 00 00 |....|
00000034
At 0x2C we see the preload value for y and at 0x30 for z.
The .bss value is not located here. Normally what you do is add a whole lot more stuff to the linker script to get the addresses of things. Data start and stop, and bss start and size or stop. Then a bootstrap that copies from flash to ram so that the initialized values are in ram and the read/write works.
So if your project, call it an operating system or not, is just one large body of code that is compiled and linked all together. Then without doing special things like lots of sections or something. The above is what you are looking at and the stack is not related to globals. Because it never is normally.
(msp/psp does not work the way arm implies they do, I have yet to see a use case for the second stack pointer, IF the processor even has it they do not all have it implemented)
Now if your threads are actually separately built programs that you load runtime...Then they completely live in ram. So
MEMORY
{
rom : ORIGIN = 0x00080000, LENGTH = 0x00001000
ram : ORIGIN = 0x20000000, LENGTH = 0x00001000
}
SECTIONS
{
.text : { *(.text) } > ram
.bss : { *(.bss) } > ram
.data : { *(.data) } > ram
}
and we add -fPIC
arm-none-eabi-gcc -Wall -O2 -ffreestanding -mcpu=cortex-m0 -fPIC -c notmain.c -o notmain.o
Disassembly of section .text:
20000000 <reset-0x8>:
20000000: 20001000 andcs r1, r0, r0
20000004: 20000009 andcs r0, r0, r9
20000008 <reset>:
20000008: f000 f802 bl 20000010 <notmain>
2000000c: e7fe b.n 2000000c <reset+0x4>
...
20000010 <notmain>:
20000010: 4a07 ldr r2, [pc, #28] ; (20000030 <notmain+0x20>)
20000012: 4b08 ldr r3, [pc, #32] ; (20000034 <notmain+0x24>)
20000014: 447a add r2, pc
20000016: 58d1 ldr r1, [r2, r3]
20000018: 680b ldr r3, [r1, #0]
2000001a: 3301 adds r3, #1
2000001c: 600b str r3, [r1, #0]
2000001e: 4906 ldr r1, [pc, #24] ; (20000038 <notmain+0x28>)
20000020: 5852 ldr r2, [r2, r1]
20000022: 6013 str r3, [r2, #0]
20000024: 4a05 ldr r2, [pc, #20] ; (2000003c <notmain+0x2c>)
20000026: 447a add r2, pc
20000028: 6813 ldr r3, [r2, #0]
2000002a: 3b01 subs r3, #1
2000002c: 6013 str r3, [r2, #0]
2000002e: 4770 bx lr
20000030: 00000034 andeq r0, r0, r4, lsr r0
20000034: 00000004 andeq r0, r0, r4
20000038: 00000000 andeq r0, r0, r0
2000003c: 0000001a andeq r0, r0, sl, lsl r0
Disassembly of section .bss:
20000040 <x>:
20000040: 00000000 andeq r0, r0, r0
Disassembly of section .data:
20000044 <z.3645>:
20000044: 00000007 andeq r0, r0, r7
20000048 <y>:
20000048: 00000005 andeq r0, r0, r5
Disassembly of section .got:
2000004c <.got>:
2000004c: 20000040 andcs r0, r0, r0, asr #32
20000050: 20000048 andcs r0, r0, r8, asr #32
Disassembly of section .got.plt:
20000054 <_GLOBAL_OFFSET_TABLE_>:
...
Because you may need to be able to load the program anywhere in ram (within rules).
The code is all relative, but the data because of the nature of compiling and linking needs some hardcoding. So they setup a global offset table GOT. The location of the got is relative to the code, you cannot change that.
20000010: 4a07 ldr r2, [pc, #28] ; (20000030 <notmain+0x20>)
20000012: 4b08 ldr r3, [pc, #32] ; (20000034 <notmain+0x24>)
20000014: 447a add r2, pc
20000016: 58d1 ldr r1, [r2, r3]
20000018: 680b ldr r3, [r1, #0]
2000001a: 3301 adds r3, #1
2000001c: 600b str r3, [r1, #0]
There is your y++ when built position independent.
r2 gets an offset, r3 gets another offset. r2 is the relative offset to the got from the code, (you cannot separate them and move one around and not the other, not what position independent means) so now r2 points to the GOT. r3 is the offset in the GOT to the address of y. r1 gets the address of y and now it is like before get y in r3, add one, save y to memory.
Now IF you were to relocate this to an address that is not 0x20000000 your bootstrap needs to go to the GOT and patch up all the addresses so you need linker magic to get where the got is and how bit it is, etc...Use the pc to figure out where you are and then make the adjustments. If loaded into memory at 0x20002000 then you need to add 0x2000 to each of the entries in the table and then it will all just work. (still no stack stuff, stack is not related).
A little trick if you have the space.
Notice I put bss before data, and I have at least one .data item. If you can guarantee that (force a .data in your bootstrap for example).
00000000 00 10 00 20 09 00 00 20 00 f0 02 f8 fe e7 00 00 |... ... ........|
00000010 07 4a 08 4b 7a 44 d1 58 0b 68 01 33 0b 60 06 49 |.J.KzD.X.h.3.`.I|
00000020 52 58 13 60 05 4a 7a 44 13 68 01 3b 13 60 70 47 |RX.`.JzD.h.;.`pG|
00000030 34 00 00 00 04 00 00 00 00 00 00 00 1a 00 00 00 |4...............|
00000040 00 00 00 00 07 00 00 00 05 00 00 00 40 00 00 20 |............@.. |
00000050 48 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 |H.. ............|
00000060
20000040 <x>:
20000040: 00000000 andeq r0, r0, r0
Objdump pads the binary for a -O binary with zeros for .bss If you put it last then it is not assumed to work.
So I do not know how this code you have uses threads and globals, does it try to keep variables specific to each thread? If so does it use static locals up front then pass the address on the stack (and even there the stack pointer you use does not matter unless you are not properly using the stack in general, if not then globals are not your problem.).
If you start off the thread or any code on one stack pointer and implying completely separate stacks (memory address spaces). And then switch, abandoning stack information needed for the code to work in and out of functions, and then if you return from functions after switching stacks all the code would break not just pointers to static locals that are passed along.
So a minimal example that demonstrates the problem can confirm for us what is really going on and what your questions really are and what the problem is. If you want to use the two stack pointers for a cortex-m you need to carefully read up and you need to also write some throwaway code examples to see how it works, and then apply that to the code the tools are generating.
Again if this is too elementary and I am miles away from the real question, I will certainly delete this no problem.
Upvotes: 2