Reputation: 41
I am trying to implement a simple RTOS with round-robin scheduling. Since I do not have a physical board yet, I am running the ELF file on QEMU (qemu-system-gnuarmeclipse). For development I am using Eclipse CDT. I use the following command to run the code on QEMU:
/opt/xpack-qemu-arm-7.0.0-1/bin/qemu-system-gnuarmeclipse -M STM32F4-Discovery -kernel /mnt/d/eclipse-workspace/rtos/Debug/rtos.elf
Each task has an associated struct:
struct TCB {
int32_t *stackPt;
struct TCB *nextPt;
};
At initialization, the structs are chained up in a circular linked list via nextPt, their stack pointers (stackPt) are set to TCB_STACK[threadNumber][STACK_SIZE-16]; and each stack's program counter slot is set up as TCB_STACK[0][STACK_SIZE - 2] = (int32_t)(taskA);. The current thread's pointer is maintained as currentTcbPt.
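For reference, the initialization described above can be sketched roughly as follows. This is only an illustration: the array name tcbs, the function name initTcbRing, and the values of NUM_THREADS and STACK_SIZE are assumptions and are not taken from the actual project.

```c
#include <stdint.h>

#define NUM_THREADS 3      /* assumed thread count */
#define STACK_SIZE  100    /* assumed stack depth in 32-bit words */

struct TCB {
    int32_t *stackPt;      /* saved stack pointer of the task */
    struct TCB *nextPt;    /* next task in the round-robin ring */
};

struct TCB tcbs[NUM_THREADS];                 /* hypothetical TCB storage */
int32_t TCB_STACK[NUM_THREADS][STACK_SIZE];   /* one stack per task */
struct TCB *currentTcbPt;

/* Link the TCBs into a ring and point each stackPt 16 words below the
   top of its stack, leaving room for the 16 register slots that the
   launch code walks over. */
void initTcbRing(void)
{
    for (int i = 0; i < NUM_THREADS; i++) {
        tcbs[i].nextPt  = &tcbs[(i + 1) % NUM_THREADS];
        tcbs[i].stackPt = &TCB_STACK[i][STACK_SIZE - 16];
    }
    currentTcbPt = &tcbs[0];   /* first task to run */
}
```

Following nextPt NUM_THREADS times from any TCB leads back to that TCB, which is what makes the schedule round robin.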
Then SysTick is set up to interrupt every 10 ms. An assembly setup function loads the initial stack pointer from the thread stack pointed to by currentTcbPt. This function is as follows:
osSchedulerLaunch: // This routine loads up the first thread's stack pointer into SP
CPSID I
LDR R0,=currentTcbPt
LDR R2,[R0] // R2 = address of current TCB
LDR SP,[R2]
POP {R4-R11}
POP {R0-R3}
POP {R12}
ADD SP,SP,#4 // Skip 4 bytes to discard LR
POP {LR}
ADD SP,SP,#4 // Skip 4 bytes to discard PSR
CPSIE I
BX LR
Now, my SysTick_Handler looks like this:
__attribute__( ( naked ) ) void SysTick_Handler(void) {
__asm(
"CPSID I \n"
"PUSH {R0-R12} \n"
"LDR R0,=currentTcbPt \n"
"LDR R1,[R0] \n"
"STR SP,[R1] \n"
"LDR R1,[R1,#4] \n"
"STR R1,[R0] \n"
"LDR SP,[R1] \n"
"POP {R4-R11} \n"
"POP {R0-R3} \n"
"POP {R12} \n"
"ADD SP,SP,#4 \n"
"POP {LR} \n"
"ADD SP,SP,#4 \n"
"CPSIE I \n"
"BX LR \n"
:[currentTcbPt] "=&r" (currentTcbPt)
);
}
I have added extra register operations so I can use it as a normal function. Running this produces the following output:
OS init
Launching scheduler
t2
t2
[UsageFault]
Stack frame:
R0 = 00000003
R1 = 2000008C
R2 = 00000000
R3 = 000004B8
R12 = 00000000
LR = 0800148D
PC = 000004B8
PSR = 20000000
FSR/FAR:
CFSR = 00000000
HFSR = 00000000
DFSR = 00000000
AFSR = 00000000
Misc
LR/EXC_RETURN= FFFFFFF9
On examining the asm code using the -d in_asm option in QEMU and also using remote gdb, the problem seems to happen at the first line of the next task (the same address as in PC above).
EDIT: See the full code to reproduce https://gist.github.com/shivangsgangadia/b78c7c66492d5332c7b4d1806be9c5f6
The order of execution of function would be something like:
RTOS rtos();
rtos.addThreads(&task_a, &task_b, &task_c);
rtos.osKernelLaunch();
Upvotes: 1
Views: 552
Reputation: 41
The problem was how I set the T-bit in the Execution PSR (EPSR) for the individual task stacks. The course content that I was following skipped over the fact that the PSR is 4 bytes wide and that the T-bit is in the most significant byte.
Initially, I was setting it as: TCB_STACK[threadNumber][STACK_SIZE-1] = (1U << 6);. This left the T-bit clear and was causing the BX LR to not pick up the return address properly. Setting bit 24, i.e., the lowest bit of the 4th (most significant) byte, with TCB_STACK[threadNumber][STACK_SIZE-1] = (1U << 24); solved the problem and now the scheduler works flawlessly.
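A minimal sketch of that fix, written so it can be checked on a host machine. The names XPSR_T_BIT and initTaskStack and the value of STACK_SIZE are assumptions for illustration, and the stack is declared as uint32_t here (the original uses int32_t) purely to keep the bit operations unsigned.

```c
#include <stdint.h>

#define STACK_SIZE 100                        /* assumed depth in 32-bit words */
#define XPSR_T_BIT (UINT32_C(1) << 24)        /* Thumb state bit: bit 24 of the xPSR */

/* Fill the two top slots of one task stack.
   entry is the task's start address as a 32-bit value. */
void initTaskStack(uint32_t stack[STACK_SIZE], uint32_t entry)
{
    stack[STACK_SIZE - 1] = XPSR_T_BIT;   /* PSR slot: only the T-bit set */
    stack[STACK_SIZE - 2] = entry;        /* PC slot: where the task starts */
}
```

Written out, 1U << 24 is 0x01000000, so the most significant byte of the PSR word is 0x01, whereas 1U << 6 is 0x00000040 and leaves that byte at zero.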
Upvotes: 0
Reputation: 11473
Your SysTick_Handler code seems to be scrambling the order of registers. The instruction PUSH {R0-R12} pushes the registers to the stack in the order r0, r1, r2, ... r12, with r0 at the lowest address and r12 at the highest address. But when you execute these instructions in order:
"POP {R4-R11} \n"
"POP {R0-R3} \n"
"POP {R12} \n"
"ADD SP,SP,#4 \n"
"POP {LR} \n"
"ADD SP,SP,#4 \n"
it will load r4..r11 from the lowest addresses and move SP up. Then it loads r0..r3 from the next 4 addresses, then r12, before skipping one slot and then loading lr and skipping another slot. So you'll not get back to where you were before.
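To make the mismatch concrete, here is a small host-side model of the PUSH followed by the first three POPs. It is a sketch only: pushThenPop is a hypothetical helper, the stack is modelled as a plain array, and the switch to another task's stack is left out (every task's saved registers were written by the same PUSH, so the same mapping applies).

```c
#include <stdint.h>

/* Model PUSH {R0-R12} followed by the handler's POP sequence.
   in[i]  is the value of Ri before the push,
   out[i] is the value of Ri after the pops. */
void pushThenPop(const uint32_t in[13], uint32_t out[13])
{
    uint32_t mem[13];   /* mem[0] is the lowest stack address */
    int sp = 0;         /* index of the word SP points at */

    /* PUSH {R0-R12}: lowest-numbered register at the lowest address */
    for (int r = 0; r <= 12; r++) {
        mem[r] = in[r];
    }

    /* POP {R4-R11}: takes the 8 lowest words, i.e. the old R0-R7 */
    for (int r = 4; r <= 11; r++) {
        out[r] = mem[sp++];
    }
    /* POP {R0-R3}: takes the next 4 words, i.e. the old R8-R11 */
    for (int r = 0; r <= 3; r++) {
        out[r] = mem[sp++];
    }
    /* POP {R12}: the only register that gets its own value back */
    out[12] = mem[sp++];
}
```

In this model R4 ends up holding the old R0 and R0 ends up holding the old R8; only R12 survives the round trip unchanged.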
Secondly, this careful setting up of the registers is going to be partially overwritten when the exception handler returns. Arm M-profile exception handlers work by having the CPU push state to the stack on entry and pop it on exception return. (This is done so that in theory you can write an exception handler as a C function, because the stack pushes and pops match the C calling convention for saved registers.) So when you do the "BX LR" at the end of SysTick_Handler() this will cause the hardware to reload R0-R3, R12, LR, PC and the PSR from the stack. If you hadn't changed SP this would be reloading them with the values they had on exception entry; since you have changed SP they'll be pulled from the new stack, and unless you set that bit of the stack up to look like an exception-entry frame this will be garbage. There are also some integrity checks done as part of exception return which you may be falling over if you have set SP to point to something else and not set up that stack correctly.
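As a rough sketch of what "look like an exception-entry frame" means, the following lays out the eight words the core stacks on entry (basic frame, no floating-point state) and builds such a frame for a task that has not run yet. The enum names and buildInitialFrame are assumptions for illustration; the sketch also assumes stackTop is 8-byte aligned and that the handler itself saves R4-R11 directly below the hardware frame.

```c
#include <stdint.h>

/* Word offsets inside the hardware-stacked frame, lowest address first. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR,
    FRAME_WORDS                 /* 8 words in total */
};

#define SW_SAVED_WORDS 8        /* R4-R11, saved by the handler (assumed) */

/* Build the initial stack contents of a task and return the SP value
   to store in its TCB. stackTop points one word past the end of the stack. */
uint32_t *buildInitialFrame(uint32_t *stackTop, uint32_t entry)
{
    uint32_t *frame = stackTop - FRAME_WORDS;   /* hardware frame */
    uint32_t *sp    = frame - SW_SAVED_WORDS;   /* R4-R11 below it */

    for (uint32_t *p = sp; p < stackTop; p++) {
        *p = 0;                                 /* all registers start at 0 */
    }
    frame[FRAME_PC]   = entry;                  /* task start address */
    frame[FRAME_XPSR] = UINT32_C(1) << 24;      /* T-bit set: Thumb state */
    /* FRAME_LR stays 0 here: tasks are assumed never to return. */
    return sp;
}
```

With this layout the handler only needs to save and restore R4-R11 around the stack switch; the remaining registers are reloaded by the hardware from the frame during exception return.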
Finally, disabling interrupts inside the timer interrupt handler looks odd -- the hardware will prevent you from taking an interrupt or exception of the same or lower priority than the interrupt handler you're in already.
Overall, I would suggest you read the M-profile architecture manual's description of how M-profile interrupt and exception handling works, because your code looks like maybe you expect it to work like A-profile interrupts where PC is set to the entry point and it's the interrupt handler's responsibility to save and restore all registers.
For debugging this kind of thing I would recommend enabling some of QEMU's '-d' debugging flags. These can be a bit tricky to interpret, but in particular '-d int' will tell you what QEMU is doing in the interrupt entry and exit and why it has decided to raise a UsageFault. (You'll probably want to add some other -d options too, like perhaps 'cpu,exec', to give some context for the int logging.)
Upvotes: 1