Reputation: 344

Embedded: memcpy/memset not used by most CRT startup code ― why?

Context:
I'm working on an ARM target, more specifically a Cortex-M4F microcontroller from ST. On such platforms (microcontrollers in general) there's obviously no OS; to get a working C/C++ "environment" (and to be standard-compliant regarding the initialization of variables), some kind of startup code must run at reset and do the minimum setup required before explicitly calling main. Such startup code, as I hinted, must initialize the initialized global and static variables (such as int foo = 42; at global scope) and zero out the other globals (such as int bar; at global scope). Then, if necessary, global "ctors" are called.

On a microcontroller, that simply means that the startup code has to copy data from flash to RAM for every initialized global (all in section '.data') and clear the others (all in '.bss'). Because I use GCC, I must supply such startup code myself, and I happily analyzed several startup files (and their associated linker scripts!) bundled with numerous examples I've found on the Internet, all using the same demo board I'm developing on.
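For reference, the hand-written loops found in such startup files usually look something like the sketch below. It is only an illustration, not taken from any particular vendor file: the function names copy_data and zero_bss are made up, and the pointers stand in for linker-script symbols (ST's GCC examples commonly call them _sidata, _sdata, _edata, _sbss and _ebss, but the exact names depend on the linker script in use).

```c
#include <stdint.h>

/* Copy the .data image word by word from its load address (flash)
 * to its run address (RAM). dst_end points one past the last word.
 * All three pointers are hypothetical stand-ins for linker symbols. */
void copy_data(uint32_t *dst, const uint32_t *src, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Clear .bss word by word. dst_end points one past the last word. */
void zero_bss(uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = 0u;
    }
}
```

On a real target the reset handler would call these with the section boundaries, for example copy_data(&_sdata, &_sidata, &_edata) followed by zero_bss(&_sbss, &_ebss), assuming the linker script exports symbols with those names and keeps the sections word-aligned.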

Question:
As stated, I've seen numerous startup files, and they initialize globals in different ways, some more efficient in terms of space and time than others. But they all have something odd in common: they don't use memset or memcpy, resorting instead to hand-written loops to do the job. As it seems natural to me to use standard functions when possible (simple "DRY principle"), I tried the following in lieu of the initial hand-written loops:

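/* Initialize .data section: copy the image from flash to RAM.
   Assumes DATA_LOAD is the load address in flash and DATA_START the run address in RAM. */
ldr r0, DATA_START   /* r0 = destination (run address in RAM) */
ldr r1, DATA_LOAD    /* r1 = source (load address in flash) */
ldr r2, DATA_SIZE    /* r2 = size in bytes */
bl  memcpy           /* memcpy(DATA_START, DATA_LOAD, DATA_SIZE); */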

/* Initialize .bss section */
ldr r0, BSS_START
mov r1, #0
ldr r2, BSS_SIZE
bl  memset       /* memset(BSS_START, 0, BSS_SIZE); */

... and it worked perfectly. The space savings are negligible, but it is clearly dead simple now.

So, I thought about it, and I see no reason to do hand-written loops in this case:

memcpy and memset are very likely to be linked into the executable anyway, because the programmer would use them directly, or indirectly through another library;
It is smaller;
Speed is not a very important factor for startup code, but nevertheless it is likely faster;
It's nearly impossible to get it wrong.

Any idea why one wouldn't rely on memcpy and memset for startup code?

Upvotes: 18

Answers (3)

Lundin

Reputation: 214300

I don't think this is likely to have anything to do with "assumptions about the internal state of memcpy/memset"; they are unlikely to use any global resources (though I suppose some odd cases exist where they do).

Start-up code on microcontrollers is usually written in "inline assembler" in this manner, simply because it runs at an early stage, where a stack might not yet be present and the MMU setup may not yet have been executed. Init code therefore doesn't want to risk putting anything on the stack, simple as that. Function calls put things on the stack.

So while this happened to be the initialization code of the static storage copy-down, you are likely to find the same inline assembler in other such init code as well. For example you will likely find some fundamental register setup code written in assembler somewhere before the copy-down, and you will also find the MMU setup in assembler somewhere around there too.

Upvotes: 1

Clifford

Reputation: 93534

Whether the standard library is linked at all is a decision for the application developer (-nostdlib may be used, for example), but the start-up code is always required, so it cannot make any assumptions about the library being there.

Further, the purpose of the start-up code is to establish an environment in which C code can run; before that is complete, it is by no means a given that any library code that might reasonably assume a complete run-time environment will run correctly. For the functions in question this is perhaps not an issue in many cases, but you cannot know that.

The start-up code has to at least establish a stack and initialise static data; in C++ it additionally calls the constructors of global static objects. The standard library might reasonably assume those are established, so using the standard library before then may conceivably result in erroneous behaviour.

Finally you should be clear that the C language and the C standard library are distinct entities. The language must necessarily be capable of standing alone.

Upvotes: 15

R.. GitHub STOP HELPING ICE

Reputation: 215387

I suspect the startup code does not want to make assumptions about the implementation of memcpy and such in libc. For example, the implementation of memcpy might use a global variable set by libc initialization code to report which cpu extensions are available, in order to provide optimized SIMD copying on machines that support such operations. At the point where the early "crt" startup code is running, the storage for such a global might be completely uninitialized (containing random junk), in which case it would be dangerous to call memcpy. Even if making the call works for you, it's a consequence of the implementation (or maybe even the unpredictable results of UB...) making it work; this is probably not something the crt code wants to depend on.
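To make the concern concrete, here is a minimal sketch of the kind of implementation I have in mind. It is not how any particular libc does it; every name in it (active_copy, hypothetical_libc_init, dispatching_memcpy) is invented for illustration.

```c
#include <stddef.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Plain byte-wise fallback copy. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) {
        *d++ = *s++;
    }
    return dst;
}

/* Hypothetical dispatch pointer. It is an ordinary global, so on a
 * bare-metal target it only holds a defined value after the startup
 * code has set up .data/.bss and the library's own init has run. */
static copy_fn active_copy;

/* Hypothetical library init: would normally probe CPU features and
 * pick an optimized routine; here it always picks the fallback. */
void hypothetical_libc_init(void)
{
    active_copy = copy_bytes;
}

/* A memcpy-like entry point that trusts the global to be valid. */
void *dispatching_memcpy(void *dst, const void *src, size_t n)
{
    return active_copy(dst, src, n);
}
```

If crt code called something like dispatching_memcpy before hypothetical_libc_init had run (or before the storage of active_copy itself had been initialized), it would jump through whatever happened to be in RAM at reset.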

Upvotes: 20
