Reputation: 33
I'm trying to do some memory-usage analysis of using compile-time constructed classes by looking at the artifacts generated by compilation. I've marked several class constructors with "constexpr" and made sure they're trivial to ensure compile-time construction. Viewing the map file, I can see that the constructors and destructor functions are not included in the .text section any more.
Where, though, do these classes appear? I had assumed that they would be included in the .data section as a static instance of the class, but that doesn't seem to be the case. The .text section has shrunk but all other sections seem to be the same. Where is this data going?
(I'm using GCC 5.2.0 and creating a statically-linked ELF.)
Edit: Here's a bit of sample code.
#include <stddef.h>
#include <stdint.h>

struct AbstractMemoryAccess
{
    virtual uint32_t read() const = 0;
    virtual void write(const uint32_t data) const = 0;
};

class ConcerteMemoryAccess : public AbstractMemoryAccess
{
public:
    constexpr ConcerteMemoryAccess(const size_t baseAddress)
        : _baseAddress(baseAddress)
    {
        // empty
    }

    virtual uint32_t read() const
    {
        return *(volatile uint32_t *)(_baseAddress);
    }

    virtual void write(const uint32_t data) const
    {
        *(volatile uint32_t *)(_baseAddress) = data;
    }

private:
    const size_t _baseAddress;
};

#define ARBITRARY_PERIPHERAL_ADDRESS 0x40001000

int main(void)
{
    ConcerteMemoryAccess memoryAccessor(ARBITRARY_PERIPHERAL_ADDRESS);
    AbstractMemoryAccess &rAbsMemoryAccessor = memoryAccessor;
    while (1)
    {
        uint32_t readData = rAbsMemoryAccessor.read();
        rAbsMemoryAccessor.write(readData);
    }
    return 0;
}
Which disassembles to this:
000006a4 <main>:
6a4: b0004000 imm 16384
6a8: e8601000 lwi r3, r0, 4096
6ac: b0004000 imm 16384
6b0: f8601000 swi r3, r0, 4096
6b4: b800fff0 bri -16 // 6a4 <main>
So it looks like it's inlining the memory accesses... but is that true for non-trivial cases? Are all calls on a constexpr object inlined?
Upvotes: 1
Views: 352
Reputation: 364160
The class instance variable doesn't exist anywhere; it's optimized away.
If it did exist anywhere, it would be on the stack, since it has automatic storage (a local variable in main). If the constructor was optimized away but the object still had to exist in memory, it would probably be stored to stack memory from immediate data. So in the executable, it would be embedded into the instruction stream.
If the object itself was const, gcc would be more likely to optimize it to static read-only storage, in the .rodata section. This section is linked as part of the text segment of an executable, before or after the executable code. (String literals go in .rodata, too.)
For example, passing a pointer to something to an external function forces gcc to actually have it in memory:
void ext(const char*);

void foo_automatic_nonconst() {
    char str [] = "abcdefghijklmno";
    ext(str);
}

void foo_automatic_const() {
    const char str [] = "abcdefghijklmno";
    ext(str);
}

void foo_static_const() {
    static const char str [] = "abcdefghijklmno";
    ext(str);
}

void foo_static_nonconst() {
    static char str [] = "abcdefghijklmno";
    ext(str);
}
gcc5.2 -O3 for x86-64 on the Godbolt compiler explorer:
foo_automatic_nonconst():
    sub rsp, 24
    movabs rax, 7523094288207667809
    mov QWORD PTR [rsp], rax
    mov rdi, rsp
    movabs rax, 31365138664352361
    mov QWORD PTR [rsp+8], rax
    call ext(char const*)
    add rsp, 24
    ret
foo_automatic_const():
    ... same asm as automatic_nonconst, unfortunately.
    # I think it could have used a constant like static_const
foo_static_const():
    mov edi, OFFSET FLAT:foo_static_const()::str
    jmp ext(char const*)
foo_static_nonconst():
    mov edi, OFFSET FLAT:foo_static_nonconst()::str
    jmp ext(char const*)

    .section .data
foo_static_nonconst()::str:
    .string "abcdefghijklmno"

    .section .rodata
foo_static_const()::str:
    .string "abcdefghijklmno"
gcc will optimize floating-point constants into static storage, though:
float times3(float f) { return f * 3.0; }
    mulss xmm0, DWORD PTR .LC4[rip]
    ret

    .section .rodata
.LC4:
    .long 1077936128
The same applies to vector constants (like _mm_set_ps(1.5f, 3.0f, 4.5f, 6.0f) with Intel's SSE intrinsics).
This doesn't answer your question directly, but maybe it will give you an idea of other things to look at. There shouldn't be much difference between classes and other kinds of objects, at least in cases where gcc can de-virtualize.
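To tie the string examples back to classes, here's a minimal sketch of the same idea with a literal class type. RegisterAddress and the two accessor functions are hypothetical names made up for illustration, and the code assumes -std=c++11 or later. Taking the address of each object in an externally visible function should force it to exist in memory; I'd expect the constexpr one to show up in .rodata and the mutable one in .data, but check your own map file rather than taking my word for it.

```cpp
#include <cstddef>

// Hypothetical literal type standing in for the accessor class in the question.
class RegisterAddress
{
public:
    constexpr explicit RegisterAddress(std::size_t base) : base_(base) {}
    constexpr std::size_t base() const { return base_; }

private:
    std::size_t base_;
};

// Constant-initialized and const, with static storage:
// a candidate for .rodata once something needs its address.
constexpr RegisterAddress kConstInstance(0x40001000u);

// Constant-initialized but mutable, with static storage:
// a candidate for .data, since it can be written at run time.
RegisterAddress gMutableInstance(0x40002000u);

// Returning the addresses means the objects can't simply be optimized away.
const RegisterAddress *addressOfConst() { return &kConstInstance; }
RegisterAddress *addressOfMutable() { return &gMutableInstance; }

// The constructor really did run at compile time for the constexpr object.
static_assert(kConstInstance.base() == 0x40001000u, "evaluated at compile time");
```

If nothing ever takes the address of kConstInstance, the compiler is free to fold its value into the instruction stream and emit no object at all, which matches what you're seeing in the question.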
Upvotes: 0
Reputation: 51842
That code can be de-virtualized, which is why a vtable isn't needed and thus it's possible to compile the virtuals away to nothing. The compiler can see both classes, as they're in the same translation unit.
De-virtualization can also happen across translation units when using link-time optimization (LTO).
So it looks like it's inlining the memory accesses... but is that true for non-trivial cases? Are all calls on a constexpr object inlined?
No. It's just that in this example, it's rather easy to de-virtualize the functions. Once that happens, there's no virtual anymore and no vtable to go through, so the usual optimizations of non-virtual functions kick in, and stuff can compile away into thin air.
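As a rough illustration of the difference, here's a sketch with one call the compiler can see through and one it generally can't. All the names are hypothetical, it assumes -std=c++11 or later, and __attribute__((noinline)) is a GCC/Clang extension used here only to stand in for a function that lives in another translation unit without LTO.

```cpp
#include <cstdint>

struct AbstractAccess
{
    virtual std::uint32_t read() const = 0;
};

class ConcreteAccess : public AbstractAccess
{
public:
    constexpr explicit ConcreteAccess(const volatile std::uint32_t *reg) : reg_(reg) {}
    std::uint32_t read() const override { return *reg_; }

private:
    const volatile std::uint32_t *reg_;
};

// Stand-in for a function the optimizer can't see into from the call site.
// Inside it, the dynamic type of 'access' isn't known, so the call to read()
// normally has to go through the vtable.
__attribute__((noinline))
std::uint32_t readOpaque(const AbstractAccess &access)
{
    return access.read();
}

// The dynamic type is visible here, so the compiler can de-virtualize
// and inline read(), likely leaving just a single load.
std::uint32_t readDirect(const volatile std::uint32_t *reg)
{
    ConcreteAccess access(reg);
    return access.read();
}

// The object escapes by reference, so it typically has to be materialized
// (vtable pointer included) before the call.
std::uint32_t readViaOpaque(const volatile std::uint32_t *reg)
{
    ConcreteAccess access(reg);
    return readOpaque(access);
}
```

Comparing the disassembly of readDirect and readViaOpaque should show whether a vtable for ConcreteAccess ends up in the binary. The constexpr constructor doesn't change that either way; what matters is whether the compiler can prove the dynamic type at the call site.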
Upvotes: 1