Reputation: 245
Say I have a macro
#define MSG "Input your first name"
and a const
const char* const msg = "Input your last name";
or
const std::string msg = "Input your last name";
in the same program.
Now, the string literal behind msg occupies one memory location, and every use of msg in the program refers to it.
But does the same apply to MSG, i.e., does every occurrence of MSG refer to the same string literal, or is a separate string literal created for each occurrence?
My guess is that since macros are handled by the preprocessor, duplicate string literals might be created (not 100% sure). Is that true? I am sure that duplication won't matter if it's an integral type.
My question is specific to storage in memory, but other aspects are also welcome.
In other words, if I use msg 100 times, the memory used stays constant. If MSG is used 100 times, is the memory use still constant, or is it 100 times as large?
Upvotes: 1
Views: 240
Reputation: 16406
But does the same apply to MSG, i.e., does every occurrence of MSG refer to same string literal
The question is meaningless because MSG does not "refer to" anything. The preprocessor simply does token replacement ... where you type MSG it's just as if you had typed "Input your first name" instead. So what memory is used depends on where you type it; e.g.,
const char* a = MSG;
const char* b = MSG;
const char* c = "Input your first name";
produces one copy of the string (in a typical implementation that uses a string pool, but the standard doesn't require it), but
char a[] = MSG;
char b[] = MSG;
char c[] = "Input your first name";
produces three copies of the string. (Although, depending on exactly how you use them, the compiler might optimize them into one or two copies, or even no copies.)
Additionally, consider
const char* twice = MSG MSG;
which produces a single string literal containing the text of MSG twice, because adjacent string literals are concatenated. I think this shows most clearly that the notion that MSG "refers to" something is a misconception ... your question conflates two quite different issues, macro expansion and string pooling.
Upvotes: 1
Reputation: 7994
With g++ on Linux, MSG and a const char* refer to the same location if the content is the same.
Given:
#include <stdio.h>
#define MSG "Input your last name"
int main()
{
    const char* const msgc = "Input your last name";
    // %p expects a void pointer, so cast explicitly
    printf("MACRO %p\n", (const void*)&MSG);
    printf("char %p\n", (const void*)msgc);
    printf("MACRO %p\n", (const void*)&MSG);
}
The disassembly of the above:
(gdb) disassemble main
Dump of assembler code for function main():
0x000000000040070c <+0>: push rbp
0x000000000040070d <+1>: mov rbp,rsp
0x0000000000400710 <+4>: sub rsp,0x10
0x0000000000400714 <+8>: mov QWORD PTR [rbp-0x8],0x400864
0x000000000040071c <+16>: mov esi,0x400864
0x0000000000400721 <+21>: mov edi,0x400879
0x0000000000400726 <+26>: mov eax,0x0
0x000000000040072b <+31>: call 0x4005c0 <printf@plt>
0x0000000000400730 <+36>: mov esi,0x400864
0x0000000000400735 <+41>: mov edi,0x400883
0x000000000040073a <+46>: mov eax,0x0
0x000000000040073f <+51>: call 0x4005c0 <printf@plt>
0x0000000000400744 <+56>: mov esi,0x400864
0x0000000000400749 <+61>: mov edi,0x400879
0x000000000040074e <+66>: mov eax,0x0
0x0000000000400753 <+71>: call 0x4005c0 <printf@plt>
0x0000000000400758 <+76>: mov eax,0x0
0x000000000040075d <+81>: leave
0x000000000040075e <+82>: ret
End of assembler dump.
0x400864 in this case is "Input your last name", and both msgc and MSG point to the same location.
Upvotes: 0
Reputation: 171117
Each place in code where you use the macro MSG will contain the literal "Input your first name" after preprocessing. However, whether this text will be present in the binary several times or just once depends entirely on your compiler. Quoting [lex.string]§12:
Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined.
In other words, the compiler (and/or linker) is free to put the text data into the binary image just once, and have all appearances of the literal in code refer to the same data.
Upvotes: 1
Reputation: 5882
If the string is repeated 100 times in the binary then the size of the binary in memory will be greater - but it won't affect the amount of used heap.
As for whether the string will be repeated 100 times when using a #define: yes, it certainly will, and you can see this in the preprocessor output of your source. However, some compilers may then remove the duplicates in a later step (linking, I would assume). This feature is called string pooling; the MSVC reference is here:
http://msdn.microsoft.com/en-us/library/s0s0asdt(v=vs.110).aspx
Upvotes: 4
Reputation: 63190
A macro is replaced by its actual content by the preprocessor at every place it occurs. So by the time the compiler gets to your code, your MSG will have been replaced with the actual string every time it occurred, meaning that this string will be hardcoded in your code base.
What the compiler then does with multiple occurrences of the same string depends on compiler settings etc., but it will probably store it once and then refer to it wherever it occurs.
Upvotes: 3
Reputation:
My guess is that since macros are handled by Preprocessor, duplicate string literals might be created
They might, but in practice, almost every modern compiler will merge identical string literals into one, so that every different instance of "foo" will indeed have the same memory address. However common this is among optimizing compilers, don't rely on it.
Upvotes: 0