Tim Williams

Reputation: 193

Is there a limit to the length of a macro's contents?

For embedded programs, I often convert data tables into header #defines, which get dropped into variables/arrays in the .c program.

I've just written a conversion tool that can potentially produce massive output in this format, and now I'm wondering if I should be aware of any limitations of this pattern.

Header example:

#define BIG_IMAGE_BLOCK      \
    0x00, 0x01, 0x02, 0x03,  \
    0x04, 0x05, 0x06, 0x07,  \
    /* this goes on ... */   \
    0xa8, 0xa9, 0xaa, 0xab

Code example (avr-gcc):

#include <stdint.h>
#include <avr/pgmspace.h>  /* provides PROGMEM */

const uint8_t ImageData[] PROGMEM = {
    BIG_IMAGE_BLOCK
};

I can't seem to find an answer to this particular question; it seems drowned out by everyone asking about identifier length, line length, and macro re-evaluation limits.

Upvotes: 2

Views: 3491

Answers (2)

Peter

Reputation: 36617

C17 Section 5.2.4.1, clause 1, lists a number of minimum translation limits. This means implementations are permitted, but not required, to exceed those limits. In the quote below, I've omitted a couple of references to footnotes. The limit most likely relevant to this question is the one on the number of characters in a logical source line.

The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:

— 127 nesting levels of blocks

— 63 nesting levels of conditional inclusion

— 12 pointer, array, and function declarators (in any combinations) modifying an arithmetic, structure, union, or void type in a declaration

— 63 nesting levels of parenthesized declarators within a full declarator

— 63 nesting levels of parenthesized expressions within a full expression

— 63 significant initial characters in an internal identifier or a macro name (each universal character name or extended source character is considered a single character)

— 31 significant initial characters in an external identifier (each universal character name specifying a short identifier of 0000FFFF or less is considered 6 characters, each universal character name specifying a short identifier of 00010000 or more is considered 10 characters, and each extended source character is considered the same number of characters as the corresponding universal character name, if any)

— 4095 external identifiers in one translation unit

— 511 identifiers with block scope declared in one block

— 4095 macro identifiers simultaneously defined in one preprocessing translation unit

— 127 parameters in one function definition

— 127 arguments in one function call

— 127 parameters in one macro definition

— 127 arguments in one macro invocation

— 4095 characters in a logical source line

— 4095 characters in a string literal (after concatenation)

— 65535 bytes in an object (in a hosted environment only)

— 15 nesting levels for #included files

— 1023 case labels for a switch statement (excluding those for any nested switch statements)

— 1023 members in a single structure or union

— 1023 enumeration constants in a single enumeration

— 63 levels of nested structure or union definitions in a single struct-declaration-list

The number of characters in a logical source line is relevant because a macro's definition, and therefore its replacement text, occupies a single logical source line. For example, if \ is used in a macro definition to continue it across several physical lines, each backslash-newline pair is deleted and all the parts are spliced into a single logical source line. This is required by Section 5.1.1.2, clause 1, second item.
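As a rough illustration of that splicing, here is a minimal sketch with a much smaller table than the one in the question. The names SMALL_BLOCK, SMALL_BLOCK_ONE_LINE and tables_match are made up for this example. It only shows that a definition continued with \ and the same definition written on one physical line give the compiler identical tokens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a generated table. The three physical lines
   below are spliced into one logical source line in translation phase 2,
   before the #define directive is processed. */
#define SMALL_BLOCK          \
    0x00, 0x01, 0x02, 0x03,  \
    0x04, 0x05, 0x06, 0x07

/* The same tokens written on one physical line. */
#define SMALL_BLOCK_ONE_LINE 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07

static const uint8_t spliced[]  = { SMALL_BLOCK };
static const uint8_t one_line[] = { SMALL_BLOCK_ONE_LINE };

/* Both definitions expand to the same eight initializers. */
static_assert(sizeof spliced == 8, "spliced macro yields 8 bytes");
static_assert(sizeof spliced == sizeof one_line, "same element count");

/* Returns 1 if the two tables hold identical bytes, 0 otherwise. */
static int tables_match(void)
{
    for (size_t i = 0; i < sizeof spliced; i++) {
        if (spliced[i] != one_line[i])
            return 0;
    }
    return 1;
}
```

The spliced line here is far shorter than 4095 characters. A generated table with thousands of entries would go past the minimum the standard guarantees, so whether it translates depends on the implementation.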

Depending on how the macro is defined, it may be affected by other limits as well.

In practice, virtually all implementations (compilers and their preprocessors) exceed these limits. For example, the allowed length of a logical source line for the GNU compiler (GCC) is limited only by available memory.

Upvotes: 4

Eric Postpischil

Reputation: 223264

The C standard is very lax about specifying such limits. A C implementation must be able to translate “at least one program” with 4095 characters on a logical source line (C 2018 5.2.4.1). However, it may fail in other situations with shorter lines. The length of macro replacement text (measured in either characters or preprocessor tokens) is not explicitly addressed.

So, C implementations may have limits on the lengths of macro replacement text and other text, but such limits are not controlled by the C standard and are often poorly documented, or not documented at all, by C implementations.

A common technique for preparing complicated or massive data needed in source code is to write a separate program to be executed at compile time to process the data and write the desired source text. This is generally preferable to abusing the C preprocessor features.
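A minimal sketch of such a generator, assuming the data is already in memory as bytes. The function name emit_c_array, the eight-values-per-line layout, and the exact declaration text are illustrative choices, not anything prescribed by the standard or a particular toolchain. It writes a complete array definition, so no macro is involved at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes `len` bytes from `data` to `out` as a C array
   definition named `name`, eight values per line.
   Returns 0 on success, -1 on a write error. */
static int emit_c_array(FILE *out, const char *name,
                        const uint8_t *data, size_t len)
{
    /* An empty array cannot be expressed portably, so require len > 0. */
    assert(out != NULL && name != NULL && data != NULL && len > 0);

    if (fprintf(out, "const uint8_t %s[%zu] = {\n", name, len) < 0)
        return -1;

    for (size_t i = 0; i < len; i++) {
        /* Indent the first value on each line, separate the rest by a space. */
        const char *lead = (i % 8 == 0) ? "    " : " ";
        /* No comma after the last value; break the line after every eighth. */
        const char *tail = (i + 1 == len) ? "\n"
                         : (i % 8 == 7)   ? ",\n"
                                          : ",";
        if (fprintf(out, "%s0x%02x%s", lead, (unsigned)data[i], tail) < 0)
            return -1;
    }

    if (fprintf(out, "};\n") < 0)
        return -1;
    return 0;
}
```

A build step could call this from a small host-side program and write the result to a .c file that is compiled alongside the rest of the firmware. Since every output line is short, the logical source line limit never comes into play. For avr-gcc, the format string could be changed to emit PROGMEM after the array declarator.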

Upvotes: 2
