rallen911

Reputation: 148

Use of '#' in unexpected way

There's a macro defined as:

#define SET_ARRAY(field, type) \
  foo.field = bar[#field].data<type>();

foo is a structure with members that are of type int or float *. bar is of type cnpy::npz_t (data loaded from .npz file). I understand that the macro is setting the structure member pointer so that it is pointing to the corresponding data in bar from the .npy file contained in the .npz file, but I'm wondering about the usage bar[#field].

When I ran the code through the preprocessor, I get:

foo.struct_member_name = bar["struct_member_name"].data<float>();

but I've never seen that type of usage either. It looks like the struct member variable name is somehow getting converted to an array index or memory offset that resolves to the data within the cnpy::npz_t structure. Can anyone explain how that is happening?

Upvotes: 2

Views: 169

Answers (1)

TheNomad

Reputation: 1713

# is actually a preprocessor marker. It introduces preprocessor commands (not functions), formally called "preprocessor directives", which are processed before the compiler proper ever sees the code. Apart from directives, you'll also find predefined macros that behave like constants (their values are either fixed or depend on context). I'm using the term "constants" loosely here: they are really macros that get replaced with text, they just look like constants to us.

A number of preprocessor commands that you will find are: #define, #include, #undef, #if (yes, different from the normal "if" in code), #elif, #endif, #error - all those must be prefixed by a "#".
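As a small illustration of #if / #elif / #else / #endif, here is a sketch that picks a value at preprocessing time. PLATFORM_NAME and platform_name() are names made up for this example; _WIN32 and __linux__ are the macros that compilers commonly predefine on those platforms.

```cpp
#include <string>

// Only one of these branches survives preprocessing; the others are discarded
// before the compiler proper ever sees them.
#if defined(_WIN32)
#  define PLATFORM_NAME "windows"
#elif defined(__linux__)
#  define PLATFORM_NAME "linux"
#else
#  define PLATFORM_NAME "other"
#endif

// Hypothetical helper: returns whichever string literal was selected above.
inline std::string platform_name() { return PLATFORM_NAME; }
```

Unlike a normal "if", the branch not taken does not even have to be valid C++.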

Some of these predefined values are __FILE__, __LINE__, __cplusplus and more. They are not prefixed by #, and can be used both in macros and in ordinary code. Their values are set by the compiler, depending on context.
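A minimal sketch of these predefined values in use. WHERE(), line_of_call() and language_version() are hypothetical helpers invented for this example.

```cpp
#include <string>

// Expands to "file:line" for the place where the macro is used.
#define WHERE() (std::string(__FILE__) + ":" + std::to_string(__LINE__))

// __LINE__ is replaced by the number of the source line it appears on.
inline int line_of_call() { return __LINE__; }

// __cplusplus holds the language version the compiler reports for this file.
inline long language_version() { return __cplusplus; }
```

Because WHERE() is a macro, __FILE__ and __LINE__ refer to the call site, which is why logging helpers are usually written as macros rather than functions.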

For more information on macros, you can check the MS Learn page for MSVS or the GNU page for GCC. For other preprocessor values, you can also see this SourceForge page.

And of course, you can define your own macro or pseudo-constants using the #define directive.

#define test_integer 7

Any use of test_integer in your code (or in other macros) is replaced by 7 by the preprocessor, before compilation. Note that macros are case-sensitive, just like everything else in C and C++.
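To make the replacement and the case-sensitivity concrete, a small sketch. The two function names are made up for illustration.

```cpp
#define test_integer 7

// After preprocessing, the body reads: return 7 + 1;
inline int next_after_test_integer() { return test_integer + 1; }

// Test_Integer differs in case, so it is an ordinary identifier and the
// macro does not touch it.
inline int different_case() {
    int Test_Integer = 3;
    return Test_Integer;
}
```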

Now, let's talk about special cases of "#":

  • string-izing a parameter (also called "to stringify")

    What that means is you can pass a parameter and it is turned into a string literal, which is what happened in your case. An example:

    #define NAME_TO_STRING(x) #x
    
    std::cout << NAME_TO_STRING(Hello) << std::endl;
    

    This will turn Hello, which is NOT a string but an identifier, into the string literal "Hello".

  • concatenating two parameters

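    #define CONCAT(x1, x2)          x1##x2                      // pastes two tokens into one
    #define STRINGIFY_IMPL(x)       #x                          // stringizes the argument as written
    #define STRINGIFY(x)            STRINGIFY_IMPL(x)           // expands x first, then stringizes
    #define CONCATENATE(x1, x2)     STRINGIFY(CONCAT(x1, x2))
    

    (yes, it doesn't work directly: ## must produce a single valid token, and pasting two string literals does not, so GCC rejects it. Paste the identifiers first and stringize the result. The extra STRINGIFY level is the indirection that lets CONCAT be expanded before # is applied; indirection means passing it again to a different macro).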

    std::cout << CONCATENATE(Hello,World) << std::endl;
    

    This will turn Hello and World which are identifiers, to a concatenated string: HelloWorld.
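To connect stringizing back to the question, here is a sketch of what SET_ARRAY does. It assumes bar behaves like a std::map keyed by std::string (as far as I know that is how cnpy declares npz_t, but treat it as an assumption). FakeArray, FakeNpz, Foo and first_weight() are stand-ins invented for this example, not cnpy types.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the array type stored in the .npz container.
struct FakeArray {
    std::vector<float> values;
    template <typename T> T* data() { return reinterpret_cast<T*>(values.data()); }
};

// Hypothetical stand-in for cnpy::npz_t: a map from array name to array.
using FakeNpz = std::map<std::string, FakeArray>;

struct Foo { float* weights = nullptr; };

#define SET_ARRAY(field, type) \
    foo.field = bar[#field].data<type>();

inline float first_weight() {
    FakeNpz bar;
    bar["weights"].values = {1.5f, 2.5f};
    Foo foo;
    // Expands to: foo.weights = bar["weights"].data<float>();
    SET_ARRAY(weights, float)
    return foo.weights[0];
}
```

So nothing is converted to an array index or memory offset. The member name becomes a string literal, and that string is used as a key for an ordinary map lookup by name.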

Now, regarding usage of # and ##, that's a more advanced topic. There are many use cases: macro-magic (which might seem cool when you see it implemented; for examples, check the Unreal Engine, where it's used extensively, but be warned that such programming methods are not encouraged), helpers, some constant definitions (think #define TERRA_GRAV 9.807) and even some compile-time checks, often combined with constexpr from the newer standards.
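A tiny taste of that kind of macro-magic, combining ## and # in one macro. DECLARE_NAME_GETTER and the generated functions are hypothetical names for this sketch.

```cpp
#include <string>

// Generates a function named get_<name> (via ##) that returns the
// name itself as a string (via #).
#define DECLARE_NAME_GETTER(name) \
    inline std::string get_##name() { return #name; }

DECLARE_NAME_GETTER(gravity)   // defines get_gravity()
DECLARE_NAME_GETTER(mass)      // defines get_mass()
```

Here ## pastes get_ and the parameter into one identifier, which is the intended use of token pasting: the result must be a single valid token.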

If you're curious what the advantage of #define over a const float or const double is: a macro is not part of the code until it is used. The preprocessor only substitutes text, so a macro body that is never expanded is never syntax-checked.
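For comparison, a sketch of the same constant written both ways. kTerraGrav and the two weight functions are names made up for this example.

```cpp
// Text substitution: no type and no scope, replaced before compilation.
#define TERRA_GRAV 9.807

// Typed and scoped alternative, checked by the compiler like any other code.
constexpr double kTerraGrav = 9.807;

inline double weight_macro(double mass) { return mass * TERRA_GRAV; }
inline double weight_constexpr(double mass) { return mass * kTerraGrav; }
```

Both functions compute the same value; the difference is in when and how the constant is checked, not in the result.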

In regards to helper macros, the most common are: defining exports when building a library (search __declspec for MSVS and __attribute__ for GCC); the old-style include guards (now often replaced by #pragma once, which is non-standard but widely supported) that stop a *.h, *.hxx or *.hpp file from being included multiple times; and debug handling (search for _DEBUG and assertions on Google). These are slightly more advanced topics, so I won't cover them in detail here.
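Just so the shape is recognizable, a minimal sketch of an include guard together with an export macro. MY_WIDGET_H, MY_API and widget_version() are made-up names, and the compiler checks are assumptions about a typical MSVC / GCC-like setup.

```cpp
// Include guard: the second time this header is included, MY_WIDGET_H is
// already defined and everything up to #endif is skipped.
#ifndef MY_WIDGET_H
#define MY_WIDGET_H

// Export macro: expands to a compiler-specific attribute, or to nothing.
#if defined(_MSC_VER)
#  define MY_API __declspec(dllexport)
#elif defined(__GNUC__)
#  define MY_API __attribute__((visibility("default")))
#else
#  define MY_API
#endif

MY_API inline int widget_version() { return 3; }

#endif // MY_WIDGET_H
```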

I tried to keep the explanation as simple as possible, so the terminology is not that formal. But if you really are curious, I am sure you can find more details online or you can post a comment on this answer :)

Upvotes: 2
