Moshe

Reputation: 58097

What's a Good Way to Test that Identifiers aren't Being Truncated and Thereby Mixed Up?

In C++ class today, we discussed the maximum possible length of identifiers, and how the compiler will eventually stop treating variables as different, after a certain length. (My professor seems to have implied that really long identifiers are truncated.) I posted another question earlier, hoping to see if the limit is defined somewhere. My question here is a little different. Suppose I wanted to test either a practical or enforced limit on identifier name lengths. How would I go about doing so? Here's what I'm thinking of doing, but somehow it seems to be too simple.

Am I approaching this correctly? Will I run out of memory before I "break" the compiler or "runtime"?

Upvotes: 4

Views: 542

Answers (4)

MSalters

Reputation: 180245

Your professor is wrong. § 2.11/1 of the C++ standard says: "All characters are significant". Certainly compilers may impose a limit on the allowed length, as noted in your other question. That doesn't mean they can ignore characters after that.

He's probably confusing C and C++. The two languages have similar but not identical rules. Historically, C had limits as low as six significant characters.

As for your test, there's a far simpler way to test your hypothesis. Note that

int a;
int a;

is illegal, because you define the same identifier twice. Now if ReallyLongNameA and ReallyLongNameB differed only in non-significant characters, then

int ReallyLongNameA;
int ReallyLongNameB;

would also be a compile-time error, because both would declare the same variable. You don't need to run the code. You can just generate test.cpp with those two lines, and try to compile it. So, write a small test program that creates increasingly long identifier names, writes them to test.cpp, and calls system("path/to/compiler -compileroptions test.cpp"); to see if it compiles.
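
A minimal sketch of that generate-and-compile step, in C++. The function names, the 'x' filler character, and the way the compiler command is passed in are assumptions for illustration; the answer itself only describes the idea:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <string>

// Build a translation unit with two declarations whose names share a
// (length - 1)-character prefix and differ only in the final character.
std::string make_test_source(std::size_t length) {
    assert(length >= 2);  // at least one prefix character plus the suffix
    const std::string prefix(length - 1, 'x');
    return "int " + prefix + "A;\n"
           "int " + prefix + "B;\n";
}

// Write the source to `path`, then run `compile_command` on it.
// Assumption: a zero return from std::system means the compile succeeded,
// which holds on common platforms but is implementation-defined.
bool compiles(const std::string& compile_command, const std::string& path,
              std::size_t length) {
    {
        std::ofstream out(path);
        out << make_test_source(length);
    }  // the file is closed here, before the compiler runs
    const std::string command = compile_command + " " + path;
    return std::system(command.c_str()) == 0;
}
```

For example, compiles("g++ -fsyntax-only", "test.cpp", 1024) would check 1024-character names with GCC, assuming g++ is on the PATH. A failed compile alone is ambiguous: read the diagnostic to tell a redefinition error (which would indicate truncation) from a hard limit on identifier length.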

Upvotes: 4

AShelly

Reputation: 35600

I don't think you need to even generate any operations on the variables.

The following code will generate a redefinition error at compilation time:

int name;
int name;

I'd expect you'd get the same error with

int namewithlastsignificantcharacterhere_abc;
int namewithlastsignificantcharacterhere_123;

I'd use a scripting language to generate successively longer names until one of them breaks. Here's a Ruby one-liner:

C:>ruby -e "(1..2048).each{|i| puts \"int #{'variable'*i}#{i};\"}" > var.txt

When I #include var.txt in a C file and compile with VS2008, I get the error

"1>c:\code\quiz\var.txt(512) : fatal error C1064: compiler limit : token overflowed internal buffer"

and 512 * 8 characters is 4096, the limit that JRL cited.
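
For anyone without Ruby at hand, here is a rough C++ equivalent of the one-liner's output, plus the length arithmetic behind the line-512 failure. The function names are hypothetical and this is only a sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of the identifier on line i: i copies of "variable" (8 characters
// each) followed by the decimal digits of i.
std::size_t identifier_length(std::size_t i) {
    return 8 * i + std::to_string(i).size();
}

// The declaration the Ruby one-liner emits for line i (1-based).
std::string declaration_line(std::size_t i) {
    assert(i >= 1);
    std::string name;
    name.reserve(identifier_length(i));
    for (std::size_t k = 0; k < i; ++k) {
        name += "variable";
    }
    name += std::to_string(i);
    return "int " + name + ";";
}
```

Writing declaration_line(i) for i from 1 to 2048 to var.txt, one per line, reproduces the generated file. Note that the trailing number adds a few characters: the identifier on line 511 is 4091 characters long and the one on line 512 is 4099, so a failure first appearing at line 512 brackets a 4096-character limit.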

Upvotes: 5

JRL

Reputation: 78033

For Windows C++:

Only the first 2048 characters of Microsoft C++ identifiers are significant. Names for user-defined types are "decorated" by the compiler to preserve type information. The resultant name, including the type information, cannot be longer than 2048 characters.

Thus it seems you could do a pretty simple test using an MS compiler, at least.

Edit: I didn't do extensive testing, but on my Visual Studio Pro 2008 at least, a variable named aaaa... (total length 4095 characters) compiles, and beyond that (>= 4096 characters) you get Fatal Error C1064: compiler limit : token overflowed internal buffer.

Upvotes: 3

Mysticial

Reputation: 471569

I would assume that if it still works after the length reaches some ridiculous size (like > 1MB), the compiler is probably able to handle arbitrarily sized identifiers.

Of course, there's no sure way to tell, as it is entirely possible for the identifier length limit to exceed the amount of memory you have (a limit of 2^32 - 1 is entirely possible).
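
One way to make "keep going until some ridiculous size" concrete is to double the length until a compile fails or a self-imposed cap is reached, then binary-search the gap. This is only a sketch: the function name, the cap parameter, and the accepts callback (standing in for whatever compile check you use) are assumptions, and it presumes acceptance is monotonic, meaning that once a length fails, all longer ones fail too.

```cpp
#include <cstddef>
#include <functional>

// Returns the largest length in [1, cap] for which accepts(length) is true.
// Returns 0 if even length 1 is rejected, and cap if nothing up to cap fails
// (in which case the real limit, if any, is unknown).
std::size_t largest_accepted_length(
        const std::function<bool(std::size_t)>& accepts, std::size_t cap) {
    if (cap == 0 || !accepts(1)) return 0;
    std::size_t good = 1;  // largest length known to be accepted
    std::size_t bad = 0;   // smallest length known to be rejected (0 = none)

    // Doubling phase: grow the length until a rejection or the cap.
    while (good < cap) {
        const std::size_t next = (good > cap / 2) ? cap : good * 2;
        if (accepts(next)) {
            good = next;
        } else {
            bad = next;
            break;
        }
    }
    if (bad == 0) return good;  // reached the cap without any rejection

    // Binary search phase: narrow the interval between good and bad.
    while (bad - good > 1) {
        const std::size_t mid = good + (bad - good) / 2;
        if (accepts(mid)) {
            good = mid;
        } else {
            bad = mid;
        }
    }
    return good;
}
```

With a cap of about a million characters this needs only a few dozen compiles. A result equal to the cap says nothing beyond "no limit found up to here", which is exactly the uncertainty described above.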

Upvotes: 1
