Spikatrix

Reputation: 20244

Why does GCC emit a warning when using trigraphs, but not when using digraphs?

Code:

#include <stdio.h>

int main(void)
{
  ??< puts("Hello Folks!"); ??>
}

The above program, when compiled with GCC 4.8.1 with -Wall and -std=c11, gives the following warning:

source_file.c: In function ‘main’:
source_file.c:8:5: warning: trigraph ??< converted to { [-Wtrigraphs]
     ??< puts("Hello Folks!"); ??>
 ^
source_file.c:8:30: warning: trigraph ??> converted to } [-Wtrigraphs]

But when I change the body of main to:

<% puts("Hello Folks!"); %>

no warnings are emitted.

So why does the compiler warn me when I use trigraphs, but not when I use digraphs?

Upvotes: 7

Views: 2897

Answers (4)

supercat

Reputation: 81115

Trigraphs are nasty because they use character sequences that can legitimately appear in valid code. A common case that used to cause compiler errors in code for the classic Macintosh:

unsigned int signature = '????';  /* Should be value 0x3F3F3F3F */

Trigraph processing would turn that into:

unsigned int signature = '??^;  /* Should be value 0x3F3F3F3F */

which would of course not compile. In some slightly rarer cases, it would be possible for such processing to yield code which would compile, but with different meaning from what was intended, e.g.

char *template = "????/1234";

which would get turned into

char *template = "??S4"; // ??/ becomes \, and \123 becomes S

Not the string literal that was intended, but still perfectly legitimate nonetheless.
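For what it's worth, here is a minimal sketch of two spellings that should keep the intended text whether or not the compiler processes trigraphs. It relies on the standard \? escape and on adjacent string literals being joined only after trigraph replacement. The names template_escaped, template_split and templates_match are made up for this illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>

/* "\?" is an ordinary escape for '?'. Placing it before the slash
   breaks up the pair of question marks that would otherwise start
   a trigraph. */
static const char template_escaped[] = "???\?/1234";

/* Adjacent literals are concatenated after trigraph replacement has
   already run, so splitting the run of question marks also works. */
static const char template_split[] = "???" "?/1234";

/* Nine characters plus the terminating NUL in both spellings. */
static_assert(sizeof template_escaped == 10, "escaped spelling changed length");
static_assert(sizeof template_split == 10, "split spelling changed length");

/* Returns 1 if both spellings hold four question marks, a slash, 1234. */
int templates_match(void)
{
    /* Spelled character by character so the check does not depend on
       either technique being tested. */
    const char expected[] = { '?', '?', '?', '?', '/', '1', '2', '3', '4', '\0' };
    return strcmp(template_escaped, expected) == 0
        && strcmp(template_split, expected) == 0;
}
```

Calling templates_match() from main should return 1 both with -std=c11 (trigraphs on in GCC) and with GCC's default GNU mode (trigraphs off).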

By contrast, digraphs are relatively benign because outside of some possible weird corner cases involving macros, no code containing processable digraphs would have a legitimate meaning in the absence of such processing.

Upvotes: 3

Shafik Yaghmour

Reputation: 158449

This gcc document on pre-processing gives a pretty good rationale for a warning (emphasis mine):

Trigraphs are not popular and many compilers implement them incorrectly. Portable code should not rely on trigraphs being either converted or ignored. With -Wtrigraphs GCC will warn you when a trigraph may change the meaning of your program if it were converted.

and this gcc document on Tokenization explains that digraphs, unlike trigraphs, do not have potential negative side effects (emphasis mine):

There are also six digraphs, which the C++ standard calls alternative tokens, which are merely alternate ways to spell other punctuators. This is a second attempt to work around missing punctuation in obsolete systems. It has no negative side effects, unlike trigraphs, but does not cover as much ground.

Upvotes: 5

Jens

Reputation: 72619

Because trigraphs have the undesirable effect of silently changing code: the same source file can be valid both with and without trigraph replacement, yet compile to different code. This is especially problematic in string literals, like "<em>What??</em>", where ??< is replaced by {.

Language design and language evolution should strive to avoid silent changes. Having the compiler warn about trigraphs is a good thing.

Contrast this with digraphs, which were new tokens that do not lead to silent changes.
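To make the silent change visible, here is a rough runtime sketch that applies the nine standard trigraph replacements to a string, the way translation phase 1 would apply them to source text. simulate_trigraphs is a hypothetical helper written for this illustration, not something GCC or the standard library provides.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative helper (not a compiler or library API): copies `in` to
   `out`, replacing each trigraph with the character it stands for.
   `out` must have room for strlen(in) + 1 bytes.
   Returns the number of trigraphs replaced. */
size_t simulate_trigraphs(const char *in, char *out)
{
    /* Third character of each trigraph, and the matching replacement. */
    static const char third[]  = "=(/)'<!>-";
    static const char single[] = "#[\\]^{|}~";
    size_t replaced = 0;

    while (*in != '\0') {
        const char *hit = NULL;

        /* A trigraph is two question marks plus one of the nine
           characters listed in `third`. */
        if (in[0] == '?' && in[1] == '?' && in[2] != '\0')
            hit = strchr(third, in[2]);

        if (hit != NULL) {
            *out++ = single[hit - third];
            in += 3;
            ++replaced;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return replaced;
}
```

When calling it, spell the input with the \? escape (for example "<em>What?\?</em>") so the compiler itself cannot replace the trigraph first. For that input the helper writes <em>What{/em> and returns 1.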

Upvotes: 6

taliezin

Reputation: 916

Maybe because digraphs have no negative side effects, unlike trigraphs, as stated in the gcc documentation:

Punctuators are all the usual bits of punctuation which are meaningful to C and C++. All but three of the punctuation characters in ASCII are C punctuators. The exceptions are ‘@’, ‘$’, and ‘`’. In addition, all the two- and three-character operators are punctuators. There are also six digraphs, which the C++ standard calls alternative tokens, which are merely alternate ways to spell other punctuators. This is a second attempt to work around missing punctuation in obsolete systems. It has no negative side effects, unlike trigraphs, but does not cover as much ground. The digraphs and their corresponding normal punctuators are:

 Digraph:        <%  %>  <:  :>  %:  %:%:
 Punctuator:      {   }   [   ]   #    ##
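As a small sketch of the table above, the following uses each digraph as a punctuator and also shows that a digraph inside a string literal is left alone, since digraphs are tokens and not text replacements. The names COUNT, sum_with_digraphs and digraph_text_length are invented for this example.

```c
#include <string.h>

/* %: is the digraph spelling of #, so this is an ordinary #define. */
%:define COUNT 3

/* Sums a small array, with every brace and bracket spelled as a digraph. */
int sum_with_digraphs(void)
<%
    int values<:COUNT:> = <% 1, 2, 3 %>;
    return values<:0:> + values<:1:> + values<:2:>;
%>

/* Inside a string literal the same characters stay exactly as written,
   so this string is four characters long. */
size_t digraph_text_length(void)
{
    return strlen("<%%>");
}
```

With GCC this should compile without any digraph-related warning under -Wall; sum_with_digraphs() returns 6 and digraph_text_length() returns 4.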

Upvotes: 4
