kshitiz ghimire

Reputation: 61

Why can't we use a hyphen when declaring a structure variable name?

struct birthday{ int day:6; }b-day;

When declaring b-day as a structure variable, the compiler shows the following error:

error: expected ':', ',', ';', '}' or '__attribute__' before '-' token

After removing the hyphen from the variable name it works. Why?

Upvotes: 1

Views: 6430

Answers (4)

John Bode

Reputation: 123598

The boring answer is that the language definition doesn't allow - to be part of an identifier (variable name, function name, typedef name, enumeration constant, tag name, etc.).

Why that's the case probably boils down to a couple of things:

At the preprocessing stage, your source text is broken up into a sequence of tokens - identifiers, punctuators, string literals, and numeric constants. Whitespace is not significant except that it separates tokens of the same type. If you write a=b+c;, the compiler sees the sequence of tokens identifier (a), punctuator (=), identifier (b), punctuator (+), identifier (c), and punctuator (;). This is before it does any syntax analysis - it's not looking at the meaning or the structure of that statement, it's just breaking it down into its component parts.

It can do this because the characters = and + and ; can never be part of an identifier, so it can clearly see where identifiers begin and end [1].

The tokenizer is "greedy" and will build the longest valid token it can. In a declaration like

int a;

you need the whitespace to tell the preprocessor that int and a are separate tokens, otherwise it will try to mash them together into a single token inta. Similarly, in a statement like a=b- -c;, you need that whitespace (or parentheses, a=b-(-c);) to signify you're subtracting -c from b, otherwise the tokenizer will interpret it as a = b-- c, which isn't what you want.

So, if a - could be part of an identifier, how should x=a-b+c be tokenized? Is a-b a single token or three? How would you write your tokenizer such that it could keep track of that? Would you require whitespace before and after - to signify that it's an operator and not part of a variable?

It's certainly possible to define a language that allows - to be both an operator and part of an identifier (see COBOL), but it adds complexity to the tokenizing stage of compiling, and it's just plain easier to not allow it.


  [1] Incidentally, this is why there's no difference between T *p; and T* p; when declaring pointer variables - the * can never be part of an identifier, so whitespace isn't necessary to separate the type from the variable name. You could write it as T*p; or even T * p; and it will be treated exactly the same.

Upvotes: 1

Vlad from Moscow

Reputation: 311146

It is because the symbol '-' may not be used in an identifier. When it appears between other tokens, the compiler treats it as the binary or unary minus operator, depending on the context.

The error message

error: expected ':', ',', ';', '}' or '__attribute__' before '-' token

means that the compiler takes b as the complete variable name and then finds an unexpected '-'. In effect it tries to read the declaration as something like

    struct birthday{
        int day:6;
    }b; -day;

You could declare the structure like

    struct birthday{
        int day:6;
    } b_day;

that is using the underscore symbol instead of the hyphen symbol.

Upvotes: 0

MikeCAT

Reputation: 75062

Because C doesn't allow hyphens in identifier names.

Basically you can only use letters, digits, and underscores. Also, a digit is not allowed as the first character.

Quote from N1570 6.4.2 Identifiers:

Syntax

identifier:
    identifier-nondigit
    identifier identifier-nondigit
    identifier digit

identifier-nondigit:
    nondigit
    universal-character-name
    other implementation-defined characters

nondigit: one of
    _ a b c d e f g h i j k l m
      n o p q r s t u v w x y z
      A B C D E F G H I J K L M
      N O P Q R S T U V W X Y Z

digit: one of
    0 1 2 3 4 5 6 7 8 9

Upvotes: 2

Eric Postpischil

Reputation: 224546

Hyphens are used as subtraction and negation operators, so they cannot be used in variable names. (Whether the variable is for a structure or another type is irrelevant.)

If you had:

int a = 1;
int b = 2;
int a-b = 3;
printf("%d\n", a-b);

then we would have ambiguity about whether to print “-1” for a minus b or to print “3” for the variable a-b.

Upvotes: 5
