vxs8122

Reputation: 844

Need clarification on the definition of C tokens

From K&R's "The C Programming Language":

There are six classes of tokens: identifiers, keywords, constants, string literals, operators, and other separators. Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments as described below (collectively, "white space") are ignored except as they separate tokens.

What is meant by "other separators"?

Suppose we are given the statement:

result = (4 * b - a * b) / 3;

So by the definition, result, a, and b should be identifiers, and =, (, ), *, /, and - should be operators. What about the semicolon, ;? Is it considered a token and if so, what category does it fall into?

Also, is white space considered one of the "other separators"?

Upvotes: 4

Views: 2119

Answers (3)

haccks

Reputation: 106102

What are separators?
Anything that can be used to separate tokens, for example the comma (,) in:

int a, b, c;

An operator can also act as a separator:

a = b*c;

* is an arithmetic operator, and it is also a separator: it separates the two identifiers b and c during tokenization.

What about the semicolon, ;? Is it considered a token and if so, what category does it fall into?

; is also a separator. It marks the end of one statement, and so separates that statement's tokens from those of the next.
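As a rough illustration of the point above, here is a small sketch in which the same statement is written twice, once with white space between every token and once with white space only where two tokens would otherwise run together. The function names are made up for this example.

```c
#include <assert.h>

/* Written with white space between every token. */
int product_spaced(int b, int c)
{
    int a ;
    a = b * c ;
    return a ;
}

/* Same tokens, with white space only where two tokens would otherwise
   merge (e.g. "int a" must not become the single identifier "inta").
   The punctuators = * ; ( ) , { } separate the other tokens themselves. */
int product_compact(int b,int c){int a;a=b*c;return a;}

/* Sanity check: both spellings produce the same value. */
void check_same_tokens(void)
{
    assert(product_spaced(6, 7) == product_compact(6, 7));
}
```

Both functions are made of exactly the same tokens, so a compiler should treat them identically.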

Upvotes: 3

interjay

Reputation: 110174

This distinction between operators and other separators is something that existed in old versions of C, but has been removed.

The C89 standard gives this listing of operators and punctuators (punctuators being what K&R calls "other separators"):

operator: one of
        [  ]  (  )  .  ->
        ++  --  &  *  +  -  ~  !  sizeof
        /  %  <<  >>  <  >  <=  >=  ==  !=  ^  |  &&  ||
        ?  :
        =  *=  /=  %=  +=  -=  <<=  >>=  &=  ^=  |=
        ,  #  ##

punctuator: one of
        [  ]  (  )  {  }  *  ,  :  =  ;  ...  #

An operator is defined as something that specifies an operation to be performed, while a punctuator has syntactic significance but does not specify an operation that yields a value.

Note that ( ) [ ] can be either operators (when used in an expression) or punctuators (e.g. in a function or array declaration).
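To make the two roles concrete, here is a short sketch using the C89 terminology. The function names are invented for the example; the comments mark which role each symbol plays on that line.

```c
#include <assert.h>

/* ( ) as punctuators: they delimit the parameter list of a declarator. */
static int twice(int x)
{
    return 2 * x;
}

int paren_and_bracket_roles(void)
{
    /* [ ] as punctuators in an array declarator;
       { } , ; and the initializer's = are punctuators too. */
    int values[3] = { 10, 20, 30 };

    /* * as a punctuator in a pointer declarator. */
    int *p = values;

    /* [ ] as the subscript operator, ( ) as the function-call operator,
       and * as the indirection operator, all inside an expression. */
    int sum = values[1] + twice(values[2]) + *p;

    assert(sum == 90);  /* 20 + 60 + 10 */
    return sum;
}
```

The same characters appear in both halves of the function; only the context (declaration versus expression) decides which C89 category they fell into.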

The C99 standard removes this unnecessary distinction and calls all these symbols "punctuators".

Regarding white space: it is not a token, so it is neither an operator nor a punctuator.

Upvotes: 3

Lundin

Reputation: 214730

That book is ancient. The C standard uses different terms/groups nowadays. C11 Annex A.1.1:

(6.4) token:
  keyword
  identifier
  constant
  string-literal
  punctuator

Details about the above are found in chapter 6.4. Though if you continue to read the very same (mildly interesting) annex, you'll see this:

A.1.7 Punctuators
(6.4.6) punctuator: one of
  [ ] ( ) { } . ->
  ++ -- & * + - ~ !
  / % << >> < > <= >= == != ^ | && ||
  ? : ; ...
  = *= /= %= += -= <<= >>= &= ^= |=
  , # ##
  <: :> <% %> %: %:%:

If you are interested in things like these (they are far from essential knowledge even for a veteran C programmer, unless you are making a compiler), I would suggest that you download a draft version of the standard and read annex A.
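To tie this back to the statement in the question, here is a minimal tokenizer sketch that sorts tokens into the C11 categories. It is deliberately incomplete and not how a real compiler does it: it assumes only identifiers, decimal integer constants, and single-character punctuators appear, and it ignores keywords, string literals, comments, and multi-character punctuators such as -> or <<=. All names in it (token_kind, next_token, count_tokens) are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token categories, following the C11 grammar quoted above.
   Keywords and string literals are left out to keep the sketch short. */
enum token_kind { TOK_IDENTIFIER, TOK_CONSTANT, TOK_PUNCTUATOR, TOK_END };

struct token {
    enum token_kind kind;
    const char *start;  /* first character of the token inside the source */
    size_t length;      /* number of characters in the token */
};

/* Hypothetical helper: reads one token from *cursor and advances it.
   White space is skipped, never returned as a token. */
struct token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    while (isspace((unsigned char)*p))
        p++;                        /* white space only separates tokens */

    struct token t = { TOK_END, p, 0 };
    if (*p == '\0') {
        *cursor = p;
        return t;
    }

    if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENTIFIER;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_CONSTANT;      /* decimal integer constants only */
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        t.kind = TOK_PUNCTUATOR;    /* e.g. = ( ) * - / ; */
        p++;
    }

    t.length = (size_t)(p - t.start);
    *cursor = p;
    return t;
}

/* Counts the tokens of one kind in a source string. */
size_t count_tokens(const char *source, enum token_kind kind)
{
    size_t n = 0;
    for (struct token t = next_token(&source); t.kind != TOK_END;
         t = next_token(&source))
        if (t.kind == kind)
            n++;
    return n;
}
```

For "result = (4 * b - a * b) / 3;" this counts 4 identifiers (result, b, a, b), 2 constants (4, 3), and 8 punctuators (= ( * - * ) / ;), so the semicolon is a token and falls under punctuator. Removing all the spaces from the input gives the same counts.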

Upvotes: 1
