Bohdan

Reputation: 17211

Why can't variable names in Java have the same names as keywords?

In most programming languages that I know, you cannot declare a variable whose name is also a keyword.

For example in Java:

public class SomeClass
{
    Class<?> clazz = Integer.class; // OK.
    Class<?> class = Integer.class; // Compilation error.
}

But it's very easy to figure out what is what. Humans reading it will not confuse a variable name with a class declaration, and the compiler will most likely not confuse them either.

The same goes for variable names like 'for', 'extends', 'goto' or any other Java keyword, if we are talking about the Java programming language.

What is the reason that we have this limitation?

Upvotes: -4

Views: 997

Answers (3)

david.pfx

Reputation: 10863

You're not quite right there. A key word is a word that has meaning in the syntax of the language, and a reserved word is one that you're not allowed to use as an identifier. In Java mostly they are the same, but 'true' and 'goto' are reserved words and not key words ('true' is a literal and 'goto' is not used).

The main reason to make the key words in a language reserved words is to simplify parsing and avoid ambiguities. For example, what does this mean if return could be a method?

return(1);

In my opinion, Java has taken this too far. There are key words that are only meaningful in a particular context in which there could be no ambiguity. Perhaps there is benefit in avoiding confusion on the part of the reader, but I put it down to customary habit of compiler writers. There are other languages which have far fewer key words and/or reserved words and work just fine.

Upvotes: 1

user2864740

Reputation: 61975

It allows the lexer to classify symbols without having to disambiguate context - this in turn allows the language to be parsed according to grammar rules without needing knowledge about other ("higher") parts of the compilation process, including analysis of types.
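A minimal sketch of that idea, assuming a made-up `KeywordLexer` class and a deliberately tiny subset of Java's reserved words: the classification is a plain set lookup on the word itself, with no symbol table and no knowledge of where the word appears.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not how javac is implemented.
class KeywordLexer {
    // A small, illustrative subset of Java's reserved words (not the full list).
    static final Set<String> RESERVED =
            Set.of("class", "final", "for", "return", "extends", "goto");

    // Classify one word using nothing but the word itself.
    static String classify(String word) {
        return RESERVED.contains(word) ? "KEYWORD" : "IDENTIFIER";
    }

    // Classify every whitespace-separated word of a source fragment.
    static List<String> classifyAll(String source) {
        List<String> kinds = new ArrayList<>();
        for (String word : source.trim().split("\\s+")) {
            kinds.add(classify(word));
        }
        return kinds;
    }

    public static void main(String[] args) {
        // prints [KEYWORD, IDENTIFIER, IDENTIFIER]
        System.out.println(classifyAll("final Foo x"));
    }
}
```

Because `classify` never needs to ask "is there a type with this name in scope?", the parser can run on the token kinds alone.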

As an example of the complications (and ambiguity) that removing such a distinction adds to parsing, consider the following. Under standard Java rules it declares and assigns a variable - there is no ambiguity about how it will be parsed.

final Foo x = 2;   // roughly: <keyword> <identifier> <identifier> = <value>

Now, in a hypothetical language without a strict keyword distinction, imagine the following, where final may be a declared type; there are now two possible readings. The first is when final is not a type and the standard reading exists:

final Foo = 2;     // roughly: <keyword> <identifier> ?error? = <value>

But if final was a "final type", then the reading may be:

final Foo = 2;     // hypothetical: <identifier> <identifier> = <value>

Which interpretation of the source is correct?

Java makes this question even harder to answer due to separate compilation. Should adding a new "final type" in (or accidentally importing) a namespace now change how the code is parsed? Reporting an unresolved symbol is one thing - changing how the grammar is parsed based on such resolution is another.

These sorts of issues are simply bypassed with the clear distinction of reserved words.
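To make the problem concrete, here is a rough sketch of the hypothetical language, assuming an invented rule that `final` stops being a keyword whenever a type named `final` is in scope. The class `ContextualFinal` and its `read` method are made up for illustration, and the sketch only handles statements that start with `final`.

```java
import java.util.Set;

// Hypothetical: models a language where "final" is only a keyword
// when no type called "final" is visible. Java does NOT work this way.
class ContextualFinal {
    static String read(String statement, Set<String> typesInScope) {
        // Words to the left of '=', e.g. ["final", "Foo"] or ["final", "Foo", "x"].
        String[] words = statement.split("=")[0].trim().split("\\s+");
        boolean finalIsType = words[0].equals("final") && typesInScope.contains("final");
        if (finalIsType) {
            // Reading: <type> <name> = <value>
            return words.length == 2
                    ? "declares variable '" + words[1] + "' of type final"
                    : "syntax error";
        }
        // Reading: <keyword> <type> <name> = <value>
        return words.length == 3
                ? "declares final variable '" + words[2] + "' of type " + words[1]
                : "syntax error";
    }

    public static void main(String[] args) {
        // Same source text, two different parses depending on what is in scope.
        System.out.println(read("final Foo = 2;", Set.of()));        // syntax error
        System.out.println(read("final Foo = 2;", Set.of("final"))); // declares variable 'Foo' of type final
    }
}
```

The parse result now depends on `typesInScope`, which is exactly the kind of information a parser does not normally have.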


Arguably, there could be special productions to change the recognition of keywords dynamically (some languages allow controllable operator precedence), but this is not done in mainstream languages and is most certainly not supported in Java. At the very least it requires additional syntax and adds complexity to the system for not-enough benefit.

The most "clean" approach I've seen to such a problem is in C#, which allows one to prefix reserved words with @ to remove their special meaning, such as class @class { float @int = 2; } - although such should be done rarely, and ick!

Now, some words in Java that are reserved could be "reserved only in context", such as extends. Such is seen in SQL all the time; there are reserved words (e.g. OVER) and then words that only have special meaning in a given statement construct (e.g. ROW_NUMBER). But it's easier to say reserved is reserved, go pick something else.
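For what it's worth, words added to Java in later releases do follow the "reserved only in context" route rather than becoming fully reserved. A small sketch, assuming Java 10 or newer (the class name `ContextualWords` is made up):

```java
// Assumes Java 10+; on older compilers 'var' is not available as a type placeholder.
class ContextualWords {
    static int sum() {
        var var = 1;      // 'var' is a reserved type name, not a keyword, so it still works as a variable name
        int module = 2;   // 'module' only has special meaning inside module-info.java
        int requires = 3; // likewise 'requires'
        return var + module + requires;
    }

    public static void main(String[] args) {
        System.out.println(sum()); // prints 6
    }
}
```

By contrast, `int extends = 3;` is still rejected, because `extends` has been fully reserved since the first version of the language.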

Except in very simple-to-parse languages like LISP dialects, which effectively treat every bareword as an identifier, the distinction between keywords and identifiers is very prevalent in language grammars.

Upvotes: 1

Stephen C

Reputation: 719376

What is the reason that we have this limitation?

There are two reasons in general:

  • As you identified in your Question: it would be extremely confusing for the human reader. And a programming language that is confusing by design is not going to get significant traction as a practical programming language.

  • If identifiers can be the same as keywords, it makes it much more difficult to write a formal grammar for the language. (Certainly, a grammar like that with the rules for disambiguation cannot be expressed in BNF / EBNF or similar.) That means that writing a parser for such a language would be a lot more complicated.

Anyhow, while neither of these reasons is a total "show stopper", they would be sufficient to cause most people attempting a new programming language design / implementation to reject the idea.

And that of course is the real reason that you (almost) never see languages where keywords can be used as identifiers. Programming language designers nearly always reject the idea ...

(In the case of Java, there was a conscious effort to make the syntax accessible to people used to the C language. C doesn't support this. That would have been a 3rd reason ... if they were looking for one.)


There is one interesting (semi-) counter example in a mainstream programming language. In early versions of FORTRAN, spaces in identifiers were not significant. Thus

    I J = 1

and

    IJ = 1

meant the same thing. That is cool (depending on your "taste" ...). But compare these two:

    DO 20 I = 10, 1, -2

versus

    DO 20 I = 10

One is an assignment, but the other one is a "DO loop" statement. As a reader, would you notice this?
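A rough sketch of what a reader (or compiler) has to do to tell these apart, written in Java since that is the language of the question. This is a simplified heuristic and not a real FORTRAN parser: the class `FortranDo` is made up, and the rule "a comma after the `=` means DO loop" would be fooled by commas inside parentheses.

```java
// Hypothetical classifier for the two statements above; a heuristic, not a FORTRAN grammar.
class FortranDo {
    static String classify(String statement) {
        // Blanks are not significant, so drop them before looking at anything else.
        String s = statement.replace(" ", "").toUpperCase();
        // DO<label><variable>=<start>,<end>[,<step>] : the comma is what makes it a loop.
        if (s.matches("DO\\d+[A-Z][A-Z0-9]*=[^,]+,.+")) {
            return "DO loop";
        }
        // Otherwise everything before '=' is simply a variable name.
        if (s.matches("[A-Z][A-Z0-9]*=.+")) {
            return "assignment to " + s.substring(0, s.indexOf('='));
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("DO 20 I = 10, 1, -2")); // DO loop
        System.out.println(classify("DO 20 I = 10"));        // assignment to DO20I
    }
}
```

Note that the decision cannot be made until the scanner has read past the `=`, which is the opposite of the "classify each word on sight" property that reserved words give you.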

Upvotes: 2
