sunzhong

Reputation: 43

Java Generics Declaration Placement

Q1: Why are generic declarations for classes placed after the class name, and not before?

class Point<T>{

}

Q2: Why are generic declarations for methods placed before the method name, after the modifiers (such as private static) and before the return type?

private static <T extends Number> T func(T a) {
    return a; // a body is needed for this to compile
}

I am looking for an explanation of this from the perspective of the compilation process or the design rationale.

I've tried looking for an answer in the JDK docs, but did not find anything useful.

Upvotes: 0

Views: 66

Answers (2)

rzwitserloot

Reputation: 103813

Two concerns underlie the choice of syntax for generics:

Backwards compatibility

Generics were added later on; like any language feature added after release v1.0, that means the addition of generics cannot change what's already there. After all, if it did, existing java code would no longer compile on newer JDK compilers, and the OpenJDK team doesn't want that. The aim is that the vast, vast majority of java users can update their JDK and everything just keeps working; no need to recompile your code. But if you want to recompile your code, go ahead: you can do that on a newer version of the compiler and your code will still compile, and the result of that compilation does the same thing as compiling your code on the older javac you wrote it for. That rule is broken from time to time (it isn't a guarantee!) but the principle is: breaking existing code is a serious downside to any language proposal, and such an extraordinary cost should be balanced out by an extraordinary benefit, one that couldn't have been delivered in any other way.

Also, they want the language to remain parsable with an LL(k) parsing model. In other words, java needs to be expressible as a context-free grammar that a parser with bounded lookahead can handle.

Hence:

  • Code without generics has to [A] be parsable, and [B] still 'work'. This is why leaving generics off entirely results in warnings but otherwise maximally permissive code. You can, for example, .add whatever you want to a raw List type because that's how java worked before generics, and 'raw' is shorthand for "I want it to be like how java worked before it had generics". It has to be: all existing java code the day before generics was released was written for "java as it worked before generics", after all.

  • Code without the generics must mean the same thing as it did before the feature was introduced.

  • Code with generics has to remain a context-free grammar; the parser has to be able to parse it with an LL(k) style parser.

  • Somebody somewhere decided that preferably type parameter stuff should follow types. In other words, if it doesn't matter, List<T> is better than <T>List. We have to dip into style arguments here, but that makes sense to me and seems to match how folks talked and thought about parameterized types before the introduction of generics. Point is, as with most 'style fights', the best thing to do is to have somebody pick something and everybody else follow: Even if you prefer style A over style B, everybody prefers (or should prefer) either style over 'everybody uses whatever style they like, codebases mix styles all the time, various competing style guides suggest different styles in different situations... oh, and the lang spec is much more complex now as it needs to always be capable of catering to both styles'. In other words, there is no explanation available or needed for why the java language spec doesn't let you declare it however you want: Having multiple ways to accomplish the same thing is inherently bad. The 'burden of proof' is on having multiple ways, not the other way around.
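To make the first two bullets concrete, here is a minimal sketch of raw versus parameterized use of java.util.List. The class and method names (RawTypeDemo, rawMixed, typedOnly) are made up for illustration; the point is only that the raw version still compiles (with warnings, suppressed here) while the generic version is checked.

```java
import java.util.ArrayList;
import java.util.List;

final class RawTypeDemo {
    // Raw List: behaves like java did before generics, so any object is accepted.
    // javac reports rawtypes/unchecked warnings, which are suppressed here.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List rawMixed() {
        List items = new ArrayList();
        items.add("text");
        items.add(Integer.valueOf(42)); // a different element type, still accepted
        return items;
    }

    // Parameterized List: the element type is checked at compile time.
    static List<String> typedOnly() {
        List<String> items = new ArrayList<>();
        items.add("text");
        // items.add(Integer.valueOf(42)); // would not compile: Integer is not a String
        return items;
    }

    public static void main(String[] args) {
        System.out.println(rawMixed().size());
        System.out.println(typedOnly().size());
    }
}
```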

Declaring a type parameter vs. using one

The <T> in public class MyType<T> and the <T> in void foo(MyType<T> paramName) are completely different. One declares a type parameter, the other uses it.

The syntax is different; upon declaring a type parameter, the only allowed thing inside the <> is a type parameter name, optionally followed by an upper bound (written with extends), and you can specify multiple bounds joined with & if you want. So, you can write class MyType<T>, or you can write class MyType<T extends Serializable & Collection>. But that's all you can do.

Whereas at use site, the stuff in the angle brackets can be an actual type (List<String>), a type parameter (class Foo<T> { List<T> fieldName; }), or even a wildcard with or without bounds (List<? super Number>) - but that List<? extends A & B> syntax isn't legal.
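A small sketch of the two forms side by side may help. Box and UseSiteDemo are hypothetical names chosen for this example; only the JDK types (Number, Serializable, List) are real.

```java
import java.io.Serializable;
import java.util.List;

// Declaration site: T is introduced here, with two bounds joined by &.
final class Box<T extends Number & Serializable> {
    private final T value;

    Box(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }
}

final class UseSiteDemo {
    // Use site: an actual type as the type argument.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    // Use site: a wildcard with a super bound.
    static void addOne(List<? super Integer> sink) {
        sink.add(1);
    }

    // Use site: a wildcard with an extends bound (only one bound is allowed here).
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }
}
```

Box<Integer> is accepted because Integer is both a Number and Serializable; a type that satisfies only one of the two bounds would be rejected at compile time.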

With method declarations it gets a bit tricky, because you can do both! Method declarations have a return type, and that's a type, so, it could use type parameters. But methods are also an execution context, so, methods can declare them too. The parser needs to understand which is which.
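As an illustration, the sketch below has one method that both declares and uses a type parameter, and one that only uses type arguments. Repeater, repeat and labels are invented names for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class Repeater {
    // <T extends Number> declares T; List<T> and the parameter (T value) use it.
    static <T extends Number> List<T> repeat(T value, int times) {
        List<T> out = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            out.add(value);
        }
        return out;
    }

    // Nothing is declared here: List<String> only uses a type argument.
    static List<String> labels(int times) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            out.add("item" + i);
        }
        return out;
    }
}
```

In the first method the declaration sits before the return type, so by the time the reader (or parser) reaches List<T>, T is already known.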

Examples of alternative syntax and why they do not work

Type param decl follows return type

In other words, instead of this, currently valid java code:

private static <T extends Number> List<String> func(T a) {

}

the langspec would have you write this instead:

private static List<String> <T extends Number> func(T a) {

}

This cannot work, as it is ambiguous. That is, it's not ambiguous if the return type has generics (adds some type parameter usage), but it is ambiguous if it does not. Given this:

private static List<T> func() {}

Keep in mind that during parsing, the parser cannot determine what List actually is; it cannot know that java.util.List has been declared as taking a type param. So which is it:

  • List is a raw type or List doesn't have generics at all, and the <T> is a type parameter declaration, -or-

  • The <T> is type parameter use, and is parameterizing the List type?

Given that it's ambiguous, this style is not possible.
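For comparison, this is roughly how the two readings look in the syntax java actually has, where the position of <T> tells them apart. Holder and Readings are hypothetical names, and the first method is non-static because a static method cannot use the class's type parameter.

```java
import java.util.ArrayList;
import java.util.List;

// Reading 1: T is declared elsewhere (here, by the class) and List<T> uses it.
final class Holder<T> {
    List<T> func() {
        return new ArrayList<>();
    }
}

final class Readings {
    // Reading 2: the method declares T, and List is a raw type.
    // The declaration comes before the return type, so there is no doubt
    // that <T> belongs to the method and not to List.
    @SuppressWarnings("rawtypes")
    static <T> List func() {
        return new ArrayList();
    }
}
```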

Type param decl precedes method name

Given that the return type immediately precedes the method name, this boils down to the exact same situation as the previous section: it results in ambiguity and therefore isn't possible.

One obvious way out is to mess with whitespace, i.e. say that as is customary today, no whitespace must exist between a type and its parameterisation whereas the whitespace is mandatory when declaring type params. In other words:

SomeType <T> foo() { .. } // SomeType has no params, <T> declares.
SomeType<T> foo() { .. } // SomeType is parameterized, method isn't.

But that's just not how the java spec works at all. The language defines what tokens are and defines that whitespace in between tokens is irrelevant: Any word-based tokens must obviously have whitespace to separate them (publicstatic is not the same as public static), but other than splitting word-based tokens, whitespace has no meaning whatsoever in the language spec. You'd have to redesign java from the ground up to introduce this feature, and many tools, for example IDEs, would also need to rewrite their parsers pretty much from scratch. That's... an option, but at an extremely high cost; hence, the benefit has to be at least as high to make paying the cost a worthwhile endeavour. It isn't (a judgement call, but an obvious one), so, this syntax is effectively impossible.
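A quick way to see that whitespace carries no meaning around type parameters today is the following sketch. WhitespaceDemo and its methods are made-up names; both methods produce the same token sequence apart from the method name.

```java
import java.util.ArrayList;
import java.util.List;

final class WhitespaceDemo {
    // Conventional spacing.
    static <T> List<T> single(T item) {
        List<T> out = new ArrayList<T>();
        out.add(item);
        return out;
    }

    // The same declaration with unusual spacing; it compiles to the same thing.
    static < T > List < T > singleSpaced ( T item ) {
        List <T> out = new ArrayList <T> ( ) ;
        out . add ( item ) ;
        return out ;
    }
}
```

Because both spellings are already legal and equivalent, giving the space between a type and its <...> a meaning would change how existing code is interpreted.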

Type param decl comes at the end

class Example {
  static List<T> makeNumbers(IntFunction<T> maker)
    throws IOException <T extends Number> {}
}

This is actually possible syntax, but, it's bizarre: Return types in a method can (and often do!) use any type parameters declared by the method, as is the case in the example above. However, the T in List<T> comes as a complete surprise; there is no T declared (yet) when you're reading top-to-bottom, left-to-right, as java code is designed to be read. The declaration comes at the very end. I find it inherently obvious that the above suggestion is a really bad idea. It's a style thing - there is no objective proof to be had. The above syntax could work, but in competition with the syntax we have today it seems like a no-brainer: This syntax is much worse.

There is also a more pragmatic reason the above is problematic: In java you can legally put the array brackets of the return type after the method's parameter list. This is legal java and always has been:

public int thisMethodReturnsAnIntArray() [] {
  return new int[10];
}

Note the brackets at the end there. Hence, if the generics go at the end, and you use this exotic feature nobody ever uses, then it gets very confusing and might not be context-free parsable, depending on how things are set up.

There is an easy fix available - just ban that, nobody uses this style so you can probably 'get away' with doing that. The JDK never makes an absolute guarantee that things remain backwards compatible, and this would be the kind of backwards incompatible break that's plausible to just do.

Type param between method name and argslist

class Example {
  static List<T> makeNumbers<T extends Number>(IntFunction<T> maker)
    throws IOException {}
}

This is possible syntax, and it's not even that bad. It still suffers, but less so, from the use-before-declaration issue discussed in the previous section.

But is it obviously superior to the syntax we have today? Given that it suffers from use-before-declaration I think we can clearly conclude that it is not - it is merely one of a number of alternatives. Which gets us back to: A choice had to be made, and one was made.

Upvotes: 3

Joop Eggen

Reputation: 109613

For a class declaration, placing the type parameter after the class name allows the parameter's bound to refer to the class being declared.

This is done in the Enum base class (for enum) for instance. Not the nicest solution for a type system.

public class Xyz <T extends Xyz> { ... }

public abstract class Enum<E extends Enum<E>>
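The same self-referential pattern can be sketched outside of Enum. Shape and Circle below are hypothetical classes, used only to show a bound that mentions the class being declared and what that buys you (methods in the base class can return the concrete subtype).

```java
// The bound of S mentions Shape, the class currently being declared.
abstract class Shape<S extends Shape<S>> {
    private String label = "";

    // Each subclass returns itself, typed as the subclass.
    abstract S self();

    S label(String text) {
        this.label = text;
        return self();
    }

    String label() {
        return label;
    }
}

final class Circle extends Shape<Circle> {
    private double radius;

    @Override
    Circle self() {
        return this;
    }

    Circle radius(double r) {
        this.radius = r;
        return this;
    }

    double radius() {
        return radius;
    }
}
```

Because label(String) returns S, a call such as new Circle().label("unit").radius(1.0) type-checks without a cast.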

For a method the situation is clear: the result type can use the generic type parameter, hence the type parameter must be declared in front of it. Doing it at the last possible moment (after the modifiers) allows the syntax/grammar of the member declaration simply to be extended, but it is probably just a matter of keeping <...> near the declared name, before or after.

    public static final <T> T f() { ... }

The conclusion is: you are right, two different designs, but for a reason.

Upvotes: 0
