Reputation: 124265
Question inspired by Problem with casting a nested generic Set
Simplified version of example from linked question:
(please don't focus on purpose of below methods, there is none - except for demonstration of the problem)
// class Animal may have subclasses like Dog, Cat, etc.
public static <T extends Animal> void foo() {
    List<List<Animal>> animalGroups = new ArrayList<>();
    List<List<T>> list = (List<List<T>>) animalGroups; // ERROR about incompatible types
}

public static <T extends Animal> void bar() {
    List<Animal> animals = new ArrayList<>();
    List<T> list = (List<T>) animals; // WARNING about unchecked cast
}
In the above examples:
- in the foo method we get a compilation error at (List<List<T>>) animalGroups,
- in the bar method we get only a warning at (List<T>) animals.
Just to be clear, I am NOT asking why those examples can't compile without problems.
I understand that List<Dog> is NOT List<Animal>, so code like
List<Dog> dogs = animals;
can't be allowed to compile because it would break type-safety: we could add a Cat to the list of animals, which means we would add that Cat to the list of dogs (since animals and dogs would refer to the same list).
Also, explicitly casting like
List<Dog> dogs = (List<Dog>) animals;
doesn't "convince" the compiler that we know what we are doing, since we would still have the same hole in type-safety: a Cat can still be added via animals to dogs.
We get a warning instead of an error in the case of (List<T>) animals because there is a chance for such code to work safely, specifically when <T extends Animal> represents the Animal type itself.
So if we call method bar like MyClass.<Animal>bar(); then inside it we end up in a situation like
List<Animal> list = (List<Animal>) animals;
which is fine (the cast would be redundant, but allowed).
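The assumption above can be tried out with a small sketch. It is not the code from the question: the helper name view, the class UncheckedCastDemo and the nested Animal/Dog/Cat classes are made up here, and the list is passed in and returned (unlike bar) so that the effect of the cast is observable.

```java
import java.util.ArrayList;
import java.util.List;

class UncheckedCastDemo {
    static class Animal {}
    static class Dog extends Animal {}
    static class Cat extends Animal {}

    // Hypothetical helper with the same cast as bar(); assumed here for illustration.
    @SuppressWarnings("unchecked")
    static <T extends Animal> List<T> view(List<Animal> animals) {
        return (List<T>) animals; // unchecked cast: a warning, not an error
    }

    // T = Animal: the cast changes nothing and reading elements is safe.
    static boolean safeWhenTIsAnimal() {
        List<Animal> animals = new ArrayList<>();
        animals.add(new Cat());
        List<Animal> same = UncheckedCastDemo.<Animal>view(animals);
        return same == animals && same.get(0) instanceof Cat;
    }

    // T = Dog: the cast itself still succeeds; the failure shows up only
    // when an element is read as a Dog.
    static boolean failsLaterWhenTIsDog() {
        List<Animal> animals = new ArrayList<>();
        animals.add(new Cat());
        List<Dog> dogs = UncheckedCastDemo.<Dog>view(animals);
        try {
            Dog d = dogs.get(0); // the compiler-generated cast to Dog fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeWhenTIsAnimal());
        System.out.println(failsLaterWhenTIsDog());
    }
}
```

With T = Animal nothing goes wrong, while with T = Dog the problem is deferred to the first read, which is what the unchecked warning is hinting at.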
If my assumption is correct, then why does casting (List<List<T>>) animalGroups give us an error instead of a warning?
The same logic should apply:
- <T extends Animal> can take the Animal type itself as its value,
- so List<List<Animal>> list = (List<List<Animal>>) animalGroups; is fine.
BUT since the results are different for (List<T>) animals and (List<List<T>>) animalGroups, then IMO either
- the compiler doesn't treat T in List<List<T>> like it does for List<T> (probably because it is part of an inner generic type), or
- there is some other problem with (List<List<T>>) animalGroups that doesn't exist in (List<T>) animals and I missed it.
Upvotes: 4
Views: 98
Reputation: 103273
This is Java working as intended. Partly that's simply what generics mean, as you evidently already understand: the wonky nature of variance means that allowing List<Animal> animals = listOfDogs; lets you add cats to your list of dogs, and that's no good (why generics are invariant, in a nutshell). And, crucially, it's because generics are a figment of javac's imagination!
Java did not ship with generics. Until java 1.5, generics just weren't a thing. At all. We just wrote:
/**
* Adopts a {@link com.foo.animals.Dog dog}; a dog will be taken from the kennel and added to your list of pets.
* @param pets List of pets.
*/
void adoptDog(List pets) { ... }
i.e. the notion that pets
is to be treated as a list of animals is figured out based on documentation and context clues from the names of things alone, and by 'following the links' - i.e. realizing that Dog
is defined as extends Animal
, therefore one may assume that adopting a few dogs into a newly made list means any code that assumes that pets
contains only instances of Animal
will be fine.
Java 1.5 introduced generics, but generics are 100% an all-javac show; it merely made official what was already there: compiler-checked documentation. Literally: the JVM (java.exe, as defined by the JVM Specification) has no idea what generics are. Javac eliminates most of it; the few places where it makes it into a class file are places java.exe completely ignores. It does what the spec says it should do with them: it knows how to read them (otherwise it wouldn't be able to understand the class file at all), but it just skips right past this stuff.
When you write this:
class Foo<T extends Bar> {
    public <Z extends Bar> void foo() {}
}
The generics do make it into the class file, but the only reason they do, is because javac
can run based on a mix of source files and class files, and it needs to know the generics in those class files to do its magic.
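A rough way to see both halves of this (the signature is recorded, yet running objects carry no type argument) is the sketch below. The class ErasureDemo and the method takesStrings are hypothetical names used only for the demonstration; reflection is used here simply because it happens to expose the same recorded metadata.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Hypothetical method; it exists only so there is a generic signature to inspect.
    static void takesStrings(List<String> list) {}

    // Both lists share one runtime class; the type argument is not part of the object.
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    // The declared parameter type is still recorded in the class file as metadata.
    // Note that the lookup itself uses the erased type, List.class.
    static String declaredParameterType() {
        try {
            Method m = ErasureDemo.class.getDeclaredMethod("takesStrings", List.class);
            return m.getGenericParameterTypes()[0].getTypeName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(declaredParameterType());
    }
}
```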
A bit of code you should play around with:
You cannot make java compile this code:
void foo(String str) {
int foo = str;
}
as in, somehow attempt to assign the memory location/compressed ref that str
is under the hood / at the JVM level, and directly access it as an int
(or these days, on 64-bit arch, a long
perhaps). You can certainly try to manipulate your class file and edit bytecode to do it though:
NB: Not real bytecode, just serves to explain the idea
PUSH // push a string ref from the constant pool
POPI 1 // pop the top of the stack into an int-typed local var slot
But if you try this, the moment that class file is loaded, you would get a VerifyError
. The JVM actually takes some effort (and the JVM Spec demands it do this) to ensure such shenanigans aren't in there. If they are, the entire class file itself is flat out rejected.
In contrast, if you try to pull the precise same stunt with generics, java.exe does not care whatsoever and will gladly run it. That's because javac
-itself- does this. There are no generics at runtime. This:
String foo(List<String> list) {
    return list.get(0).toLowerCase();
}
is compiled to the exact same bytecode as:
String foo(List list) { // raw type - anything goes.
    return ((String) list.get(0)).toLowerCase();
}
Try it! Run javap -c -v CompiledClassFile
to see the bytecode.
The only difference at the bytecode level between those two is one slight change: The fact that the param's type is List<String>
is stored in the classfile; not that the JVM has any clue as to what that might mean. But javac
does, and will take it into account if javac attempts to compile some code that calls this foo
method.
So why is this completely fine and not a security leak, whereas the attempt to 'pop' a string ref into an int variable is so egregiously bad the JVM takes the time to scan the class bytecode, find this, and reject the entire file?
Because javac
--injected-- that cast, and casts are things the JVM itself (java.exe
) knows about, rigorously applies, and this therefore prevents core dumps or other sorts of severe heap corruption; if you finagle your way to call this method with a list containing a non-string, you simply get a ClassCastException
. Let's try it!
List list = new ArrayList(); // raw warning, we'll ignore it.
list.add(Integer.valueOf(5)); // not a string!
foo(list);
The above code will throw a ClassCastException. Weirdly, it is thrown on a line that does not contain any casts at all. That's because javac inserted it. javac
will compile it, and java.exe
's verifier will be totally fine with it.
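For anyone who wants to run the experiment as-is, here is a self-contained sketch. The class name HiddenCastDemo and the wrapper callWithPollutedList are made up for this example, and foo is given a String return type so that it compiles.

```java
import java.util.ArrayList;
import java.util.List;

class HiddenCastDemo {
    // No cast appears in the source; the compiler emits one to String
    // before the call to toLowerCase().
    static String foo(List<String> list) {
        return list.get(0).toLowerCase();
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    static String callWithPollutedList() {
        List list = new ArrayList();   // raw type on purpose
        list.add(Integer.valueOf(5));  // not a String
        try {
            return foo(list);          // compiles, with an unchecked warning
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithPollutedList());
    }
}
```

Running javap -c on the compiled class should show a checkcast instruction inside foo even though the source has none.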
Javac also applies, purely as a compiler move, a check; if you attempt to invoke the foo
method and pass it a List<Integer>
for example, javac
itself will refuse to compile it and emit an error. But that is all - javac simply refuses. It could produce legal bytecode that a JVM will accept and run. If the list was empty, you'd never know it was broken. If it's not empty, well, that ClassCastException occurs which isn't a security or heap corruption issue. Exceptions are 'safe', they don't cause core dumps, memory corruption, potential avenues for buffer overruns, etcetera. If you patch javac to not complain, then the byte code produced would be fine (likely to throw an exception if you invoke it, but that's fine).
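The "you'd never know it was broken" part can be sketched as follows. The names SilentPollutionDemo, totalLength and run are assumptions made up for this example, and a raw list stands in for the List<Integer> that javac would refuse to pass directly.

```java
import java.util.ArrayList;
import java.util.List;

class SilentPollutionDemo {
    // Hypothetical consumer of a List<String>.
    static int totalLength(List<String> list) {
        int total = 0;
        for (String s : list) { // the compiler-generated cast to String happens per element
            total += s.length();
        }
        return total;
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    static String run(boolean addElement) {
        List list = new ArrayList();          // raw, so javac only warns
        if (addElement) {
            list.add(Integer.valueOf(5));     // wrong element type
        }
        try {
            return "total=" + totalLength(list);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // empty list: nothing is ever cast
        System.out.println(run(true));  // one Integer: the cast fails
    }
}
```

With an empty list no element is ever cast, so the mistake goes unnoticed; with one Integer in it the result is an ordinary exception rather than anything worse.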
You wrote that your assumption "is that compiler allows casting in case where there is a chance for such code to work safely."
As far as generics are concerned, that's correct, but it doesn't explain what you're witnessing here. That's just a convenience to you. Why let you write code that makes no sense? That's just enabling you to write bugs.
(List<List<T>>) someListOfListsOfAnimals // compiler error
Yeah, that's a weird one, isn't it? You can always force the issue. This compiles:
List<List<Animal>> a = new ArrayList<>();
List /* raw */ temp = a;
List<List<T>> b = (List<List<T>>) temp;
We can even combine those casts into this monster:
List<List<T>> a = (List<List<T>>) listOfListsOfAnimals; // error!
List<List<T>> a = (List<List<T>>) (List) listOfListsOfAnimals; // warning!
That second line, whilst looking pretty stupid, actually 'works' (compiles; it's still a warning; anytime you tell javac to buzz off about generics you get a warning because javac is the first and last line of defense on these things; if javac is told not to care about it, nothing will, hence, javac tells you: Ooookay mate, you're on your own then, here, a warning to make doubly sure you fully understand your promises on this stuff is the only thing stopping some very bizarro bugs from showing up in your code base!).
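A compilable sketch of that raw-type detour, wrapped in a generic method so that T exists. RawDetourDemo, viaRaw and the nested Animal/Dog classes are hypothetical names for this example only.

```java
import java.util.ArrayList;
import java.util.List;

class RawDetourDemo {
    static class Animal {}
    static class Dog extends Animal {}

    @SuppressWarnings({"rawtypes", "unchecked"})
    static <T extends Animal> List<List<T>> viaRaw(List<List<Animal>> groups) {
        // A direct (List<List<T>>) groups is rejected with "incompatible types".
        return (List<List<T>>) (List) groups; // compiles, with an unchecked warning
    }

    // The cast does nothing at run time: both variables refer to the same object.
    static boolean sameObject() {
        List<List<Animal>> groups = new ArrayList<>();
        groups.add(new ArrayList<>());
        List<List<Dog>> dogGroups = RawDetourDemo.<Dog>viaRaw(groups);
        Object a = groups;
        Object b = dogGroups;
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameObject());
    }
}
```

The comparison goes through Object because javac would also refuse to compare the two parameterized types directly with ==.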
So, why? It's mostly 'because the spec says so'. "Why does the spec say so"? Because... it does. At some point you're asking what the authors of that spec were thinking when they wrote it, which, given that they didn't maintain an official diary, is unanswerable except possibly by the authors themselves, who do not, as far as I know, frequent Stack Overflow.
One can guess, perhaps. With 'raw' it is allowed because 'raw' mode allows anything, that's sort of the point of raw mode. Looking solely at what's in the generics, you're trying to cast List<Animal>
to List<T>
and those types have no relationship at all (because generics are invariant, hence, these are as different from each other as Integer
and String
are - neither is a supertype of the other). The spec simply says that casting siblings (unrelated types) to each other isn't allowed. So, for the same line of spec that dictates this:
Integer i = ...;
String s = (String) i;
is to be considered a compiler error because it's nonsensical, your cast is also flagged as a straight error. Here it's less clear-cut: even though the types are nominally 100% unrelated, it sure feels utterly bizarre that List<List<T>>, where T is defined as T extends Animal, and List<List<Animal>> are completely unrelated and therefore one cannot be cast to the other like this.
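The Integer/String case has the same escape hatch as the raw-type detour: go through a common supertype and the compiler lets it pass, leaving the check to the JVM. A small sketch, with SiblingCastDemo and castViaObject as made-up names:

```java
class SiblingCastDemo {
    static String castViaObject(Integer i) {
        // String s = (String) i;  // rejected by javac: incompatible types
        Object o = i;              // widening to a common supertype is always allowed
        try {
            return (String) o;     // compiles; the JVM checks it at run time
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(castViaObject(42));
    }
}
```

The difference from the generic case is that here the JVM can actually check the cast, whereas with List<List<T>> there is nothing left at run time to check against.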
Upvotes: 2