Reputation: 71
I am having trouble fully understanding generics in Java. I get the basic idea, but I get lost when it comes to more complicated uses. For example, in this code:
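import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenCounterMapper
        extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Split the input line into tokens and emit (token, 1) for each one.
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}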
Why do we have 4 generic type parameters (<Object, Text, Text, IntWritable>)?
What do they mean?
Upvotes: 0
Views: 147
Reputation: 106508
There's nothing terribly complicated about this, but I'm not a fan of this example, so let's use another.
Suppose we wanted to create a data structure that could hold any object. For that, we have a (simplified) List interface:
public interface List {
    boolean add(Object data);
}
This does exactly what I want. Well, sort of.
What happens when I do this? What should happen?
public static void main(String[] args) {
    List list = new ArrayList();
    list.add(10);
    list.add("20");
    list.add(30.0);
    for (int i = 0; i < list.size(); i++) {
        int item = (int) list.get(i); // fails at runtime once it reaches "20"
        System.out.println(item);
    }
}
This is actually valid Java, and will compile just fine (with raw-type warnings at most). The issue here is that I will get a ClassCastException at runtime - I can't cast a String to an int.
This is actually really bad - if I can't ensure type safety at compile time, then it could be quite a while before I understand the kind of casting error I've made. Worse, the issue may not manifest itself for a very long time.
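To see the failure in isolation, here is a minimal sketch using java.util.List as a raw type. The class name RawListDemo and the sum helper are made up for illustration; only the raw List/ArrayList usage mirrors the example above.

```java
import java.util.ArrayList;
import java.util.List;

class RawListDemo {
    // Sums a raw (non-generic) list, assuming every element is an Integer.
    @SuppressWarnings("rawtypes")
    static int sum(List list) {
        int total = 0;
        for (int i = 0; i < list.size(); i++) {
            // The compiler cannot verify this cast; it is only checked at runtime.
            total += (Integer) list.get(i);
        }
        return total;
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        List list = new ArrayList();
        list.add(10);
        list.add("20"); // compiles, because a raw List accepts any Object
        try {
            System.out.println(sum(list));
        } catch (ClassCastException e) {
            System.out.println("Failed at runtime: " + e.getMessage());
        }
    }
}
```

Nothing stops the String from going in; the mistake only surfaces when some later code reads it back out and casts it.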
Generics on a class allow you to bind a specific type to an instance of that class. They also let you reuse code, in the sense that you don't have to create a separate List implementation for String, Integer, Double, Float, and so forth.
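As a rough illustration of the reuse point, here is a sketch of a single generic class serving several element types. Box and BoxDemo are hypothetical names invented for this example, not part of any library.

```java
// Hypothetical single-element container, for illustration only.
class Box<T> {
    private final T value;

    Box(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }
}

class BoxDemo {
    public static void main(String[] args) {
        // One class definition, bound to a different type per instance.
        Box<String> name = new Box<>("twenty");
        Box<Double> price = new Box<>(30.0);

        String s = name.get();  // no cast needed
        double d = price.get(); // Double is unboxed automatically
        System.out.println(s + " " + d);
    }
}
```

Without generics you would either write a StringBox and a DoubleBox by hand, or fall back to Object and cast on every read.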
Let's tweak this interface a bit:
public interface List<T> {
    boolean add(T data);
}
The resultant code is all good news (and no, it shouldn't compile):
public static void main(String[] args) {
    List<Integer> list = new ArrayList<>();
    list.add(10);
    list.add("20"); // compile error: a String is not an Integer
    list.add(30.0); // compile error: a Double is not an Integer
    for (int i = 0; i < list.size(); i++) {
        int item = list.get(i); // no cast needed any more
        System.out.println(item);
    }
}
First, I can get rid of the cast, which was the problem to begin with. Since I've bound the entire list to Integer, I don't need to worry about casting its elements to an int. Second, the error of my ways is made apparent to me at compile time - I'm adding stuff that isn't an Integer to my list! That's most definitely a bug, and it's most definitely quick to fix.
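For comparison, here is a version that does compile, as a small sketch against java.util.List. The class name TypedListDemo and the sum helper are made up for illustration; the two offending calls are left in as comments.

```java
import java.util.ArrayList;
import java.util.List;

class TypedListDemo {
    // Sums a list that the compiler guarantees contains only Integers.
    static int sum(List<Integer> list) {
        int total = 0;
        for (int i = 0; i < list.size(); i++) {
            int item = list.get(i); // no cast; the Integer is unboxed
            total += item;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        list.add(10);
        list.add(30);
        // list.add("20"); // would not compile: String is not an Integer
        // list.add(30.0); // would not compile: Double is not an Integer
        System.out.println(sum(list)); // prints 40
    }
}
```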
Generics can compound themselves even further than this, which is why you often see generics going as high as 3 or 4 type parameters. (If you have any more than that, you should really question if the class is doing too much.) Their purpose is twofold:
Ensure that the programmer doesn't have to do unnecessary casts to get back data that they put in.
Ensure that any errors from inserting incorrect data into the datatype or passing it to a method are caught as early as possible - at compile time.
There's an argument to be had about readability; in my experience, it really depends on what the class is responsible for, as well as on the type parameters (and their bounds).
As a lighter, slightly more intuitive example, explore the Guava Table API (Table<R, C, V>). It's all row-column-value triplets, with the idea being that you have some object to represent a row key, some object to represent a column key, and some object that you actually care to store in the table.
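To make the three-parameter idea concrete without pulling in Guava, here is a sketch of a table built from nested HashMaps. SimpleTable and SimpleTableDemo are hypothetical names; this is not how Guava implements Table, just the same shape of type parameters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a table type; NOT Guava's implementation.
class SimpleTable<R, C, V> {
    private final Map<R, Map<C, V>> rows = new HashMap<>();

    // Stores a value under (row, column); returns the previous value or null.
    V put(R row, C column, V value) {
        return rows.computeIfAbsent(row, r -> new HashMap<>()).put(column, value);
    }

    // Returns the value at (row, column), or null if there is none.
    V get(R row, C column) {
        Map<C, V> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }
}

class SimpleTableDemo {
    public static void main(String[] args) {
        // Row key: String, column key: Integer, value: Double.
        SimpleTable<String, Integer, Double> grades = new SimpleTable<>();
        grades.put("alice", 2023, 3.7);
        Double gpa = grades.get("alice", 2023); // no cast
        System.out.println(gpa);
        // grades.put(2023, "alice", 3.7); // would not compile: keys in the wrong order
    }
}
```

Each type parameter pins down one role, so mixing up the row and column keys becomes a compile error rather than a lookup that silently returns null.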
Upvotes: 1
Reputation: 1668
So that your map() method knows what type of parameter it will receive.
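To spell that out: Hadoop declares the class as Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>, so in the question Object is the input key type, the first Text is the input value type, the second Text is the output key type, and IntWritable is the output value type. That is why map() takes (Object key, Text value) and context.write() takes (Text, IntWritable). Below is a heavily simplified sketch of the same idea with no Hadoop dependency; SimpleMapper, Output, and WordMapper are hypothetical names, and String/Integer stand in for Text/IntWritable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical, heavily simplified stand-in for Hadoop's Mapper; not the real API.
abstract class SimpleMapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
    // Receives output pairs; plays the role of Context.write in this sketch.
    interface Output<K, V> {
        void write(K key, V value);
    }

    // The first two type parameters fix the input types, the last two the output types.
    abstract void map(KEYIN key, VALUEIN value, Output<KEYOUT, VALUEOUT> out);
}

// Input pairs are (Object, String); output pairs are (String, Integer).
class WordMapper extends SimpleMapper<Object, String, String, Integer> {
    @Override
    void map(Object key, String value, Output<String, Integer> out) {
        StringTokenizer itr = new StringTokenizer(value);
        while (itr.hasMoreTokens()) {
            out.write(itr.nextToken(), 1); // emit (word, 1)
            // out.write(1, "word");       // would not compile: wrong output types
        }
    }
}

class WordMapperDemo {
    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        new WordMapper().map(null, "to be or not", (word, count) -> emitted.add(word + "=" + count));
        System.out.println(emitted);
    }
}
```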
Upvotes: 0