Hobowpen

Reputation: 71

Generics in Java

I am having trouble fully understanding generics in Java. I get the basic idea, but I lose track when it comes to more complicated uses. For example, in this code:

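import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenCounterMapper
    extends Mapper<Object, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  // Emits (token, 1) for every whitespace-separated token in the input value.
  @Override
  public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);
    }
  }
}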

Why do we have 4 generic type parameters (<Object, Text, Text, IntWritable>)? What do they mean?

Upvotes: 0

Views: 147

Answers (2)

Makoto

Reputation: 106508

There's nothing terribly complicated about this, but I'm not a fan of this example, so let's use another.

Suppose we wanted to create a data structure that could hold any object. For that, we have a simplified, pre-generics version of the List interface:

public interface List {
    boolean add(Object data);
    Object get(int index);   // used by the loop in the example that follows
    int size();
}

This does exactly what I want. Well, sort of.

What happens when I do this? What should happen?

public static void main(String[] args) {
    List list = new ArrayList();   // raw type: accepts any Object

    list.add(10);
    list.add("20");
    list.add(30.0);

    for(int i = 0; i < list.size(); i++) {
        int item = (int) list.get(i);   // throws ClassCastException when i == 1
        System.out.println(item);
    }
}

This is actually valid Java, and it will compile (with nothing more than a raw-type warning). The issue here is that I will get a ClassCastException at runtime, as soon as the loop reaches "20" - I can't cast a String to an int.

This is actually really bad - if I can't ensure type safety at compile time, then it could be quite a while before I understand the kind of casting error I've made. Worse, the issue may not manifest itself for a very long time.

Generics on a class allow you to bind a specific type to an instance of that class. They also allow you to reuse code, in the sense that you don't have to write a separate List implementation for String, Integer, Double, Float, and so forth.

Let's tweak this interface a bit:

public interface List<T> {
    boolean add(T data);
    T get(int index);   // returns T, so callers no longer need a cast
    int size();
}

The resultant code is all good news (and no, it shouldn't compile):

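public static void main(String[] args) {
    List<Integer> list = new ArrayList<>();

    list.add(10);
    list.add("20");   // compile error: String is not an Integer
    list.add(30.0);   // compile error: Double is not an Integer

    for(int i = 0; i < list.size(); i++) {
        int item = list.get(i);   // no cast needed; the Integer is unboxed
        System.out.println(item);
    }
}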

First, I can get rid of the cast, which was the problem to begin with. Since I've bound the entire list to Integer, list.get(i) already returns an Integer, and that unboxes to an int without any cast. Second, the error of my ways is made apparent to me at compile time - I'm adding stuff that isn't an Integer to my list! That's most definitely a bug, and it's most definitely quick to fix.
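For comparison, here is a minimal sketch of what the code looks like once the two offending add calls are gone. The names TypedListDemo and sum are made up for illustration; the list types are the standard java.util ones.

```java
import java.util.ArrayList;
import java.util.List;

class TypedListDemo {
    // Adds up the elements without a single cast: the compiler already
    // knows that every element of a List<Integer> is an Integer.
    public static int sum(List<Integer> numbers) {
        int total = 0;
        for (int i = 0; i < numbers.size(); i++) {
            int item = numbers.get(i); // Integer is unboxed to int
            total += item;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        list.add(10);
        list.add(20);
        list.add(30);
        // list.add("20");  // would be rejected at compile time
        System.out.println(sum(list)); // prints 60
    }
}
```

The commented-out line is the whole point: the mistake never makes it to runtime.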


Generics can compound themselves even further than this, which is why you often see generics going as high as 3 or 4 type parameters. (If you have any more than that, you should really question if the class is doing too much.) Their purpose is twofold:

  • Ensure that the programmer doesn't have to do unnecessary casts to get back data that they put in.

  • Ensure that any errors in inserting incorrect data into the datatype or method are caught as early as possible - at compile time.
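To tie this back to the question: Hadoop declares its base class as Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>, so the four arguments are the input key type, input value type, output key type, and output value type. Below is a rough, dependency-free sketch of the same idea. SimpleMapper, WordCountMapper, write, and output are hypothetical names invented here, not Hadoop's API, and String and Integer stand in for Text and IntWritable.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical stand-in for a mapper base class (NOT Hadoop's Mapper).
// KIn/VIn describe the pair coming in, KOut/VOut the pairs going out.
abstract class SimpleMapper<KIn, VIn, KOut, VOut> {
    private final List<Map.Entry<KOut, VOut>> output = new ArrayList<>();

    // Subclasses receive exactly the input types they declared.
    public abstract void map(KIn key, VIn value);

    // Only pairs of the declared output types can be written.
    protected void write(KOut key, VOut value) {
        output.add(new AbstractMap.SimpleEntry<KOut, VOut>(key, value));
    }

    public List<Map.Entry<KOut, VOut>> output() {
        return output;
    }
}

// Input: (Object, String). Output: (String, Integer).
class WordCountMapper extends SimpleMapper<Object, String, String, Integer> {
    @Override
    public void map(Object key, String value) {
        StringTokenizer itr = new StringTokenizer(value);
        while (itr.hasMoreTokens()) {
            write(itr.nextToken(), 1);
            // write(1, itr.nextToken());  // would not compile: types swapped
        }
    }
}

class MapperDemo {
    public static void main(String[] args) {
        WordCountMapper mapper = new WordCountMapper();
        mapper.map(null, "to be or not to be");
        // Each token is paired with the count 1.
        for (Map.Entry<String, Integer> pair : mapper.output()) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

With the types spelled out in the class header, both the signature of map and the arguments accepted by write are checked by the compiler, which is the same twofold benefit as in the list example.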

There's an argument to be had about readability, and in my experience, it really depends on what the class is responsible for, as well as on the generics (and their bounds).

As a lighter, slightly more intuitive example, explore Guava's Table API, declared as Table<R, C, V>. It's all row-column-value triplets, with the idea being that you have some object to represent a row key, some object to represent a column key, and some object that you actually care to store in the table.
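To get a feel for three type parameters without pulling in Guava, here is a much-reduced imitation backed by nested maps. SimpleTable is a hypothetical class written for this answer; it is not Guava's implementation and supports only put and get.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, stripped-down table: R is the row key type,
// C the column key type, V the type of the stored value.
class SimpleTable<R, C, V> {
    private final Map<R, Map<C, V>> rows = new HashMap<>();

    public void put(R row, C column, V value) {
        // Create the row's inner map on first use, then store the cell.
        rows.computeIfAbsent(row, r -> new HashMap<>()).put(column, value);
    }

    // Returns the stored value, or null if the cell is empty.
    public V get(R row, C column) {
        Map<C, V> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }
}

class SimpleTableDemo {
    public static void main(String[] args) {
        SimpleTable<String, Integer, Double> grades = new SimpleTable<>();
        grades.put("alice", 2023, 3.7);
        // grades.put(2023, "alice", 3.7);  // would not compile: keys swapped
        Double grade = grades.get("alice", 2023); // no cast required
        System.out.println(grade);
    }
}
```

Each of the three parameters guards a different position, so swapping the row and column keys by accident is a compile error rather than a silent bug.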

Upvotes: 1

lxcky

Reputation: 1668

So that your map() method knows what type of parameter it will receive.

Upvotes: 0
