Sonu Patidar

Reputation: 801

Hadoop Text Comparison not working

Below is the code for a Hadoop Reducer that builds an inverted index. I cannot understand why the comparison marked by the slash comments always returns zero, even though it compares two Text values that should be different.

 public static class IntSumReducer
       extends Reducer<TextPair, Text, Text, Text>{

    private Text indexedData = new Text();

    public void reduce(TextPair key, Iterable<Text> values, Context context)
           throws IOException, InterruptedException {

        Iterator<Text>  itr = values.iterator();
        Text oldValue = itr.next() ;
        String old = oldValue.toString();

        //String next;
        int freq = 1;
        Text nextValue = null;
        StringBuilder stringBuilder = new StringBuilder();

        if(itr.hasNext()==false) {
            stringBuilder.append(old + 1);
        }

        while(itr.hasNext()) {
            nextValue = itr.next();         
            int compareValue = oldValue.compareTo(nextValue);

            while(compareValue == 0) {
                freq++;

                if(itr.hasNext()) {
                    nextValue = itr.next();

                   ////////////////////////////
                   // following comparison always returning zero
                   // Although values are changing
                   compareValue = oldValue.compareTo(nextValue);
                   ///////////////////////////

                    System.out.println(compareValue);

                } else {
                    freq++;
                    System.out.println("Break due to data loss..");
                    break;
                }               
            }//end while
            System.out.println("Value Changed..");
            old = old + freq;
            stringBuilder.append(old);
            stringBuilder.append(" | ");
            oldValue = nextValue;
            old = nextValue.toString();
            freq = 1;

        }//endwhile

        //System.out.println("KEY :: " + key.toString());   
        context.write(key.getFirst(),new Text(stringBuilder.toString()));
    }   
}

Any help is appreciated as I am completely new to this area.

Upvotes: 3

Views: 1194

Answers (1)

Binary Nerd

Reputation: 13937

Your problem is most likely that the Iterable<Text> reuses its Text objects: it doesn't give you a new object on each call to next(), it just refills the same instance with the next value.

At a minimum you need to change these two lines:

Text oldValue = itr.next();
oldValue = nextValue;

To:

Text oldValue = new Text(itr.next());
oldValue.set(nextValue);

Otherwise you're just comparing an object with itself, because oldValue will always point at the same object you're comparing it to.
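To see the effect in isolation, here is a minimal sketch that runs without Hadoop. MutableText and reusingIterator are hypothetical stand-ins I made up for illustration: they only imitate the reuse behaviour described above and are not the real org.apache.hadoop.io.Text or the framework's value iterator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {

    // Hypothetical stand-in for Hadoop's Text: a mutable, comparable string holder.
    public static final class MutableText implements Comparable<MutableText> {
        private String value = "";

        public MutableText() {}

        // Copy constructor, playing the role of new Text(otherText).
        public MutableText(MutableText other) { this.value = other.value; }

        public void set(String s) { this.value = s; }

        @Override
        public int compareTo(MutableText o) { return value.compareTo(o.value); }

        @Override
        public String toString() { return value; }
    }

    // Assumed behaviour: every call to next() refills and returns the SAME instance.
    public static Iterator<MutableText> reusingIterator(List<String> data) {
        final Iterator<String> src = data.iterator();
        final MutableText shared = new MutableText();
        return new Iterator<MutableText>() {
            @Override
            public boolean hasNext() { return src.hasNext(); }

            @Override
            public MutableText next() {
                shared.set(src.next());
                return shared;
            }
        };
    }

    // Keeps only a reference: oldValue and nextValue are the same object.
    public static int compareByReference(List<String> data) {
        Iterator<MutableText> itr = reusingIterator(data);
        MutableText oldValue = itr.next();
        MutableText nextValue = itr.next(); // also overwrites what oldValue points at
        return oldValue.compareTo(nextValue);
    }

    // Copies the first value before advancing, so the comparison is meaningful.
    public static int compareWithCopy(List<String> data) {
        Iterator<MutableText> itr = reusingIterator(data);
        MutableText oldValue = new MutableText(itr.next());
        MutableText nextValue = itr.next();
        return oldValue.compareTo(nextValue);
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("doc1", "doc2");
        System.out.println(compareByReference(data)); // 0, although the values differ
        System.out.println(compareWithCopy(data));    // negative, "doc1" sorts before "doc2"
    }
}
```

The same reasoning applies to any place in the reducer where a value from the iterator is kept past the next call to next(): store a copy, not the reference.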

Upvotes: 2
