jojoba

Reputation: 554

Hadoop: Array of primitives as value in a key value pair

I asked a very similar question in a previous thread, "Hadoop: How can I have an array of doubles as a value in a key-value pair?".

My problem is that I want to pass a double array as the value from the map phase to the reduce phase. The answer I got was to serialize the array, convert it to Text, pass it to the reducer, and deserialize it there. This is a fine solution, but it amounts to serializing and deserializing the data twice.

ArrayWritable only accepts types that implement Writable, such as FloatWritable. So another solution is to convert my array of doubles to an array of DoubleWritable objects. But that conversion takes time too, and Writables are expensive to allocate. Isn't there a simple solution, something like ArrayWritable array = new ArrayWritable(Double.class)?

Upvotes: 5

Views: 5446

Answers (2)

szhem

Reputation: 4712

Just write your own implementation of the Writable interface.

For example:

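import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class DoubleArrayWritable implements Writable {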
    private double[] data;

    public DoubleArrayWritable() {

    }

    public DoubleArrayWritable(double[] data) {
        this.data = data;
    }

    public double[] getData() {
        return data;
    }

    public void setData(double[] data) {
        this.data = data;
    }

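    // Wire format: the element count as an int, then each element as a raw double.
    @Override
    public void write(DataOutput out) throws IOException {
        int length = 0;
        if(data != null) {
            length = data.length;
        }

        out.writeInt(length);

        for(int i = 0; i < length; i++) {
            out.writeDouble(data[i]);
        }
    }

    // Reads the count first, then that many doubles. A null array written
    // by write() comes back as an empty array, not as null.
    @Override
    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();

        data = new double[length];

        for(int i = 0; i < length; i++) {
            data[i] = in.readDouble();
        }
    }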
}
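
To check the wire format without a Hadoop installation, here is a minimal sketch that repeats the same write/readFields logic as static helpers and round-trips an array through in-memory streams. The class DoubleArrayRoundTrip and its methods toBytes and fromBytes are hypothetical names made up for this illustration; only the java.io classes are real. In an actual job you would use the DoubleArrayWritable above rather than these helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class: not part of Hadoop and not part of the answer above.
class DoubleArrayRoundTrip {

    // Same layout as DoubleArrayWritable.write(): int length, then each double.
    static void write(double[] data, DataOutput out) throws IOException {
        int length = (data == null) ? 0 : data.length;
        out.writeInt(length);
        for (int i = 0; i < length; i++) {
            out.writeDouble(data[i]);
        }
    }

    // Same layout as DoubleArrayWritable.readFields().
    static double[] read(DataInput in) throws IOException {
        int length = in.readInt();
        double[] data = new double[length];
        for (int i = 0; i < length; i++) {
            data[i] = in.readDouble();
        }
        return data;
    }

    // Serializes the array into a byte buffer held in memory.
    static byte[] toBytes(double[] data) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            write(data, new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuilds the array from bytes produced by toBytes().
    static double[] fromBytes(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] original = {1.5, -2.0, 3.25};
        byte[] bytes = toBytes(original);
        // 4 bytes for the length plus 8 bytes per element.
        System.out.println(bytes.length); // prints 28
        System.out.println(Arrays.equals(original, fromBytes(bytes))); // prints true
    }
}
```

The encoded size is 4 + 8n bytes for n elements, with no per-element object, which is the overhead the question wanted to avoid with DoubleWritable[].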

Upvotes: 8

Bohemian

Reputation: 424993

You can specify double[] as the value type for a Map:

Map<String, double[]> map = new HashMap<String, double[]>(); // compiles

Java arrays are automatically Serializable, provided the element type is a primitive or is itself Serializable.
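
As a rough illustration of that claim, the sketch below pushes a double[] through standard Java serialization (ObjectOutputStream and ObjectInputStream) and reads it back. The class name PrimitiveArraySerialization and its two helper methods are hypothetical. Note that this is plain java.io serialization, not Hadoop's Writable mechanism that the question asks about, so it does not by itself make double[] usable as a map output value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class for standard Java serialization of a primitive array.
class PrimitiveArraySerialization {

    // Writes the array with ObjectOutputStream; works because double[] is Serializable.
    static byte[] serialize(double[] values) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(values);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the array back and casts it to double[].
    static double[] deserialize(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (double[]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        double[] original = {1.5, -2.0, 3.25};
        double[] copy = deserialize(serialize(original));
        System.out.println(Arrays.equals(original, copy)); // prints true
    }
}
```

The serialized form also carries a stream header and a class descriptor, so it is larger than the raw 8 bytes per element.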

Upvotes: 0
