Artem Oboturov

Reputation: 4385

Enum value implementing Writable interface of Hadoop

Suppose I have an enumeration:

public enum SomeEnumType implements Writable {
  A(0), B(1);

  private int value;

  private SomeEnumType(int value) {
    this.value = value;
  }

  @Override
  public void write(final DataOutput dataOutput) throws IOException {
    dataOutput.writeInt(this.value);
  }

  @Override
  public void readFields(final DataInput dataInput) throws IOException {
    this.value = dataInput.readInt();
  }
}

I want to pass an instance of it as part of an instance of another class.

equals() would not work, because it does not take the enum's inner variable into account. On top of that, all enum instances are fixed at compile time and cannot be created elsewhere.

Does this mean I cannot send enums over the wire in Hadoop, or is there a solution?

Upvotes: 5

Views: 2286

Answers (3)

aaronman

Reputation: 18750

WritableUtils has convenience methods that make this easier.

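// in write(): serialize the enum constant by its name
WritableUtils.writeEnum(dataOutput, enumData);
// in readFields(): resolve the stored name against the given enum class
enumData = WritableUtils.readEnum(dataInput, MyEnum.class);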
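
To show how this fits the question (an enum carried as a field of another class), here is a minimal sketch that uses only java.io so it runs without Hadoop on the classpath. The names EnumFieldDemo, JobState and JobRecord are hypothetical, and the two helper methods are stand-ins for the WritableUtils calls above, not Hadoop's implementation. They use writeUTF/readUTF, so the bytes should not be assumed compatible with what WritableUtils produces.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class EnumFieldDemo {

    // Hypothetical enum; it does NOT implement any serialization interface itself.
    enum JobState { QUEUED, RUNNING, DONE }

    // Hypothetical enclosing class; in Hadoop this would implement Writable.
    static class JobRecord {
        private int id;
        private JobState state;

        JobRecord() { } // no-arg constructor, as Hadoop Writables need one

        JobRecord(int id, JobState state) {
            this.id = id;
            this.state = state;
        }

        int getId() { return id; }

        JobState getState() { return state; }

        // Same shape as Writable.write: plain fields, then the enum by name.
        void write(DataOutput out) throws IOException {
            out.writeInt(id);
            writeEnum(out, state);
        }

        // Same shape as Writable.readFields: the enum FIELD is reassigned,
        // the enum constants themselves are never modified.
        void readFields(DataInput in) throws IOException {
            id = in.readInt();
            state = readEnum(in, JobState.class);
        }
    }

    // Stand-in for WritableUtils.writeEnum (assumption: name-based encoding).
    static void writeEnum(DataOutput out, Enum<?> value) throws IOException {
        out.writeUTF(value.name());
    }

    // Stand-in for WritableUtils.readEnum.
    static <T extends Enum<T>> T readEnum(DataInput in, Class<T> type) throws IOException {
        return Enum.valueOf(type, in.readUTF());
    }

    // Writes a record to an in-memory buffer and reads it into a fresh instance.
    static JobRecord roundTrip(JobRecord original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            JobRecord copy = new JobRecord();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        JobRecord copy = roundTrip(new JobRecord(7, JobState.RUNNING));
        System.out.println(copy.getId() + " " + copy.getState()); // prints 7 RUNNING
    }
}
```

The point of the design is that the enum stays a plain enum and only the class holding it knows how to write and read it, which sidesteps the "enum instances are fixed" problem from the question.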

Upvotes: 1

Thomas Jungblut

Reputation: 20969

My normal and preferred solution for enums in Hadoop is serializing the enums through their ordinal value.

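import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class EnumWritable implements Writable {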

    static enum EnumName {
        ENUM_1, ENUM_2, ENUM_3
    }

    private int enumOrdinal;

    // never forget your default constructor in Hadoop Writables
    public EnumWritable() {
    }

    public EnumWritable(Enum<?> arbitraryEnum) {
        this.enumOrdinal = arbitraryEnum.ordinal();
    }

    public int getEnumOrdinal() {
        return enumOrdinal;
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        enumOrdinal = in.readInt();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(enumOrdinal);
    }

    public static void main(String[] args) {
        // use it like this:
        EnumWritable enumWritable = new EnumWritable(EnumName.ENUM_1);
        // let Hadoop do the write and read stuff
        EnumName yourDeserializedEnum = EnumName.values()[enumWritable.getEnumOrdinal()];
    }

}

Obviously it has drawbacks: ordinals can change, so if you swap ENUM_2 and ENUM_3 and then read a previously serialized file, you will get the wrong enum constant back.
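
A small sketch of that pitfall, using plain java.io so it runs without Hadoop. OrdinalDriftDemo, OldOrder and NewOrder are hypothetical names; the two enums stand for the same enum before and after someone reorders its constants.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class OrdinalDriftDemo {

    // The enum as it looked when the data was written.
    enum OldOrder { ENUM_1, ENUM_2, ENUM_3 }

    // The "same" enum after ENUM_2 and ENUM_3 were swapped in the source.
    enum NewOrder { ENUM_1, ENUM_3, ENUM_2 }

    // Serializes only the ordinal, like the ordinal-based EnumWritable.
    static byte[] writeOrdinal(Enum<?> value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeInt(value.ordinal());
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes the stored ordinal against the reordered enum.
    static NewOrder readWithNewOrder(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            return NewOrder.values()[in.readInt()];
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] stored = writeOrdinal(OldOrder.ENUM_2);
        // The stored ordinal is 1, which now points at a different constant.
        System.out.println(readWithNewOrder(stored)); // prints ENUM_3
    }
}
```

No exception is thrown here, which is what makes the problem easy to miss: the data silently decodes to a different constant.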

So if you know the enum class beforehand, you can write the name of the enum constant instead and read it back like this:

 enumInstance = EnumName.valueOf(in.readUTF());

This will use slightly more space, but it is safer against changes to the order of your enum constants. Renaming a constant would still break previously serialized data.

The full example would look like this:

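import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class EnumWritable implements Writable {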

    static enum EnumName {
        ENUM_1, ENUM_2, ENUM_3
    }

    private EnumName enumInstance;

    // never forget your default constructor in Hadoop Writables
    public EnumWritable() {
    }

    public EnumWritable(EnumName e) {
        this.enumInstance = e;
    }

    public EnumName getEnum() {
        return enumInstance;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(enumInstance.name());
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        enumInstance = EnumName.valueOf(in.readUTF());
    }

    public static void main(String[] args) {
        // use it like this:
        EnumWritable enumWritable = new EnumWritable(EnumName.ENUM_1);
        // let Hadoop do the write and read stuff
        EnumName yourDeserializedEnum = enumWritable.getEnum();

    }

}

Upvotes: 4

JB Nizet

Reputation: 691685

I don't know anything about Hadoop, but based on the documentation of the interface, you could probably do it like this:

public void readFields(DataInput in) throws IOException {
     // do nothing
}

public static SomeEnumType read(DataInput in) throws IOException {
    int value = in.readInt();
    if (value == 0) {
        return SomeEnumType.A;
    }
    else if (value == 1) {
        return SomeEnumType.B;
    }
    else {
        throw new IOException("Invalid value " + value);
    }
}
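
A possible variant of the same idea, sketched with plain java.io so it runs without Hadoop: keep the explicit int code from the question, make it final, and look the constant up by scanning values() instead of an if/else chain. EnumCodeDemo, fromValue and roundTrip are hypothetical names added for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class EnumCodeDemo {

    enum SomeEnumType {
        A(0), B(1);

        // Explicit wire code; final, so reading data can never alter a constant.
        private final int value;

        SomeEnumType(int value) {
            this.value = value;
        }

        void write(DataOutput out) throws IOException {
            out.writeInt(value);
        }

        // Maps a wire code back to its constant; rejects unknown codes.
        static SomeEnumType fromValue(int value) {
            for (SomeEnumType type : values()) {
                if (type.value == value) {
                    return type;
                }
            }
            throw new IllegalArgumentException("Invalid value " + value);
        }

        // Static factory: returns an existing constant instead of mutating one.
        static SomeEnumType read(DataInput in) throws IOException {
            int value = in.readInt();
            try {
                return fromValue(value);
            } catch (IllegalArgumentException e) {
                throw new IOException(e.getMessage(), e);
            }
        }
    }

    // Writes a constant to an in-memory buffer and reads it back.
    static SomeEnumType roundTrip(SomeEnumType original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            return SomeEnumType.read(
                    new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(SomeEnumType.B)); // prints B
    }
}
```

Because the codes are assigned explicitly, reordering the constants in the source does not change what is written, as long as nobody reuses or changes an existing code.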

Upvotes: 0
