I need to emit a 2D double array as both key and value from a mapper. There are similar questions on Stack Overflow, but they are not answered.
I am doing some matrix multiplication on a given dataset, and after that I need to emit A*Atrans, which is a matrix, as the key and Atrans*D, which is also a matrix, as the value. How do I emit these matrices from the mapper? Each value should correspond to its key.
i.e. key -----> A*Atrans ---------> the result of the multiplication is a 2D array declared as double; call it matrix "Ekey" (double[][] Ekey)
value ------> Atrans*D ---------> the result of the multiplication is matrix "Eval" (double[][] Eval).
After that I need to emit these matrices to the reducer for further calculations.
So in mapper:
context.write(Ekey,Eval);
Reducer:
I need to do further calculations with these Ekey and Eval.
I wrote my class:
UPDATE
public class MatrixWritable implements WritableComparable<MatrixWritable>{
/**
* @param args
*/
private double[][] value;
private double[][] values;
public MatrixWritable() {
// TODO Auto-generated constructor stub
setValue(new double[0][0]);
}
public MatrixWritable(double[][] value) {
// TODO Auto-generated constructor stub
this.value = value;
}
public void setValue(double[][] value) {
this.value = value;
}
public double[][] getValue() {
return values;
}
@Override
public void write(DataOutput out) throws IOException {
out.writeInt(value.length); // write values
for (int i = 0; i < value.length; i++) {
out.writeInt(value[i].length);
}
for (int i = 0; i < value.length; i++) {
for (int j = 0; j < value[i].length; j++) {
out.writeDouble(value[i][j]);
}
}
}
@Override
public void readFields(DataInput in) throws IOException {
value = new double[in.readInt()][];
for (int i = 0; i < value.length; i++) {
value[i] = new double[in.readInt()];
}
values = new double[value.length][value[0].length];
for(int i=0;i<value.length ; i++){
for(int j= 0 ; j< value[0].length;j++){
values[i][j] = in.readDouble();
}
}
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + Arrays.hashCode(value);
return result;
}
/* (non-Javadoc)
* @see java.lang.Object#equals(java.lang.Object)
*/
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (!(obj instanceof MatrixWritable)) {
return false;
}
MatrixWritable other = (MatrixWritable) obj;
if (!Arrays.deepEquals(value, other.value)) {
return false;
}
return true;
}
@Override
public int compareTo(MatrixWritable o) {
// TODO Auto-generated method stub
return 0;
}
public String toString() { String separator = "|";
StringBuffer result = new StringBuffer();
// iterate over the first dimension
for (int i = 0; i < values.length; i++) {
// iterate over the second dimension
for(int j = 0; j < values[i].length; j++){
result.append(values[i][j]);
result.append(separator);
}
// remove the last separator
result.setLength(result.length() - separator.length());
// add a line break.
result.append(",");
}
return result.toString();
}
}
I am able to emit a matrix as a value from the mapper:
context.write(...,new MatrixWritable(AAtrans));
How do I emit the matrix AtransD as a key from the mapper?
For that I need to write the compareTo() method, right?
What should be included in that method?
Upvotes: 2
Views: 7498
Reputation: 32949
First, to implement a custom key you must implement WritableComparable. To implement a custom value you must implement Writable. In many cases, since it is handy to be able to swap keys and values, most people write all custom types as WritableComparable.
Here is a link to the section of Hadoop: The Definitive Guide that covers writing a WritableComparable: Writing A Custom Writable
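As for what compareTo() could contain: MapReduce sorts and groups keys using the key's comparison, so a compareTo() that always returns 0 makes every key look equal. One possible ordering is lexicographic: compare the row counts, then each row's length, then the elements one by one. The sketch below is only an illustration under that assumption. MatrixKey is a hypothetical stand-in that implements plain Comparable so it runs without Hadoop on the classpath; a real key would implement WritableComparable and also provide write() and readFields().

```java
import java.util.Arrays;

// Hypothetical stand-in for the key class (not from the question or from Hadoop).
// A real Hadoop key would implement WritableComparable<MatrixKey> instead.
class MatrixKey implements Comparable<MatrixKey> {
    private final double[][] value;

    MatrixKey(double[][] value) {
        this.value = value;
    }

    @Override
    public int compareTo(MatrixKey other) {
        // A matrix with fewer rows sorts first.
        int c = Integer.compare(value.length, other.value.length);
        if (c != 0) {
            return c;
        }
        for (int i = 0; i < value.length; i++) {
            // At the same row index, the shorter row sorts first.
            c = Integer.compare(value[i].length, other.value[i].length);
            if (c != 0) {
                return c;
            }
            // Same shape so far: the first differing element decides.
            for (int j = 0; j < value[i].length; j++) {
                c = Double.compare(value[i][j], other.value[i][j]);
                if (c != 0) {
                    return c;
                }
            }
        }
        return 0;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof MatrixKey && compareTo((MatrixKey) obj) == 0;
    }

    @Override
    public int hashCode() {
        // deepHashCode hashes the contents of the nested rows;
        // Arrays.hashCode on a double[][] would hash the row references instead.
        return Arrays.deepHashCode(value);
    }
}
```

Keeping equals() and hashCode() consistent with compareTo() matters here because the default HashPartitioner uses hashCode() to decide which reducer a key goes to.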
The trick with writing out an array is that on the read side you need to know how many elements to read. So the basic pattern is...
On write:
write the number of elements
write each element
On read:
read the number of elements (n)
create an array of the appropriate size
read elements 0 through n-1 and populate the array
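The write/read steps above can be sketched for a one-dimensional array as follows. This is a minimal illustration, not code from the book: LengthPrefixedArray and its method names are hypothetical, and it uses only java.io's DataOutput and DataInput, which are the same interfaces that Writable's write() and readFields() receive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper class for illustration; not part of Hadoop.
class LengthPrefixedArray {
    // On write: the number of elements first, then each element.
    static void write(DataOutput out, double[] values) throws IOException {
        out.writeInt(values.length);
        for (double v : values) {
            out.writeDouble(v);
        }
    }

    // On read: the number of elements (n), an array of that size, then n elements.
    static double[] read(DataInput in) throws IOException {
        int n = in.readInt();
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = in.readDouble();
        }
        return values;
    }

    // Serializes to an in-memory buffer and reads the result back.
    static double[] roundTrip(double[] values) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(new DataOutputStream(buffer), values);
        return read(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
    }
}
```

The reader never has to guess where the array ends, because the count it needs is always the first thing in the stream.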
Update
You should instantiate your array as empty in the default constructor to prevent a NullPointerException later.
The problem with your implementation is that it assumes that each inner array is of the same length. If that is true, you don't need to calculate the column length more than once. If false, you need to write the length of each row before writing the values of the row.
I would suggest something like this:
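// inside write(DataOutput out); array is the double[][] being serialized
out.writeInt(array.length); // number of rows
for (int i = 0; i < array.length; i++) {
    double[] rowVals = array[i];
    out.writeInt(rowVals.length); // length of this row
    for (int j = 0; j < rowVals.length; j++) {
        out.writeDouble(rowVals[j]);
    }
}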
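The read side then has to consume the stream in exactly the same order: the row count, then for each row its length followed by its values. Below is a rough sketch of both directions for a matrix whose rows may differ in length. JaggedMatrixIO and its method names are hypothetical, and the in-memory round trip exists only so the idea can be tried without a Hadoop job; in a Writable, the bodies of write and read would go into write() and readFields().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper class for illustration; not part of Hadoop.
class JaggedMatrixIO {
    // Row count, then for each row: its length followed by its values.
    static void write(DataOutput out, double[][] array) throws IOException {
        out.writeInt(array.length);
        for (double[] rowVals : array) {
            out.writeInt(rowVals.length);
            for (double v : rowVals) {
                out.writeDouble(v);
            }
        }
    }

    // Mirrors write(): every read happens in the order the data was written.
    static double[][] read(DataInput in) throws IOException {
        double[][] array = new double[in.readInt()][];
        for (int i = 0; i < array.length; i++) {
            array[i] = new double[in.readInt()];
            for (int j = 0; j < array[i].length; j++) {
                array[i][j] = in.readDouble();
            }
        }
        return array;
    }

    // Serializes to an in-memory buffer and reads the result back.
    static double[][] roundTrip(double[][] array) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(new DataOutputStream(buffer), array);
        return read(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
    }
}
```

Because each row carries its own length, this layout also works when the matrix is rectangular; it just spends one extra int per row.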