Reputation: 1627
I would like to implement a neural network framework consisting of layers which can then be composed into a computational graph (see for example Caffe). I am using the Eigen library for matrices. Eigen distinguishes between vectors and matrices so that for some operations (adding a bias to a matrix) only a vector can be used (and not a matrix with the same dimensions as the vector). For example:
MatrixXf A = MatrixXf(3, 2); // Variables not initialized for brevity
VectorXf v = VectorXf(2);
MatrixXf R1 = A.array().rowwise() + v.transpose().array(); // Broadcasts v correctly
MatrixXf vMat = MatrixXf(1, 2);
MatrixXf R2 = A.array().rowwise() + vMat.array(); // YOU_TRIED_CALLING_A_VECTOR_METHOD_ON_A_MATRIX Error
If I want the layers to look something like this:
void AffineForward(std::vector<Tensor> in, std::vector<Tensor> out)
{
MatrixXf &X = in[0];
MatrixXf &W = in[1];
VectorXf &b = in[2];
out[0] = X * W;
out[0] += b;
}
how would I design the abstract Tensor class so that I can just send in a std::vector of Tensors? I thought about something like this:
class Tensor
{
public:
virtual Tensor operator*(const Tensor &t) const = 0;
};
class TensorMatrix : Tensor
{
public:
TensorMatrix operator*(const TensorMatrix &t) const;
TensorMatrix operator*(const TensorVector &t) const;
MatrixXf _data;
};
class TensorVector : Tensor
{
public:
VectorXf _data;
};
but the virtual Tensor operator* gives a compile-time error (function returning abstract class Tensor is not allowed), which makes sense.
What is the easiest way of doing what I want? I am looking for some class that can be put into a container and from which I can get either a MatrixXf or a VectorXf (depending on what the user put in). Caffe uses something called 'Blob'.
Upvotes: 0
Views: 376
Reputation: 5624
"Eigen distinguishes between vectors and matrices so that for some operations (adding a bias to a matrix) only a vector can be used (and not a matrix with the same dimensions as the vector)."
This is not true: a vector is a matrix in Eigen (VectorXf is a typedef for Matrix<float, Dynamic, 1>); it's just that some operations require a dimension to be known at compile time. In your example
MatrixXf R1 = A.rowwise() + v.transpose();
MatrixXf R2 = A.rowwise() + vMat;
the second line does not compile because that broadcast needs an operand whose row dimension is known at compile time to be 1; vMat is a MatrixXf, so both of its dimensions are only known at run time. The solution is to tell Eigen explicitly that you want a row vector:
MatrixXf R2 = A.rowwise() + vMat.row(0);
Code that works with both row and column vectors stored as MatrixXf could look something like this (whether this is advisable depends on your ultimate requirements):
if( vMat.rows() == 1 )
MatrixXf R1 = A.rowwise() + vMat.row(0); ...
else if( vMat.cols() == 1 )
MatrixXf R2 = A.rowwise() + vMat.transpose().row(0); ...
else
whatever...
So you can always store vectors as matrices with Eigen; you just need some care in telling Eigen what to do with them.
Upvotes: 2