Reputation: 1415
I have the following loop for a monte-carlo computation I am performing:
the variables below are pre-computed/populated and is defined as:
w_ = std::vector<std::vector<double>>(150000, std::vector<double>(800));
C_ = Eigen::MatrixXd(800,800);
Eigen::VectorXd a(800);
Eigen::VectorXd b(800);
The while loop is taking me about 570 seconds to compute.Just going by the the loops I understand that I have nPaths*m = 150,000 * 800 = 120,000,000 sets of computations happening (I have not taken into account the cdf computations handled by boost libraries).
I am a below average programmer and was wondering if there are any obvious mistakes which I am making which maybe slowing the computation down. Or is there any other way to handle the computation which can speed things up.
int N(0);
int nPaths(150000);
int m(800);
double Varsum(0.);
double err;
double delta;
double v1, v2, v3, v4;
Eigen::VectorXd d = Eigen::VectorXd::Zero(m);
Eigen::VectorXd e = Eigen::VectorXd::Zero(m);
Eigen::VectorXd f = Eigen::VectorXd::Zero(m);
Eigen::VectorXd y;
y0 = Eigen::VectorXd::Zero(m);
boost::math::normal G(0, 1.);
d(0) = boost::math::cdf(G, a(0) / C_(0, 0));
e(0) = boost::math::cdf(G, b(0) / C_(0, 0));
f(0) = e(0) - d(0);
while (N < (nPaths-1))
{
y = y0;
for (int i = 1; i < m; i++)
{
v1 = d(i - 1) + w_[N][(i - 1)]*(e(i - 1) - d(i - 1));
y(i - 1) = boost::math::quantile(G, v1);
v2 = (a(i) - C_.row(i).dot(y)) / C_(i, i);
v3 = (b(i) - C_.row(i).dot(y)) / C_(i, i);
d(i) = boost::math::cdf(G, v2);
e(i) = boost::math::cdf(G, v3);
f(i) = (e(i) - d(i))*f(i - 1);
}
N++;
delta = (f(m-1) - Intsum) / N;
Intsum += delta;
Varsum = (N - 2)*Varsum / N + delta*delta;
err = alpha_*std::sqrt(Varsum);
}
Upvotes: 2
Views: 949
Reputation: 34337
If I understand your code right, the running time is actually O(nPaths*m*m)=10^11
, due to the dot-product C_.row(i).dot(y)
which needs O(m)
operation.
You could speed up the program by factor of two by not calculating it twice:
double prod=C_.row(i).dot(y)
v2 = (a(i) - prod) / C_(i, i);
v3 = (b(i) - prod) / C_(i, i);
but maybe compiler already does it for you.
The other thing is that y
consists of zeros (at least at the beginning) so you don't have to do the full dot-product but only until current value of i
. That should give another factor 2 speed up.
So taken into the account the sheer number of operation your timings are not so bad. There is some room for improvement of the code, but if you are interested in speeding up some orders of magnitude you probably should be thinking about changing your formulation.
Upvotes: 1