Reputation: 9379
One of our software products uses Eigen (3.2.5) to perform some matrix/vector-related computations. The software was developed carefully in this regard, starting with all options and optimizations disabled (including building with -DEIGEN_DONT_VECTORIZE), and with accuracy tests in place.
Since we are now interested in higher numerical throughput, we have started enabling vectorization inside Eigen. However, we have noticed that one of our tests now gives a slightly different output: the difference with the reference implementation is around 1e-4, while it was 1e-5 before.
We are going to loosen the precision requirement in this test a bit (because we don't really know the accuracy of the reference data, and we have another test case with synthetic data for which we have an exact solution and which still passes), but out of curiosity: what could be a plausible cause for this variation?
In case it's relevant, this computation involves Euclidean norms.
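If it helps, here is a minimal sketch of the kind of computation involved (the vector length and data below are invented purely for illustration). Building it once with -DEIGEN_DONT_VECTORIZE and once without may produce norms that differ in the last bits:
#include <Eigen/Dense>
#include <cmath>
#include <cstdio>

int main()
{
    // Deterministic, reproducible test data (made up for this example).
    const int n = 1000;
    Eigen::VectorXf v(n);
    for (int i = 0; i < n; ++i)
        v[i] = std::sin(0.1f * i);
    // v.norm() is a reduction (square root of a sum of squares), so its
    // rounding depends on the accumulation order chosen by the build.
    std::printf("%.8e\n", v.norm());
    return 0;
}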
Upvotes: 1
Views: 104
Reputation: 29205
This is to be expected: when you enable vectorization, floating-point operations are not carried out in the exact same order. This typically occurs for expressions involving reductions such as sums, norms, matrix products, etc. For instance, consider the following simple sum:
float s = 0;
for (int i = 0; i < n; ++i)
    s += v[i];   // strictly sequential, left-to-right accumulation
A vectorized version might look something like this (pseudocode):
Packet ps = {0, 0, 0, 0};        // four independent partial sums
for (int i = 0; i < n; i += 4)   // assumes n is a multiple of 4
    ps += load_packet(&v[i]);    // lane j accumulates v[j], v[j+4], v[j+8], ...
float s = ps[0] + ps[1] + ps[2] + ps[3];  // final horizontal reduction
Owing to roundoff errors, each version will return a slightly different value. In Eigen, this aspect is even trickier because reductions are implemented in a way that maximizes instruction pipelining.
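To make the effect concrete, here is a self-contained sketch (not Eigen's actual implementation) that mimics both accumulation orders with plain floats; the four partial sums play the role of the packet lanes. Compile it without aggressive optimizations so the compiler does not reorder the scalar loop itself:
#include <cstdio>

int main()
{
    const int n = 1 << 20;             // a multiple of 4
    static float v[1 << 20];
    for (int i = 0; i < n; ++i)
        v[i] = 1.0f / float(i + 1);    // arbitrary deterministic data

    // Sequential accumulation, as in the scalar loop above.
    float s_scalar = 0.f;
    for (int i = 0; i < n; ++i)
        s_scalar += v[i];

    // Four interleaved partial sums, mimicking the 4-lane packet version.
    float ps[4] = {0.f, 0.f, 0.f, 0.f};
    for (int i = 0; i < n; i += 4) {
        ps[0] += v[i];
        ps[1] += v[i + 1];
        ps[2] += v[i + 2];
        ps[3] += v[i + 3];
    }
    float s_packet = ps[0] + ps[1] + ps[2] + ps[3];

    std::printf("scalar: %.8e\npacket: %.8e\n", s_scalar, s_packet);
    return 0;
}
On a typical machine the two printed sums differ in the last few bits; neither is "wrong", they are simply different roundings of the same mathematical sum.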
Upvotes: 4