Reputation: 13048
Which arithmetic instructions are the slowest and the fastest on IA-32 and IA-64? Is there any ranking? Are there benchmarks?
Upvotes: 1
Views: 1001
Reputation: 6809
Generally speaking, the slowest are the square-root and division instructions, especially in the scalar floating-point pipeline.
For IA-32 and Intel 64 specifically, you might want to look at the Intel(R) 64 and IA-32 Architectures Optimization Reference Manual, which has cycle counts (latency and throughput) for each instruction on different processors in Appendix C. Note that this manual covers x86 and x86-64; IA-64 (Itanium) is a different architecture. You'll see that the SIMD approximation instructions, such as the packed reciprocal and reciprocal square root (RCPPS, RSQRTPS), perform much better at the cost of reduced precision, and they operate on 4 single-precision elements at a time. If you need more precision for the square root and reciprocal square root, you'll have to refine the result manually with an extra Newton-Raphson step.
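As a rough illustration of that refinement step, here is a minimal sketch in C using SSE intrinsics. It assumes an x86 compiler that provides `<xmmintrin.h>`; the function names `rsqrt_nr_ps` and `rsqrt_nr_max_residual` are made up for this example and are not from the manual. One Newton-Raphson iteration for y ≈ 1/sqrt(x) is y1 = y0 * (1.5 - 0.5 * x * y0 * y0), which roughly doubles the number of correct bits in the estimate.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_rsqrt_ps, _mm_mul_ps, ... */

/* Approximate 1/sqrt(x) for four floats with RSQRTPS, then refine with
   one Newton-Raphson step: y1 = y0 * (1.5 - 0.5 * x * y0 * y0).
   Hypothetical helper name, for illustration only. */
static __m128 rsqrt_nr_ps(__m128 x)
{
    const __m128 half = _mm_set1_ps(0.5f);
    const __m128 three_halves = _mm_set1_ps(1.5f);
    __m128 y0 = _mm_rsqrt_ps(x);                      /* low-precision estimate */
    __m128 xy2 = _mm_mul_ps(x, _mm_mul_ps(y0, y0));   /* x * y0^2, close to 1 */
    return _mm_mul_ps(y0, _mm_sub_ps(three_halves, _mm_mul_ps(half, xy2)));
}

/* Hypothetical check: largest |x * y^2 - 1| over four positive inputs,
   which is about twice the relative error of y. */
static float rsqrt_nr_max_residual(const float in[4])
{
    float out[4];
    float worst = 0.0f;
    _mm_storeu_ps(out, rsqrt_nr_ps(_mm_loadu_ps(in)));
    for (int i = 0; i < 4; ++i) {
        float r = in[i] * out[i] * out[i] - 1.0f;
        if (r < 0.0f) r = -r;
        if (r > worst) worst = r;
    }
    return worst;
}
```

A square root can then be approximated as `x * rsqrt(x)` for positive x. Whether this actually beats the plain SQRTPS/DIVPS instructions depends on the processor, so it is worth checking the latency tables or measuring.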
Upvotes: 7
Reputation: 21399
Ummm, ADD & SUB are very fast. Any of the "partial" floating-point ops (e.g. FPREM, the partial remainder) are going to be very slow (which is why they're "partial" and may have to be called multiple times to finish).
Upvotes: 1