Reputation: 1
I have two doubles and I want to add them, divide them, etc., but every operation returns inf.
double num1 = 1.999999999999999999e+320;
double num_2 = 1.999999999999999e+320;
Are they out of range for double? How can I extend the range or otherwise solve the problem?
Upvotes: 0
Views: 10186
Reputation: 12331
If you just want to do simple operations like addition, multiplication, and division, there is no need to use a third-party library. You could create your own class that handles such numbers and the operations you want.
From Wikipedia's article on scientific notation:
Scientific notation is a way of writing numbers that accommodates values too large or small to be conveniently written in standard decimal notation. Scientific notation has a number of useful properties and is commonly used in calculators, and by scientists, mathematicians, doctors, and engineers.
In scientific notation all numbers are written like this:
a \times 10^b
("a times ten raised to the power of b"), where the exponent b is an integer, and the coefficient a is any real number.
So for your class you need a double corresponding to the coefficient a and an int or long int (for even larger numbers) that represents the exponent b.
Let the two numbers be N1 = a1E+b1 and N2 = a2E+b2.
Then we can handle the four classical arithmetic operations as follows:
N1*N2 = (a1*a2)E+(b1+b2)
N1/N2 = (a1/a2)E+(b1-b2)
Of course you should handle division by zero.
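The multiplication and division rules above can be sketched as follows. This is only a minimal illustration, not a finished implementation: the struct SciNumber and the functions normalize, multiply and divide are hypothetical names chosen for this example, and the sketch assumes finite coefficients of ordinary magnitude.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical value type for this sketch: the number a * 10^b.
struct SciNumber {
    double a;  // coefficient
    long b;    // decimal exponent
};

// Rescale so that 1 <= |a| < 10, adjusting b so the value is unchanged.
// Assumes a is finite and of ordinary magnitude, as the product or
// quotient of two normalized coefficients would be.
SciNumber normalize(SciNumber n) {
    if (n.a == 0.0) {
        return SciNumber{0.0, 0};
    }
    long shift = static_cast<long>(std::floor(std::log10(std::fabs(n.a))));
    n.a /= std::pow(10.0, static_cast<double>(shift));
    n.b += shift;
    // log10 can round across a power of ten, so correct by one step if needed.
    if (std::fabs(n.a) >= 10.0) {
        n.a /= 10.0;
        ++n.b;
    } else if (std::fabs(n.a) < 1.0) {
        n.a *= 10.0;
        --n.b;
    }
    return n;
}

// N1 * N2 = (a1 * a2) E (b1 + b2)
SciNumber multiply(SciNumber x, SciNumber y) {
    return normalize(SciNumber{x.a * y.a, x.b + y.b});
}

// N1 / N2 = (a1 / a2) E (b1 - b2); a zero divisor is rejected.
SciNumber divide(SciNumber x, SciNumber y) {
    if (y.a == 0.0) {
        throw std::domain_error("division by zero");
    }
    return normalize(SciNumber{x.a / y.a, x.b - y.b});
}
```

With these definitions, multiplying 2E+320 by 3E+320 gives a coefficient of 6 and an exponent of 640, a value no double could hold directly.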
Addition needs some basic algebra, because both terms must first be brought to a common exponent:
if (b1 >= b2)
N1+N2 = a1E+b1 + a2E+b2 = a1E+b1 + a2E+(b1+b2-b1) = (a1 + a2E+(b2-b1))E+b1
else
N1+N2 = a1E+b1 + a2E+b2 = a1E+(b2+b1-b2) + a2E+b2 = (a1E+(b1-b2) + a2)E+b2
EDIT
You should evaluate the parenthesized coefficient on the right-hand side of the above equations as a double, then convert it to scientific notation again and apply the multiplication rule, e.g.
a1 + a2E+(b2-b1) = a3E+b3, so
N1+N2 = (a3E+b3)E+b1 = a3E+(b3+b1)
Similarly to addition, for b1 >= b2 we have
N1-N2 = (a1 - a2E+(b2-b1))E+b1
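The addition and subtraction rules can be sketched in the same spirit. Again this is an illustrative sketch under assumptions: SciNumber, add and subtract are hypothetical names, and the renormalization step is written inline so the block stands on its own.

```cpp
#include <cmath>
#include <utility>

// Hypothetical value type for this sketch: the number a * 10^b.
struct SciNumber {
    double a;  // coefficient
    long b;    // decimal exponent
};

// N1 + N2 = (a1 + a2 * 10^(b2 - b1)) E b1, with b1 the larger exponent.
SciNumber add(SciNumber x, SciNumber y) {
    if (x.a == 0.0) return y;
    if (y.a == 0.0) return x;
    // Make x the operand with the larger exponent (the b1 >= b2 case).
    if (x.b < y.b) std::swap(x, y);
    // Scale the smaller operand down; for a large exponent gap the scaled
    // term becomes negligible, which is the mathematically expected outcome.
    double a3 = x.a + y.a * std::pow(10.0, static_cast<double>(y.b - x.b));
    if (a3 == 0.0) return SciNumber{0.0, 0};
    // Convert the new coefficient back to scientific notation: a3 E b3.
    long b3 = static_cast<long>(std::floor(std::log10(std::fabs(a3))));
    a3 /= std::pow(10.0, static_cast<double>(b3));
    // log10 can round across a power of ten, so correct by one step if needed.
    if (std::fabs(a3) >= 10.0) {
        a3 /= 10.0;
        ++b3;
    } else if (std::fabs(a3) < 1.0) {
        a3 *= 10.0;
        --b3;
    }
    // (a3 E b3) E b1 = a3 E (b3 + b1)
    return SciNumber{a3, b3 + x.b};
}

// N1 - N2 is N1 + (-N2).
SciNumber subtract(SciNumber x, SciNumber y) {
    y.a = -y.a;
    return add(x, y);
}
```

Implementing subtraction as addition of the negated operand avoids writing the exponent-alignment logic twice.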
You will need something like the following. This is only a skeleton; the actual implementation is very easy:
#include <string>

class MyLargeNumber {
public:
    MyLargeNumber(double d);              // from d find a, b and initialize the object
    MyLargeNumber(double a, long int b);  // initialize directly
    double a() const;    // get the coefficient
    long int b() const;  // get the exponent
    // Operator overloading
    MyLargeNumber operator+(const MyLargeNumber &m) const;
    MyLargeNumber operator-(const MyLargeNumber &m) const;
    MyLargeNumber operator*(const MyLargeNumber &m) const;
    MyLargeNumber operator/(const MyLargeNumber &m) const;
    // Helper function
    std::string toString() const;
private:
    // Data members need names distinct from the accessors a() and b().
    double a_;    // the coefficient
    long int b_;  // the exponent
};
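Two pieces of the skeleton, finding a and b from a plain double and producing a string, might look roughly like this. The free functions fromDouble and toString and the struct SciNumber are hypothetical stand-ins for the corresponding class members, and the decomposition assumes a finite input of normal magnitude.

```cpp
#include <cmath>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical value type for this sketch: the number a * 10^b.
struct SciNumber {
    double a;  // coefficient
    long b;    // decimal exponent
};

// Split d into a coefficient with 1 <= |a| < 10 and a decimal exponent b.
SciNumber fromDouble(double d) {
    if (!std::isfinite(d)) {
        throw std::invalid_argument("value must be finite");
    }
    if (d == 0.0) {
        return SciNumber{0.0, 0};
    }
    long b = static_cast<long>(std::floor(std::log10(std::fabs(d))));
    double a = d / std::pow(10.0, static_cast<double>(b));
    // log10 can round across a power of ten, so correct by one step if needed.
    if (std::fabs(a) >= 10.0) {
        a /= 10.0;
        ++b;
    } else if (std::fabs(a) < 1.0) {
        a *= 10.0;
        --b;
    }
    return SciNumber{a, b};
}

// Format as <coefficient>e<sign><exponent>, using the stream's default
// precision for the coefficient.
std::string toString(const SciNumber &n) {
    std::ostringstream out;
    out << n.a << "e" << (n.b >= 0 ? "+" : "") << n.b;
    return out.str();
}
```

A value outside the range of double, such as the ones in the question, cannot pass through fromDouble at all; it would have to be built from its coefficient and exponent directly, which is what the two-argument constructor in the skeleton is for.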
Upvotes: 0
Reputation: 882626
Doubles (double-precision IEEE 754) will only get you up to about 10^±308 (from memory).
If you have an implementation that supports a wider long double type, you can use that. Keep in mind that C99 implementations are allowed to treat long double as identical to double, so this may not necessarily help you. From C99:
The C floating types match the IEC 60559 formats as follows:
- The float type matches the IEC 60559 single format.
- The double type matches the IEC 60559 double format.
- The long double type matches an IEC 60559 extended format, else a non-IEC 60559 extended format, else the IEC 60559 double format.
Any non-IEC 60559 extended format used for the long double type shall have more precision than IEC 60559 double and at least the range of IEC 60559 double.
'Extended' is IEC 60559’s double-extended data format. Extended refers to both the common 80-bit and quadruple 128-bit IEC 60559 formats.
A non-IEC 60559 long double type is required to provide infinity and NaNs, as its values include all double values.
But if it uses one of the extended formats (e.g., the 80-bit or 128-bit format), that will give you a massive increase in range over the 64-bit double. The IEEE 754 binary128 format will give you about 34 decimal digits of precision (up from the 15 you get from double) and a range of about 10^±4932 (up from 10^±308).
If it doesn't, or if that's still not enough range or precision, you can look into one of the arbitrary-precision libraries, like MPIR, which, despite its name, is perfectly capable of handling real floating-point numbers (not just integers and rationals).
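Whether long double actually helps can be checked at run time. The sketch below is an illustration under assumptions: the function names longDoubleCovers1e320 and sumOfQuestionValues are made up for this example, and the values are parsed with std::strtold rather than written as L-suffixed literals so that the code still compiles where long double is no wider than double.

```cpp
#include <cmath>
#include <cstdlib>
#include <limits>

// True when long double on this platform can represent values near 1e320.
bool longDoubleCovers1e320() {
    return std::numeric_limits<long double>::max_exponent10 >= 320;
}

// Parses the two values from the question and adds them as long double.
// On a platform with an extended long double the sum is finite; where
// long double matches double, strtold overflows and the sum is infinity.
long double sumOfQuestionValues() {
    long double num1 = std::strtold("1.999999999999999999e+320", nullptr);
    long double num2 = std::strtold("1.999999999999999e+320", nullptr);
    return num1 + num2;
}
```

If longDoubleCovers1e320() returns false, switching the variables to long double changes nothing and one of the other approaches is needed.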
Upvotes: 2
Reputation: 705
The long double data type does indeed have a greater range. For example, on my machine (64-bit Linux), I get the following information:
Maximum value for double: 1.79769e+308
Maximum value for long double: 1.18973e+4932
Notice the larger exponent.
This information was found using std::numeric_limits from the <limits> header of the C++ standard library. An example can be found here.
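A query of this kind can be written in a few lines. This is a sketch rather than the code the figures above came from; the function name describeLimits is hypothetical.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Builds a short report of the largest finite double and long double values,
// formatted with the stream's default precision.
std::string describeLimits() {
    std::ostringstream out;
    out << "Maximum value for double: "
        << std::numeric_limits<double>::max() << '\n'
        << "Maximum value for long double: "
        << std::numeric_limits<long double>::max() << '\n';
    return out.str();
}
```

Streaming the returned string to std::cout shows the limits for the platform it is compiled on. Where long double is the same type as double, both lines report the same value.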
Upvotes: 1
Reputation: 22412
Use an arbitrary-precision mathematics library. Have a look at Arbitrary Precision Arithmetic for links to a number of them.
Upvotes: 1
Reputation: 63
Have you tried long double or float? Why would you need such a large number anyway?
Upvotes: 0