Omer Enes
Omer Enes

Reputation: 93

Efficiently convert large BigInteger to string, ToString() takes too long

I was working on a factorial function, and factorial with big integers can get REALLY long.

For example, 200000! = 973350 digits long and if i just use ToString(), things take a very long time.

200000! takes longer to convert to string then actually compute it!

I've tried to set the factorial function to a Process, and then use ProcessorAffinity to pin the Thread to a specific core so that that core ONLY converts to string, but that just took exactly the same time.

Also, the reason I want to convert it to string is because I want to write the output (FactFile) to a text file.

String Conversion code:

using (Process proc = Process.GetCurrentProcess())
{
     proc.ProcessorAffinity = (IntPtr)0x0003;
     FactFile = Res.ToString(); //FactFile is Going to be the final string, Res is the Factorial.
}

here's my code for Factorial:

for (BigInteger i = 1; i < k; i++)
{
    Res *= i; // Res is the number to Calculate the factorial of
}

200000! takes 15 seconds to compute, and then another 18 seconds to convert it to string (this can differ from cpu to cpu, I have an i7).

Reminder: What's the most efficient way to convert to string?

output:

Total Computation Time: 00:00:16.2007276
String Conversion Time: 00:00:19.4049292

Upvotes: 0

Views: 1038

Answers (2)

Theodor Zoulias
Theodor Zoulias

Reputation: 43748

Here is a 40% faster stringifier. It divides the big integer repeatedly with the number 10^10000, stringifies the remainder, and finally joins all the strings together. It can handle negative numbers also.

public static string ToDecimalString(this BigInteger value)
{
    if (value == 0) return "0";
    var digits = 10000;
    var divider = BigInteger.Pow(10, digits);
    var parts = new Stack<string>();
    while (true)
    {
        BigInteger remainder;
        value = BigInteger.DivRem(value, divider, out remainder);
        if (value != 0)
        {
            parts.Push(BigInteger.Abs(remainder).ToString().PadLeft(digits, '0'));
        }
        else
        {
            parts.Push(remainder.ToString());
            break;
        }
    }
    return String.Join("", parts);
}

It can become slightly faster by offloading the stringifying of the remainders to a background thread. Unfortunately the slowest part of the algorithm (the call to BigInteger.DivRem) is not parallelizable.

public static string ToDecimalStringParallel(this BigInteger value)
{
    if (value == 0) return "0";
    var digits = 10000;
    var divider = BigInteger.Pow(10, digits);
    var remainders = new BlockingCollection<BigInteger>();
    var parts = new ConcurrentStack<string>();
    var task = Task.Run(() =>
    {
        foreach (var remainder in remainders.GetConsumingEnumerable())
        {
            parts.Push(BigInteger.Abs(remainder).ToString().PadLeft(digits, '0'));
        }
    });
    while (true)
    {
        BigInteger remainder;
        value = BigInteger.DivRem(value, divider, out remainder);
        if (value != 0)
        {
            remainders.Add(remainder);
        }
        else
        {
            remainders.CompleteAdding();
            task.Wait();
            parts.Push(remainder.ToString());
            break;
        }
    }
    return String.Join("", parts);
}

Upvotes: 2

Peter Wishart
Peter Wishart

Reputation: 12310

For n! above n = 24, the result will have more digits than the value of input n.

That explosion of digits causes ToString to have to do more work than the factorial (n-1 multiplies vs more than n divides).

However because BigInteger operations take longer based on the magnitude of the numbers they work on, you can divide by a factor first, e.g. for your 200000! example:

var asString = Res.ToString()

Versus:

BigInteger[] bits = new BigInteger[2];
bits[0] = BigInteger.DivRem(Res,  BigInteger.Pow(new BigInteger(10), 486675), out var remainder);
bits[1] = remainder;
var strs = new string[2];
System.Threading.Tasks.Parallel.For(0, 2, (i) => 
{
    strs[i] = bits[i].ToString();
});
var asString = strs[0] + strs[1];

I found this to be 5 seconds faster but:

  • I had to choose a factor of 10^486675 to divide the digits equally - not sure how you'd do that in general
  • The initial DivRem takes 8 seconds

See also some existing attempts to speed up both factorial calculation and bigint to base 10 by breaking up the work.

The conclusion of the second one seemed to be.. don't convert large BigIntegers to base 10!

Upvotes: 1

Related Questions