Reputation: 26085
In a computer contest, I was given a problem where I had to manipulate input data. The input has been split() into an array where data[0] is the number of repetitions. There can be up to 10^18 repetitions. My program threw Exception in thread "main" java.lang.OutOfMemoryError: Java heap space, and I failed the contest.
Here's a piece of my code that's eating up memory and CPU:
long product[][]=new long[data[0]][2];
product[0][0]=data[1];
product[0][1]=data[2];
for(int a=1;a<data[0];a++){
    product[a][0]=((data[5]*product[a-1][0] + data[6]) % data[3]) + 1; // Pi = ((A*Pi-1 + B) mod M) + 1 (for all i = 2..N)
    product[a][1]=((data[7]*product[a-1][1] + data[8]) % data[4]) + 1; // Wi = ((C*Wi-1 + D) mod K) + 1 (for all i = 2..N)
}
Here's some of the input data:
980046644627629799 9 123456 18 10000000 831918484 451864686 840000324 650000765
972766173386786486 123 1 10000000 10000000 590000001 680000000 610000001 970000002
299896237124947938 681206 164538 2280874 981991 416793690 904023823 813682336 774801135
My program can only work up to about 7 or 8 digits, then it takes minutes to run. With 18 digits, it crashed almost as soon as I clicked "Run" in Eclipse.
I'm curious how it is possible to manipulate that much data on a normal computer. Please let me know if my question is unclear or you need more information. Thanks!
Upvotes: 1
Views: 395
Reputation: 15729
You can't have, and don't need, an array of such a huge length. You just need to track the two most recent values, e.g. just keep product1 and product2.
Also, consider testing if either number is a NaN after each iteration. If so, throw an Exception and give the iteration number. Because once you get a NaN they will all be NaN. Except you are using long, so scratch that. "Nevermind". :-)
Upvotes: 3
Reputation: 22969
Let's put the numbers into perspective.
Memory: One long takes 8 bytes. 10^18 pairs of longs take 16,000,000 terabytes. Way too much.
Time: 10,000,000 operations ≈ 1 second. 10^18 steps ≈ 30 centuries. Also way too much.
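To make the arithmetic behind those two estimates concrete, here is a rough sketch. The class and method names are made up for illustration, the rate of 10,000,000 steps per second is the rule of thumb used above rather than a measurement, and array object overhead is ignored.

```java
import java.math.BigInteger;

public class Estimates {
    // Bytes needed for n rows of two longs each (array overhead ignored).
    static BigInteger bytesForPairs(long n) {
        return BigInteger.valueOf(n).multiply(BigInteger.valueOf(2L * Long.BYTES));
    }

    // Years needed for n steps at an assumed rate of stepsPerSecond.
    static double yearsForSteps(long n, long stepsPerSecond) {
        double seconds = (double) n / stepsPerSecond;
        return seconds / (365.25 * 24 * 3600);
    }

    public static void main(String[] args) {
        long n = 1_000_000_000_000_000_000L; // 10^18 repetitions
        // 1 terabyte is taken as 10^12 bytes here.
        BigInteger terabytes = bytesForPairs(n).divide(BigInteger.TEN.pow(12));
        System.out.println(terabytes + " TB");
        System.out.println(yearsForSteps(n, 10_000_000L) + " years");
    }
}
```

With these assumptions the memory figure comes out at 16,000,000 TB and the running time at a little over 3,000 years, which matches the estimates above.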
You can solve the memory problem by realising that you only need the most recent values at any time, and that the entire array is redundant:
long currentP = data[1];
long currentW = data[2];
// The counter must be a long: data[0] can be far larger than Integer.MAX_VALUE
for (long a = 1; a < data[0]; a++)
{
    currentP = ((data[5] * currentP + data[6]) % data[3]) + 1;
    currentW = ((data[7] * currentW + data[8]) % data[4]) + 1;
}
The time problem is a bit trickier to solve. Since modulus is used, you can observe that the numbers must enter a cycle at some point. Once you find the cycle, you can predict what the value will be after n iterations without having to do each iteration manually.
The simplest method for finding cycles is to keep track of whether or not you visited each element, and then go through until you encounter an element you've seen before. In this situation, the amount of memory required is proportional to M and K (data[3] and data[4]). If they are too large, a more space-efficient cycle detection algorithm must be used.
Here is an example which finds the value for P:
public static void main(String[] args)
{
    // value = (A * prevValue + B) % M + 1
    final long NOT_SEEN = -1; // the code used for values not visited before
    long[] data = { 980046644627629799L, 9, 123456, 18, 10000000, 831918484, 451864686, 840000324, 650000765 };
    long N = data[0]; // the number of iterations
    long S = data[1]; // the initial value of the sequence
    long M = data[3]; // the modulus divisor
    long A = data[5]; // multiply by this
    long B = data[6]; // add this
    int max = (int) Math.max(M, S); // all the numbers (except first) must be less than or equal to M
    long[] seenTime = new long[max + 1]; // whether or not a value was seen and how many iterations it took
    // initialize the values of 'seenTime' to 'not seen'
    for (int i = 0; i < seenTime.length; i++)
    {
        seenTime[i] = NOT_SEEN;
    }
    // find the cycle
    long count = 0;
    long cycleValue = S; // the current value in the series
    while (seenTime[(int)cycleValue] == NOT_SEEN)
    {
        seenTime[(int)cycleValue] = count;
        cycleValue = (A * cycleValue + B) % M + 1;
        count++;
    }
    // cycleValue is now the first repeated value, i.e. the start of the cycle
    long cycleLength = count - seenTime[(int)cycleValue];
    long cycleOffset = seenTime[(int)cycleValue];
    long result;
    if (N < cycleOffset)
    {
        // Special case: requested iteration occurs before the cycle starts
        // Straightforward simulation
        long value = S;
        for (long i = 0; i < N; i++)
        {
            value = (A * value + B) % M + 1;
        }
        result = value;
    }
    else
    {
        // Normal case: requested iteration occurs inside the cycle
        // Simulate just the relevant part of one cycle
        long positionInCycle = (N - cycleOffset) % cycleLength;
        long value = cycleValue;
        for (long i = 0; i < positionInCycle; i++)
        {
            value = (A * value + B) % M + 1;
        }
        result = value;
    }
    System.out.println(result);
}
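If M is too large for a seen-table like the one above, one possible space-efficient alternative is Floyd's "tortoise and hare" cycle detection, which needs only a few variables. The following is a minimal sketch, not a drop-in contest solution: the class and method names are invented for illustration, and it assumes that A * value + B never overflows a long (true for the sample inputs, where A is below 10^9 and values stay at or below 10^7).

```java
public class FloydCycle {
    // One step of the recurrence: value = (A * prev + B) % M + 1
    static long step(long v, long a, long b, long m) {
        return (a * v + b) % m + 1;
    }

    // Value after n iterations starting from s, using constant memory.
    static long valueAfter(long n, long s, long a, long b, long m) {
        // Phase 1: tortoise moves 1 step, hare moves 2, until they meet inside the cycle.
        long tortoise = step(s, a, b, m);
        long hare = step(step(s, a, b, m), a, b, m);
        while (tortoise != hare) {
            tortoise = step(tortoise, a, b, m);
            hare = step(step(hare, a, b, m), a, b, m);
        }
        // Phase 2: restart the tortoise; both move 1 step and meet at the cycle start.
        long offset = 0;
        tortoise = s;
        while (tortoise != hare) {
            tortoise = step(tortoise, a, b, m);
            hare = step(hare, a, b, m);
            offset++;
        }
        // Phase 3: walk once around the cycle to measure its length.
        long length = 1;
        hare = step(tortoise, a, b, m);
        while (tortoise != hare) {
            hare = step(hare, a, b, m);
            length++;
        }
        // Only the tail plus the leftover part of one cycle has to be simulated.
        long remaining = n < offset ? n : offset + (n - offset) % length;
        long value = s;
        for (long i = 0; i < remaining; i++) {
            value = step(value, a, b, m);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(valueAfter(980046644627629799L, 9, 831918484L, 451864686L, 18));
    }
}
```

The trade-off is that the sequence is walked a few times instead of once, so the running time is still proportional to the tail length plus the cycle length, but no array of size M is needed.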
I am only giving you the solution because it looks like the contest is over. The important lesson to learn from this is that you should always check the bounds to see whether your solution is practical before you start coding it up.
Upvotes: 0
Reputation: 38531
long product[][]=new long[data[0]][2];
This is the only line in the code you pasted that allocates memory. You allocate an array whose length is data[0]! As data[0] grows, so does the array. What is the formula you're trying to apply here?
The first value in your input data, 980046644627629799, is already far too large to use as an array length. Try creating a one-dimensional array of that length and see what happens.
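As a small illustration of why that value cannot work: Java array lengths are of type int, so anything above Integer.MAX_VALUE cannot even be expressed as a length. The sketch below uses an invented helper name, fitsArrayLength, and the standard Math.toIntExact to show the conversion failing before any memory is requested.

```java
public class ArrayLimit {
    // True if n could be expressed as a Java array length (a non-negative int).
    static boolean fitsArrayLength(long n) {
        return n >= 0 && n <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long n = 980046644627629799L; // first value of the sample input
        System.out.println(fitsArrayLength(n));
        try {
            // Math.toIntExact throws ArithmeticException if the value does not fit in an int.
            int length = Math.toIntExact(n);
            long[] values = new long[length];
            System.out.println("allocated " + values.length + " elements");
        } catch (ArithmeticException e) {
            System.out.println("too large for an array length: " + e.getMessage());
        }
    }
}
```

Even a length that does fit in an int can still fail with OutOfMemoryError if the heap is smaller than the requested array.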
Are you sure you don't just want a 1 x 2 matrix that you accumulate over? Explain your intended algorithm clearly and we can help you with a better solution.
Upvotes: 2