Ace Player
Ace Player

Reputation: 73

Very large Java ArrayList has slow traversal time

Solution: My ArrayList was filled with duplicates. I modified my code to filter these out, which reduced running times to about 1 second.

I am working on a algorithms project that requires me to look at large amounts of data.

My program has a potentially very large ArrayList (A) that has every element in it traversed. For each of these elements in (A), several other, calculated elements are added to another ArrayList (B). (B) will be much, much larger than (A).

Once my program has run through seven of these ArrayLists, the running time goes up to approximately 5 seconds. I'm trying to get that down to < 1 second, if possible.

I am open to different ways to traverse the ArrayList, as well as using a completely different data-structure. I don't care about the order of the values inside the lists, as long as I can go through all values, very fast. I have tried a linked-list and it was significantly slower.

Here is a snippet of code, to give you a better understanding. The code tries to find all single-digit permutations of a prime number.

public static Integer primeLoop(ArrayList current, int endVal, int size)
{        
    Integer compareVal = 0;
    Integer currentVal = 0;
    Integer tempVal = 0;
    int currentSize = current.size()-1;

    ArrayList next = new ArrayList();

    for(int k = 0; k <= currentSize; k++)
    {
        currentVal = Integer.parseInt(current.get(k).toString());
        for(int i = 1; i <= 5; i++)
        {                                
            for(int j = 0; j <= 9; j++)
            {
                compareVal = orderPrime(currentVal, endVal, i, j);
                //System.out.println(compareVal);

                if(!compareVal.equals(tempVal) && !currentVal.equals(compareVal))
                {     
                    tempVal = compareVal;
                    next.add(compareVal);

                    //System.out.println("Inserted: "+compareVal + "  with parent:  "+currentVal);

                    if(compareVal.equals(endVal))
                    {
                        System.out.println("Separation: " + size);
                        return -1;
                    }
                }
            }
        }
    }
    size++;
    //System.out.println(next);
    primeLoop(next, endVal, size); 
    return -1;
}

*Edit: Removed unnecessary code from snippet above. Created a currSize variable that stops the program from having to call the size of (current) every time. Still no difference. Here is an idea of how the ArrayList grows: 2, 29, 249, 2293, 20727, 190819,

Upvotes: 1

Views: 2212

Answers (3)

Thomas Kl&#228;ger
Thomas Kl&#228;ger

Reputation: 21435

Here is an idea of how the ArrayList grows: 2, 29, 249, 2293, 20727, 190819

Your next list grows too large, so it must contain duplicates:

  • 190_819 entries for 100_000 numbers?
  • According to primes.utm.edu/howmany.html there are only 9,592 primes up to 100_000.

Getting rid of the duplicates will certainly improve your response times.

Upvotes: 1

Stuart Marks
Stuart Marks

Reputation: 132360

When something is slow, the typical advice is to profile it. This is generally wise, as it's often difficult to determine what's the cause of slowness, even for performance experts. Sometimes it's possible to pick out code that's likely to be a performance problem, but this is hit-or-miss. There are some likely things in this code, but it's hard to say for sure, since we don't have the code for the orderPrime() and primeLoop() methods.

That said, there's one thing that caught my eye. This line:

    currentVal = Integer.parseInt(current.get(k).toString());

This gets an element from current, turns it into a string, parses it back to an int, and then boxes it into an Integer. Conversion to and from String is pretty expensive, and it allocates memory, so it puts pressure on garbage collection. Boxing primitive int values to Integer objects also allocates memory, contributing to GC pressure.

It's hard to say what the fix is, since you're using the raw type ArrayList for current. I surmise it might be ArrayList<Integer>, and if so, you could just replace this line with

    currentVal = (Integer)current.get(k);

You should be using generics in order to avoid the cast. (But that doesn't affect performance, just the readability and type-safety of the code.)

If current doesn't contain Integer values, then it should. Whatever it contains should be converted to Integer beforehand, instead of putting conversions inside a loop.

After fixing this, you are still left with boxing/unboxing overhead. If performance is still a problem, you'll have to switch from ArrayList<Integer> to int[] because Java collections cannot contain primitives. This is inconvenient, since you'll have to implement your own list-like structure that simulates a variable-length array of int (or find a third party library that does this).

But even all of the above might not be enough to make your program run fast enough. I don't know what your algorithm is doing, but it looks like it's doing linear searching. There are a variety of ways to speed up searching. But another commenter suggested binary search, and you said it wasn't allowed, so it's not clear what can be done here.

Upvotes: 2

Eva
Eva

Reputation: 449

  1. Why you have this line

    current.iterator();

You don't use the iterator at all, you don't even have a variable for it. It's just waisting of time.

  1. for(int k = 0; k <= current.size()-1; k++)

Instead of counting size every iteration, create value like:

int curSize = current.size() - 1;

And use it in loop.

It can save some time.

Upvotes: 1

Related Questions