Reputation: 1374
I made a median filter algorithm and I want to optimize it. Currently it's taking around 1 second to filter 2MM lines(a file read into an ArrayList elements
) and I am trying to reduce it to less(maybe half the time?) I'm using ArrayLists for my algorithm and minimized the use of nested loops as well to avoid an increase in time, however I still can't achieve lower than 0.98 seconds tops.
Here's a code snippet that does the median filter:
//Start Filter Algorithm 2
int index=0;
while(index<filterSize){
tempElements.add(this.elements.get(index+counter)); //Add element to a temporary arraylist
index+=1;
if(index==filterSize){
outputElements.add(tempElements.get((filterSize-1)/2)); //Add median Value to output ArrayList
tempElements.clear(); //Clear temporary ArrayList
index = 0; //Reset index
counter+=1; //Counter increments by 1 to move to start on next element in elements ArrayList
}
if(elementsSize-counter <filterSize){
break; //Break if there is not enough elements for the filtering to work
}
}
What's happening is that I'm looping through the elements
arraylist for the filterSize
I provided. Then I add the elements to a temporary(tempElements
) arraylist, sort it using Collections.sort()
(this is what I want to avoid), find the median value and add it to my final output arraylist. Then I clear the tempElements
arraylist and keep going through my loop until I cannot filter anymore due to the lack of elements(less than filterSize
).
I'm just looking for a way to optimize it and get it faster. I tried to use a TreeSet but I cannot get the value at an index from it.
Thanks
Upvotes: 3
Views: 4200
Reputation: 28332
As an alternative to Giovanni Botta's excellent answer:
Suppose that you have an array [7, 3, 8, 4, 6, 6, 2, 4, 6]
and a filterSize
of 4. Then our first temp array will be [7, 3, 8, 4]
and we can sort it to get [3, 4, 7, 8]
. When we compute our next temporary array, we can do it in linear (or better?) time as follows:
We can repeat this for all temp arrays after the initial sort. If you're spending a lot of time sorting subarrays, this might not be a bad way to go. The trick is that it increases required storage since you need to remember the order in which to remove the entries, but that shouldn't be a big deal (I wouldn't think).
Upvotes: 2
Reputation: 9816
The Java Collections.sort()
implementation is as fast as it gets when it comes to sorting (dual pivot quick sort).
The problem here is not in the nitty gritty details but the fact that you are sorting at all! You only need to find the median and there are linear algorithms for that (sorting is log-linear). See selection for some inspiration. You might need to code it yourself since I don't think the java library has any public implementation available.
The other thing I suggest is to use a fixed size array (created once) instead of an ArrayList
. Since you know the size of the filter beforehand that will give you a small speed boost.
Also I don't see how avoiding for loops helps performance in any way. Unless you profiled it and proved that it's the right thing to do, I would just write the most readable code possible.
Finally, TreeSet
or any other kind of sorted data structure won't help either because the time complexity is log-linear for n insertions.
Upvotes: 9