Danish
Danish

Reputation: 3798

Java Arrays.sort() Taking a Long Time

I am using Java's Arrays.sort() function to sort a list of files by their last modified time. The sort for 245 files is taking around 5 seconds. This seems too long to me. I feel it shouldn't take more than 0.5 seconds. Is that a good assumption? What am I doing wrong? OR does this sound normal?

public static class LastModifiedComparator implements Comparator<File> {
    @Override
    public int compare(File f1, File f2) {
        return (int)(f1.lastModified() - f2.lastModified());
    }       
}

File folder = new File( "C:\\Whatever\\" );
File[] filesInFolder = folder.listFiles();
logger.debug("Starting File Sort");
Arrays.sort(filesInFolder, new LastModifiedComparator());
logger.debug("Done File Sort");

Output in Log

2012-08-10 14:24:20,333 DEBUG http-8080-4 <ClassName>:73 - Starting File Sort
2012-08-10 14:24:25,915 DEBUG http-8080-4 <ClassName>:75 - Done File Sort

Upvotes: 16

Views: 5238

Answers (5)

Marko Topolnik
Marko Topolnik

Reputation: 200148

You will need to improve your Comparator logic. You need to cache the lastModified() values because the implementation of that method is quite slow. I suggest wrapping the File instances into a comparable object of your making that will cache the value:

public class FileLmWrapper implements Comparable<FileLmWrapper> {
  public final File f;
  public final long lastModified;
  public FileLmWrapper(File f) { 
    this.f = f; 
    lastModified = f.lastModified();
  }
  public int compareTo(FileLmWrapper other) {
    return Long.compare(this.lastModified, other.lastModified);
  }
}

Upvotes: 24

sundar
sundar

Reputation: 1760

The sort implemented in java in modified Quick Sort tuned Merge Sort which will have average running time complexity of O(nlogn).
So we need to concentrate on your File operations like getting lastModifiedTime. Are you sure those files are local files or shared drive which takes the network latency?

Upvotes: 0

yshavit
yshavit

Reputation: 43391

File.lastModified has to go to the OS to query when the file was last modified -- it's not cached. You're doing that twice per comparison, and Arrays.sort uses a mergesort -- O(n log n). Plugging in 245 for n, That's about 580 comparisons, or 1100 calls to the OS to get the last modified time. That means you can get about 230 last-modified calls per second. That seems maybe a bit slow, but certainly more plausible than an in-JVM comparison taking that long

As Marko Topolnik abd NgSan points out above, the fix would be to first cache the last-modified times for all the Files. I would do it by creating a new class object that combines the File and that time, and then sort those objects. That way you'll have only 245 calls to File.lastModified, and the sort should take about 1/5th the time.

Upvotes: 23

Bharat Sinha
Bharat Sinha

Reputation: 14363

Your compare operation

@Override
public int compare(File f1, File f2) {
    return (int)(f1.lastModified() - f2.lastModified());
}  

isn't just a getter but issues a call to get the information from file system so the high response time of sort is moreover due to the performance of lastModified() than compare().

Upvotes: 1

MgSam
MgSam

Reputation: 12803

I don't know for sure, but it sounds like it's doing disk I/O each time you read the Modified Time- thus the slowness. It might be faster to simply get the modified times in an object along with the File object, and then sort.

Upvotes: 1

Related Questions