Reputation: 335
for (int i = 0; i < len; i++) {
    Set<Integer> fileTerm = new HashSet<Integer>();
    ....
}
This set can get huge on each iteration. The other way is to create the set once, outside the loop, and clear it at the end of every iteration:
Set<Integer> fileTerm = new HashSet<Integer>();
for (int i = 0; i < len; i++) {
    ....
    fileTerm.clear();
}
Upvotes: 1
Views: 76
Reputation: 469
I made a simple test (no warm-up, no real data, just a small demonstration):
int sum = 0;
long start = System.currentTimeMillis();
// Set<Integer> set = new HashSet<Integer>(); // (2)
for (int i = 0; i < 100_000; i++) {
    Set<Integer> set = new HashSet<Integer>(); // (1)
    // Set<Integer> set = new HashSet<Integer>(5_000); // (3)
    for (int j = 0; j < 5_000; j++) {
        set.add(j);
    }
    sum += set.contains(78285) ? 1 : 0;
    sum += set.contains(85) ? 1 : 0;
    // set.clear(); // (2)
}
System.out.println((System.currentTimeMillis() - start) + "ms");
System.out.println(sum);
Times in seconds (JDK 1.7.0_25, 32-bit), three runs each:

(1) 24   23.9 24   - your 1st option (new set each iteration)
(2) 18.8 18.6 18.7 - your 2nd option (clear and reuse)
(3) 18.4 18.4 18.3 - initial capacity set to 5,000
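A side note on option (3): new HashSet<Integer>(5_000) sets the initial capacity of the backing HashMap, but with the default load factor of 0.75 the table still resizes once it holds more than 3,750 entries. A minimal sketch of pre-sizing that avoids any resize (the helper name is my own, not from the JDK):

static <T> Set<T> presizedHashSet(int expectedSize) {
    // Capacity chosen so that expectedSize elements stay under the
    // default 0.75 load factor, so the table never needs to resize.
    return new HashSet<T>((int) Math.ceil(expectedSize / 0.75));
}

With expectedSize = 5_000 this requests a capacity of 6,667, which HashMap rounds up internally to the next power of two (8,192).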
Upvotes: 1
Reputation: 691
In my opinion, the second way is better than the first: it creates the Set object only once, whereas the first way creates a new Set object on every iteration of the loop.
Upvotes: 0
Reputation: 200256
The key difference between creating a new set and reusing the old one by clearing it is that clearing does not shrink the hash table's capacity back to its initial setting. In your case this is probably a good thing, but the savings are minimal; you probably do enough work in that loop for the difference to be unnoticeable.
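If you want to see this for yourself, here is a reflection-based sketch. It peeks at OpenJDK's private fields (map in HashSet, table in HashMap), so it is tied to that implementation and on Java 9+ needs --add-opens java.base/java.util=ALL-UNNAMED:

import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;

public class ClearDemo {
    // Returns the length of the backing array, i.e. the current capacity.
    static int capacity(Set<?> set) throws Exception {
        Field map = HashSet.class.getDeclaredField("map");
        map.setAccessible(true);
        Field table = HashMap.class.getDeclaredField("table");
        table.setAccessible(true);
        Object[] buckets = (Object[]) table.get(map.get(set));
        return buckets == null ? 0 : buckets.length;
    }

    public static void main(String[] args) throws Exception {
        Set<Integer> set = new HashSet<Integer>();
        for (int i = 0; i < 5_000; i++) set.add(i);
        System.out.println(capacity(set)); // e.g. 8192
        set.clear();
        System.out.println(capacity(set)); // still 8192: capacity is retained
    }
}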
On the other hand, creating a new set each time makes your code more robust and easier to reason about. If you ever introduce a continue statement and forget to call clear() before it, you get a broken program.
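For example (a minimal sketch; the files array and its methods are made up for illustration):

Set<Integer> fileTerm = new HashSet<Integer>();
for (int i = 0; i < files.length; i++) {
    if (files[i].isEmpty()) {
        continue; // skips the clear() below, so the next iteration
                  // starts with the previous file's terms still in the set
    }
    fileTerm.addAll(files[i].getTerms());
    process(fileTerm);
    fileTerm.clear();
}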
Upvotes: 1