Reputation: 272
I have a program which simply removes duplicate elements of a character array using HashSet.
Here is my program:
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
public class MainClass {
public static void main(String[] arg) {
double sT = System.nanoTime();
Character[] data = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h',
'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f',
'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r',
's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd',
'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p',
'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b',
'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l',
'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x',
'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h',
'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
'u', 'v', 'w', 'x', 'y', 'z' };
Set<Character > uniqueSet = new HashSet<Character>(Arrays.asList(data));
Character[] strArr = new Character[uniqueSet.size()];
uniqueSet.toArray(strArr);
for(Character str:strArr){
System.out.println(str);
}
System.out.println(System.nanoTime() - sT);
}
}
It gives the desired output. But problem is execution time. Is there any ways that I can implement in my program to reduce its execution time?
Upvotes: 4
Views: 553
Reputation: 200158
Let Google Caliper take care of microbenchmarking:
0% Scenario{vm=java, trial=0, benchmark=Array} 12868.77 ns; σ=523.07 ns @ 10 trials
us
12.9
vm: java
trial: 0
benchmark: Array
12.9 microseconds. Not quite the same as your 1319 microseconds, is it :)
For reference, this is the exact code:
import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;
public class Performance extends SimpleBenchmark {
public int timeArray(int reps) {
int sum = 0;
for (int rep = 0; rep < reps; rep++) {
Character[] data = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p',
'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i',
'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b',
'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u',
'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g',
'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's',
't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l',
'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'a', 'b', 'c', 'd', 'e',
'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x',
'y', 'z' };
sum += new HashSet<>(Arrays.asList(data))
.toArray(new Character[new HashSet<Character>
(Arrays.asList(data)).size()]).length;
}
return sum;
}
public static void main(String... args) {
Runner.main(Performance.class, args);
}
}
Upvotes: 2
Reputation: 533500
The first thing to realise is that writing to the console is very expensive. If you remove
System.out.println(str);
This will speed up the code dramatically.
The other thing to note is that you code isn't run long enough to warm up the code (it won't be compiled) You should run the test for about 2 - 10 seconds to see how long it would take with warmed up code.
The time taken
End to end that is a 2000x fold performance improvement ;)
The final code looks like
StringBuilder strArr = null;
long sT = 0;
int runs = 1000000;
for (int i = -40000; i < runs; i++) {
if (i == 0)
sT = System.nanoTime();
String text = "qwertyuiopasdfghjklzxcvbnm" +
"qwertyuiopasdfghjklzxcvbnm" +
"qwertyuiopasdfghjklzxcvbnm" +
"qwertyuiopasdfghjklzxcvbnm" +
"qwertyuiopasdfghjklzxcvbnm" +
"qwertyuiopasdfghjklzxcvbnm" +
"qwertyuiopasdfghjklzxcvbnm" +
"qwertyuiopasdfghjklzxcvbnm" +
"qwertyuiopasdfghjklzxcvbnm";
BitSet bs = new BitSet(256);
for (int j = 0, len = text.length(); j < len; j++)
bs.set(text.charAt(j));
strArr = new StringBuilder();
for (int j = -1; (j = bs.nextSetBit(j+1)) >= 0; )
strArr.append((char) j);
}
System.out.printf("Took an average of %,d ns%n", (System.nanoTime() - sT) / runs);
System.out.print(strArr);
prints
Took an average of 1,287 ns
abcdefghijklmnopqrstuvwxyz
Upvotes: 3
Reputation: 70929
As the different types of elements that you can have is really small, you can easily use a simple array instead of a hashset(an approach similar to set or counting sort). If you only care for the non-capital English letters declare an array boolean met[26];
, if you need to be able to support all characters use an boolean met[256];
.
Than iterate over the array and only add a character to the result if its met
value is false. When adding the character to the result don't forget to mark it as used.
No hashing involved and thus - better performance.
EDIT: as it seems there is some confusion with what I mean I will try to add a code sample
boolean met[] = new boolean[256]; // Replace 256 with the size of alphabet you are using
List<Character> res = new ArrayList<Character>();
for(Character c:data){
int int_val = (int)c.charValue();
if (!met[int_val]) {
met[int_val] = true;
res.add(c);
}
}
// res holds the answer.
Upvotes: 5
Reputation: 2137
You can use Google Guava to do it for you. There is a Stackoverflow discussion about it.
Upvotes: 0