Reputation: 111
I have the following function:
public void scanText(char[] T){
int q=0;
for(int i=0;i<T.length;i++){
q = transFunc[preCompRow[q]+T[i]];
if(q==pattern.length){
System.out.println("match found at position: "+(i-pattern.length+2));
}
}
}
This function scans a char Array searching for matches of a given pattern, which is stored as a finite automata. The transition function of the automata is stored in the variable called transFunc.
I am testing this function in a text with 8 millions of characters and using 800000 patterns. The thing is the accession of the array preCompRow[q] (which is an int[]) is very slow. The performance is greatly improved if I delete the preCompRow[q] of the code. I think this might be because in every loop the q variable has a different non-sequential value (2, 56, 18, 9 ..).
Is there any better way to access to an array in a non-sequential manner?
Thanks in advance!
Upvotes: 3
Views: 147
Reputation: 718688
One possible explanation is that your code is seeing poor memory performance due to poor locality in its memory access patterns.
The role of the memory caches in a modern computer is to deal with the speed mismatch between processor instruction times (less than 1 ns) and main memory (5 to 10 ns or more). They work best when your code gets a cache hit most time it fetches from memory.
A modern Intel chipset caches memory in blocks of 64 bytes, and loads from main memory in burst mode. (That corresponds to 16 int
values.) The L1 cache on (say) an I7 processor is 2MB.
If your application is able to access the data in a large array (roughly) sequentially, then 7 out of 8 accesses will be a cache hits. If the access pattern is non-sequential and the "working set" of is a large multiple of the cache size, then you may end up with a cache miss on each memory access.
If memory access locality is the root of yoiur problems, then your option are limited:
Recoding your existing in C or C++ may give a performance improvement, but the same memory locality problems will bite you there as well.
I am not aware of any tools that can be used to measure cache performance in Java applications.
Upvotes: 1