Reputation: 1541
I'm basically benchmarking some high speed string matching algorithms, I came across a few.
Backwards Non-deterministic DAWG (Directed acyclic word graph) Matching algorithm by Gonzalo Navarro and Mathieu Raffinot. See "A Bit-Parallel Approach to Suffix Automata: Fast Extended String Matching"
Horspool's improved version of the Boyer-Moore String searching algorithm. See "Practical fast searching in strings"
Shift-Or algorithm with mismatches
Are there any other better high speed string matching algorithms i can try ?
Edit : There is another thread in similar lines , which has good references too
Upvotes: 14
Views: 3058
Reputation: 1
import java.util.Scanner;
public class StringMatch {
static int temp,i=0,j=0; static boolean flag=true,matcher=false;
static String str=null,mstr=null;static char astr[],amstr[];
static void getter(){
Scanner sc = new Scanner(System.in);
str = sc.nextLine();
//String str="today is Monday";
astr=str.toCharArray();
mstr = sc.nextLine();
//String mstr="is";
amstr=mstr.toCharArray();
}
static void stringMatch(){
while(i<astr.length){
if(astr[i]==amstr[j]){
while((j!=amstr.length)&&flag){temp=i;
if(astr[i]!=amstr[j]) {flag=false;matcher=false;}
else{matcher=true;}
i++;j++;
//System.out.println(i+"\t"+j);
}if(matcher==true)break;i=temp;}i++;j=0;flag=true;
}
if(matcher==true) {System.out.println("true");}
else {System.out.println("false");}
}
public static void main(String[] args) {
StringMatch.getter();
StringMatch.stringMatch();
}
}
Upvotes: -1
Reputation: 14205
Two-way string matching is, to my knowledge, the best general-purpose algorithm for string matching. It has linear worst-case complexity, uses constant space, and doesn't backtrack too much more than necessary. And the theory behind it is very nice.
If you know that your users aren't jerks, naive string matching optimised for your architecture will win for short "needles" while a Boyer-Moore variant will start really doing the sublinear thing for long "needles." However, naive string matching has a quadratic worst case and Boyer-Moore can be made to examine all characters in the input. The extra tables needed to handle mismatches actually carry a surprisingly severe penalty over two-way string matching.
Upvotes: 2
Reputation: 31604
You can also try
Upvotes: 0