sashank
sashank

Reputation: 1541

High speed string matching algorithms

I'm basically benchmarking some high speed string matching algorithms, I came across a few.

  1. Backwards Non-deterministic DAWG (Directed acyclic word graph) Matching algorithm by Gonzalo Navarro and Mathieu Raffinot. See "A Bit-Parallel Approach to Suffix Automata: Fast Extended String Matching"

  2. Horspool's improved version of the Boyer-Moore String searching algorithm. See "Practical fast searching in strings"

  3. Shift-Or algorithm with mismatches

  4. KMP

Are there any other better high speed string matching algorithms i can try ?

Edit : There is another thread in similar lines , which has good references too

Upvotes: 14

Views: 3058

Answers (3)

Deepak Kumar
Deepak Kumar

Reputation: 1

import java.util.Scanner;

public class StringMatch {

static int temp,i=0,j=0; static boolean flag=true,matcher=false;

    static String str=null,mstr=null;static char astr[],amstr[];

        static void getter(){
        Scanner sc = new Scanner(System.in);
        str = sc.nextLine();
        //String str="today is Monday"; 
        astr=str.toCharArray();
         mstr = sc.nextLine();
        //String mstr="is"; 
         amstr=mstr.toCharArray();
    }
    static void stringMatch(){
        while(i<astr.length){
            if(astr[i]==amstr[j]){
            while((j!=amstr.length)&&flag){temp=i;
                if(astr[i]!=amstr[j]) {flag=false;matcher=false;}
                else{matcher=true;}
                i++;j++;
                //System.out.println(i+"\t"+j);
            }if(matcher==true)break;i=temp;}i++;j=0;flag=true;

        }
        if(matcher==true) {System.out.println("true");}
        else    {System.out.println("false");}
    }
    public static void main(String[] args) {

    StringMatch.getter();
    StringMatch.stringMatch();

    }
}

Upvotes: -1

tmyklebu
tmyklebu

Reputation: 14205

Two-way string matching is, to my knowledge, the best general-purpose algorithm for string matching. It has linear worst-case complexity, uses constant space, and doesn't backtrack too much more than necessary. And the theory behind it is very nice.

If you know that your users aren't jerks, naive string matching optimised for your architecture will win for short "needles" while a Boyer-Moore variant will start really doing the sublinear thing for long "needles." However, naive string matching has a quadratic worst case and Boyer-Moore can be made to examine all characters in the input. The extra tables needed to handle mismatches actually carry a surprisingly severe penalty over two-way string matching.

Upvotes: 2

Thomas Ahle
Thomas Ahle

Reputation: 31604

You can also try

Upvotes: 0

Related Questions