roryok
roryok

Reputation: 9645

Whats the best way to optimise my regex matching

I've got an app with a textbox in it. The user enters text into this box.

I have the following function triggered in an OnKeyUp event in that textbox

private void bxItemText_KeyUp(object sender, System.Windows.Input.KeyEventArgs e)
{               
    // rules is an array of regex strings
    foreach (string rule in rules)
    {
        Regex regex = new Regex(rule);
        if (regex.Match(text.Trim().ToLower()))
        {
            // matched rule is a variable
            matchedRule = rule;
            break;
        }
    }
}

I've got about 12 strings in rules, although this can vary.

Once the length of the text in my textbox gets bigger than 80 characters performance starts to degrade. typing a letter after 100 characters takes a second to show up.

How do I optimise this? Should I match on every 3rd KeyUp ? Should I abandon KeyUp altogether and just auto match every couple of seconds?

Upvotes: 1

Views: 510

Answers (7)

Rob
Rob

Reputation: 4477

Given that you seem to be matching keywords, can you just perform the match on the current portion of text that's been edited (i.e. in the vicinity of the cursor)? Might be tricky to engineer, especially for operations like paste or undo, but scope for a big performance gain.

Upvotes: 0

Rob
Rob

Reputation: 4477

Pre-compile your regexes (using RegexOptions.Compiled). Also, can you make the Trim and ToLower redundant by expanding your regex? You're running Trim and ToLower once for each rule, which is inefficient even if you can't eliminate them altogether

You can try and make your rules mutually exclusive - this should speed things up. I did a short test: matching against the following

"cat|car|cab|box|balloon|button"

can be sped up by writing it like this

"ca(t|r|b)|b(ox|alloon|utton)"

Upvotes: -1

ΩmegaMan
ΩmegaMan

Reputation: 31686

You don't need to create a new regex object each time. Also using static call will cache the pattern if used before (since .Net 2). Here is how I would rewrite it

matchedRule = Rules.FirstOrDefault( rule => Regex.IsMatch(text.Trim().ToLower(), rule));

Upvotes: 0

Nikita
Nikita

Reputation: 422

If you need pattern determination for Each new symbol, and you care about performance, than Final State Machine seems to be the best option... That is much harder way. You should specify for each symbol list of next symbols, that are allowed. And OnKeyUp you just walk on next state, if possible. And you will have the amount of patterns, that input text currently matches. Some useful references, that I could found:

FSM example

Guy explaining how to convert Regex to FSM

Regex - FSM converting discussion

Upvotes: 0

polkduran
polkduran

Reputation: 2551

Use static method calls instead of create a new object each time, static calls use a caching feature : Optimizing Regular Expression Performance, Part I: Working with the Regex Class and Regex Objects.

That will be a major improvement in performance, then you can provide your regexes (rules) to see if some optimization can be done in the regexes.

Other resources :

Upvotes: 1

Nikita
Nikita

Reputation: 422

Combining strings to one on Regex level will work faster than foreach in code. Combining two Regex to one

Upvotes: 0

Ibrahim Najjar
Ibrahim Najjar

Reputation: 19423

How do I optimise this? Should I match on every 3rd KeyUp ? Should I abandon KeyUp altogether and just auto match every couple of seconds?

I would go with the second option, that is abandon KeyUp and trigger the validation every couple of seconds, or better yet trigger the validation when the TextBox loses focus.

On the other hand, I should suggest to cache the regular expressions beforehand and compile them because it seems like you are using them over and over again, in other words instead of storing the rules as strings in that array, you should store them as compiled regular expression objects when they are added or loaded.

Upvotes: 4

Related Questions