How to use RegEx to determine which character is allowed next

Question

How can I use Regex to determine which character is allowed as next input?

I want to process sequential input from a user and allow or reject the next character depending on whether it is legal.

For example, the following RegEx: ^\d{0,1}\d:\d\d$ matches strings like 12:34 and 1:23. It works if the whole string is given, but if the chars are typed in one after another, I can't determine whether the substring typed so far still fits the regex.

Given the substring 1 I would like to determine that the next character should be [0-9] or :.

How can this be achieved?

Thanks for any replies!

FreeG · Accepted Answer

Thank you all for your answers. I learned something new about regex, but it didn't serve my needs. Maybe I wasn't specific enough in this question.

What I really wanted was to process chars sequentially. I wanted a regex engine to which I can pass an arbitrary regex pattern and then query whether the next user input will be valid (based on all previous inputs). I also wanted to be able to retrieve the set of characters that are possible for the next char, for auto-completion mechanisms.

//pseudo code

void main(string[] args){
    Regex regex = new Regex("^1(2|3)4$");
    RegexProcessor processor = new RegexProcessor(regex);

    bool step1 = processor.Input('1'); // returns true and advances to the next step
    char[] validInput = processor.GetValidInput(); // returns new char[]{'2','3'}

    bool step2 = processor.Input('4'); // returns false because step 2 only accepts (2|3)
}

Solution: Get a DFA/NFA based regex engine.
I used https://github.com/moodmosaic/Fare. It is state based, and every state exposes its transitions, from which you can get the chars that are valid for moving to the next state. Implement a runner that maintains the current state and lets you step through the input text one char at a time. Look at BasicOperations.Run(Automaton a, string s) for an example of how to implement an IsMatch with this library.
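A minimal sketch of such a runner follows. It does not use Fare: the transition table for ^1(2|3)4$ is written by hand and stands in for the states that an automaton library would build from the pattern. StepRunner, ForExamplePattern and IsComplete are hypothetical names chosen to mirror the pseudo code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical step-by-step runner over a hand-written DFA.
// A library such as Fare would derive the states from the pattern instead.
public sealed class StepRunner
{
    // transitions[state][ch] = next state
    private readonly Dictionary<int, Dictionary<char, int>> transitions;
    private readonly HashSet<int> acceptStates;
    private int current;

    public StepRunner(Dictionary<int, Dictionary<char, int>> transitions,
                      int start, IEnumerable<int> acceptStates)
    {
        this.transitions = transitions;
        this.acceptStates = new HashSet<int>(acceptStates);
        current = start;
    }

    // Advances on a legal character; leaves the state untouched otherwise.
    public bool Input(char c)
    {
        if (transitions.TryGetValue(current, out var edges) &&
            edges.TryGetValue(c, out var next))
        {
            current = next;
            return true;
        }
        return false;
    }

    // Characters that have an outgoing transition from the current state.
    public char[] GetValidInput() =>
        transitions.TryGetValue(current, out var edges)
            ? edges.Keys.OrderBy(ch => ch).ToArray()
            : Array.Empty<char>();

    // True when the input consumed so far is a complete match.
    public bool IsComplete => acceptStates.Contains(current);

    // DFA for ^1(2|3)4$: 0 -'1'-> 1, 1 -'2'|'3'-> 2, 2 -'4'-> 3 (accepting).
    public static StepRunner ForExamplePattern() =>
        new StepRunner(
            new Dictionary<int, Dictionary<char, int>>
            {
                [0] = new Dictionary<char, int> { ['1'] = 1 },
                [1] = new Dictionary<char, int> { ['2'] = 2, ['3'] = 2 },
                [2] = new Dictionary<char, int> { ['4'] = 3 },
            },
            start: 0,
            acceptStates: new[] { 3 });
}

public static class Demo
{
    public static void Main()
    {
        var runner = StepRunner.ForExamplePattern();
        Console.WriteLine(runner.Input('1'));                        // True
        Console.WriteLine(string.Join(",", runner.GetValidInput())); // 2,3
        Console.WriteLine(runner.Input('4'));                        // False
        Console.WriteLine(runner.Input('3'));                        // True
        Console.WriteLine(runner.Input('4'));                        // True
        Console.WriteLine(runner.IsComplete);                        // True
    }
}
```

A rejected char leaves the runner where it was, so the UI can simply ignore the keystroke and keep offering the same set of valid chars.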

Why can't you use the standard Regex class
The standard library focuses on being efficient and on supporting powerful regex features. The state of the art there is a pattern iteration approach with backtracking rather than a text iteration approach, and there are good reasons for that. Furthermore, it compiles the regex pattern to some kind of machine instructions so that it can be executed very fast. As a result, there is no way to hook in and process the input step by step. This is why you need a DFA/NFA based approach, which probably won't be as fast but has other strengths.
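To illustrate the limitation with the pattern from the question, here is a small sketch using only the standard Regex class. IsMatch can only answer for the string as a whole, so a legal but incomplete prefix is reported the same way as invalid input. PrefixDemo and IsFullMatch are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixDemo
{
    private static readonly Regex Time = new Regex(@"^\d{0,1}\d:\d\d$");

    // Whole-string check; this is all the standard class offers here.
    public static bool IsFullMatch(string s) => Time.IsMatch(s);

    public static void Main()
    {
        Console.WriteLine(IsFullMatch("12:34")); // True
        Console.WriteLine(IsFullMatch("1"));     // False, although "1" is a legal prefix
        Console.WriteLine(IsFullMatch("1:"));    // False, for the same reason
        Console.WriteLine(IsFullMatch("x"));     // False, and indistinguishable from the two above
    }
}
```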
