Dror
Dror

Reputation: 7303

regular expression "contains" another regular expression

Is there a way to test if a regular expression "contains" another regular expression?
For example:

RegEX1 = "a.*b";
RegEx2 = "a1.*b";

RegEX1 "contains" RegEX2.

As far as I know - this can't be done, am I wrong?


OK, joel.neely has shown that it can be done (haven't read it yet...) academically.

Can it be done in a programming language, say C# ?
How effective will that be? How long will it take to test 1000 pairs?

Upvotes: 8

Views: 896

Answers (2)

Svend
Svend

Reputation: 8158

Converting the two expressions to the equivalent state machines, and checking all paths in both machines allow the same matches, should do the trick. The pumping lemme should obviously be minded, so avoid revisiting old nodes.

It would only work for "simple" regular expressions (or real, what have you, perls recursive expressions are much more expressive).

While a graph of the state machine could have a large number of paths, it should still be limited (esp if the source for the expressions are human). So you'd find all the allowable paths of RegEX1, and check, each in turn, if it's allowable in RegEX2. If all paths are valid, you'd know that the one is contained in the other.

Upvotes: 0

joel.neely
joel.neely

Reputation: 30943

Yes.

This paper contains a detailed discussion of the topic (see section 4.4).

Upvotes: 6

Related Questions