jojo hohanon
jojo hohanon

Reputation: 131

What is it called to rule out a substring from matching a regex

I will start with specific examples since I lack the term I am looking for.

Let's start with the regex [^x]xyx[^x].

The regex matches string abcxyxdef. The regex begins to match string abcx, and might succeed depending on what the next (unknown) character is. Similarly the regex might match yxdef, depending on what the preceding character is. The regex might match abc as well, as a prefix or suffix of xyx.

(I am aware I'm being sloppy with "matches" here; this is called "search" in many libraries. anyways)

In all of the above cases, the input string was possibly matched by the regex. Or somewhat more precisely: the string could possibly intersect with the part of the input matched by the regex.

Contrast this with the string xxyx. This cannot be part of a match of the regex. xx neither. The string xxyxaxy could however, by starting the match on the a.

determining this is a straight-forward application of the DFA expansion of the regex:

sketchdef possible(input, DFA): 
  # case: we start matching middle of input
  for each character position in input
    walk the DFA from entry node with rest of input
    return true if we run out of input avoiding the reject node, 
    return trye if we exited DFA
  # case: we started matching before input started
  for each node in DFA
    walk the DFA from that node with all of input
    return true if we run out of input avoiding the reject node, 
    return trye if we exited DFA
  return false

I am looking for the precise name of this property as I might find it in literature. Or if you are a subject matter expert, I'd be satisfied with the assertion that properties this trivial don't have any name in literature.

possible topics: language theory, computer science, regular languages

I searched all the google; flipped through sipser.

Upvotes: 0

Views: 22

Answers (0)

Related Questions