Reputation: 4356
For example, given the grammar
Expr -> Number | Number '+' Expr
Number -> [1-9][0-9]*
we see that for + 1
there exists a sentence (e.g. 1 + 1
) which is parsed by the grammar and for which + 1
is sub-sentence. Is there a general algorithm for this? I think that if we can put some kind of flags in the parser we should be able to tell it to skip some initial and some final tokens while parsing but I'm not sure whether this would work. Any ideas?
Upvotes: 3
Views: 142
Reputation: 65468
I'm sure there's a better way, but this argument shows that the class of context-free languages is closed under this operation via a linear-time transformation of a Chomsky normal form grammar.
The idea is to introduce, for each nonterminal symbol A, three other symbols Apre, Asuf, Asub, which match prefixes, suffixes, and substrings of strings matched by A.
For transitions A -> s, add
Apre ->
Apre -> s
Asuf ->
Asuf -> s
Asub ->
Asub -> s.
For transitions A -> B C, add
Apre -> Bpre
Apre -> B Cpre
Asuf -> Csuf
Asuf -> Bsuf C
Asub -> Bsub
Asub -> Csub
Asub -> Bsuf Cpre.
Change the start symbol from S to Ssub.
Upvotes: 3