yegor256
yegor256

Reputation: 105053

Is it possible to convert ANTLR3 grammar into regular expression?

I have an ANTLR3 simple grammar parser that takes short lines of text and convert them to Java objects. Next, I have a big list of text lines. Some of them (less than 1%) can be converted because they match the grammar.

I need to pass all of them through the parser in order to understand which are convertible, and create a collection of Java objects. Very time-consuming operation. Would be much more effective to pass them through a regular expression before sending to ANTLR3.

I can create such a regex myself, but would be much better to get it dynamically from ANTLR3 parser. Is it possible to do?

Upvotes: 0

Views: 374

Answers (2)

Matthew Sowders
Matthew Sowders

Reputation: 1680

To be specific of ANTLR can parse LL(*) subclass of context free grammars.

To tell if your language is Regular, see if you can apply the pumping lemma for regular languages.

Upvotes: 1

Paŭlo Ebermann
Paŭlo Ebermann

Reputation: 74750

In general, only regular languages can have a regular expression describing them. (Most RE-engines support features that match also some non-regular languages (for example by back-references), but still not all context-free or even general formal languages.)

I'm not really an antlr-expert, but I suppose its grammars can match languages which are not RE-matchable.

So, even the theoretical solvability of your problem is not really given, it depends on the grammar.

It may be that

  • your grammar is in fact a regular grammar, or
  • that there is a regular language which is a slight superset of your grammars language - thus the RE can filter out most non-matching lines, while some will only be filtered out by the grammar/parser.

There most probably is no certain way to be sure without seeing your grammar.

Also, a regular expression is not necessarily faster than your parser would be.

Upvotes: 4

Related Questions