Pierre T.
Pierre T.

Reputation: 378

Boost::spirit illegal_backtracking exception

I use Boost.Spirit.Lex and .Qi for a simple calculator project and (as usual) it gives me some pain to debug and use. The debug prints:

<expression>
  <try>boost::spirit::multi_pass::illegal_backtracking

This exception is thrown and I can't understand why. I use macros in my code and it would be a pain to give a minimal example so I give the whole project. Just do "make" at the root, and then launch ./sash, a prompt will appear, if you want to test just do "-echo $5-8".

It seems that Google didn't find any similar problems about this exception...

The parser is in arithmetic/, and the call of the parser is at the end of arithmetic/evaluator.cpp

Any helps greatly appreciate.

Upvotes: 1

Views: 158

Answers (1)

sehe
sehe

Reputation: 392833

Your code is breaking because BOOST_SPIRIT_QI_DEBUG as well as the on_error<> handler seem to use iterators after they might have been invalidated.

To be honest, I'm not completely sure how this could happen.

Background

AFAICT lexertl uses spirit::multipass<> with a split_functor input policy and a split_std_deque storage policy [1].

Now, (luckily?) the checking policy is buf_id_check which means that the iterator will check for invalidation at the time of dereference.

Iterators are expected to be invalidated if

  1. the iterator is derefenced, growing the buffer to >16 tokens and the iterator is the only one referring to the shared state.
  2. or somewhere along the line clear_queue is called explicitely (e.g. from the flush_multi_path primitive in the Spirit Repository)

Honestly I don't see any of these two conditions being met. A quick and dirty

token_iterator_type clone = iter; // just to make it non-unique...

in evaluator.cpp doesn't make a difference (ruling out reason #1)

Temporary disabling the docheck implementation in the buf_id_check_policy made valgrind point out that on_error<> and BOOST_SPIRIT_DEBUG* are causing invalid memory references. Commenting both indeed makes all problems go away (and the eval_expression now works).

However, this is likely not your preferred solution.

Proposed solution

Since

  • you're working on a fixed, in-memory container representing the input you don't really need multi_pass behaviour emulation
  • you're using a trivial grammar, you don't really benefit from lexertl - while you are getting a lot of added complexity (as you can see)

I've quickly refactored some code: https://github.com/sehe/sash-refactor/commits/master

  • commit dec31496 sanity - lets do without macros
    4 files changed, 59 insertions(+), 146 deletions(-)

  • commit 6056574c dead code, excess scope, excess instantiation
    5 files changed, 38 insertions(+), 62 deletions(-)

  • commit 99d441db remove lexer
    9 files changed, 25 insertions(+), 177 deletions(-)

Now, you will find that your code is generally much simpler, also much shorter, not running into multi_pass limits and you can still have SPIRIT_DEBUG as well as on_error handling :) In the end

  • binary size in -g3 is reduced from 16Mb to 6.5Mb
  • a net 263 lines of code have been removed
  • more importantly, it works

Here's some samples (without debug output):

$ ./sash <<< '-echo $8-9'
    -1
    Warning: Empty environment variable "8-9".
$ ./sash <<< '-echo $8\*9'
    72
    Warning: Empty environment variable "8*9".
$ ./sash <<< '-echo $8\*(9-1)'
    64
    Warning: Empty environment variable "8*(9-1)".
$ ./sash <<< '-echo $--+-+8\*(9-1)'
    -64
    Warning: Empty environment variable "--+-+8*(9-1)".

[1] Which, despite it's name, buffers previously seen tokens in a std::vector<>

Upvotes: 3

Related Questions