Michael

Reputation: 105

recursive boost::xpressive using too much memory

Hi boost::xpressive users,

I'm getting a stack overflow when parsing decision trees with boost::xpressive. Matching works for trees up to a certain size but fails on 'big' trees, where 'big' seems to mean around 3000 nodes; at that point gdb shows the stack 133979 frames deep. I suspect I need to optimize the regex somehow, but there's no .* anywhere, so I'm not sure where to go from here.

#include <boost/regex.hpp>
#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>

using namespace boost::xpressive;
using namespace regex_constants;


sregex integral_number;
sregex floating_point_number;

sregex bid;
sregex ask;
sregex side;
sregex value_on_market_limit_ratio_gt;
sregex value_on_market_delta_ratio_gt;

sregex stdevs_from_mean_auction_time_gt;
sregex no_orders_on_opposite_side;
sregex is_pushing_price;
sregex is_desired;

sregex predicate, leaf, branch, tree;

integral_number = sregex_compiler().compile("[-+]?[0-9]+");
floating_point_number = sregex_compiler().compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?");
stdevs_from_mean_auction_time_gt = "StdevsFromMeanAuctionTimeGT(" >> floating_point_number >> ")";
side = sregex_compiler().compile("def::BID|def::ASK");
value_on_market_limit_ratio_gt = "ValueOnMarketLimitRatioGT<" >> side >> ">(" >> floating_point_number >> ")";
value_on_market_delta_ratio_gt = "ValueOnMarketDeltaRatioGT(" >> floating_point_number >> ")";
no_orders_on_opposite_side = sregex_compiler().compile("NoOrdersOnOppositeSide");
is_pushing_price = sregex_compiler().compile("IsPushingPrice");
is_desired = sregex_compiler().compile("IsDesired");
predicate = value_on_market_limit_ratio_gt | value_on_market_delta_ratio_gt | stdevs_from_mean_auction_time_gt | no_orders_on_opposite_side | is_pushing_price | is_desired;
leaf = sregex_compiler().compile("SEARCH_TO_MAX|AMEND_TO_AVAILABLE|AMEND_TO_AVAILABLE_MINUS_RECENT_ORDER_SIZE|AMEND_TO_CURRENT_MINUS_RECENT_ORDER_SIZE|SEARCH_BY_RECENT_ORDER_SIZE|PULL|DO_NOTHING");
branch = "Branch(" >> predicate >> "," >> by_ref(tree) >> "," >> by_ref(tree) >> ")";
tree = leaf | branch;

smatch what;
regex_match(s, what, tree);

Here, s is left undefined since it's a string of 75000 characters that doesn't fit in the question. How can I modify these expressions to make the match execute in less space?

Upvotes: 3

Views: 98

Answers (1)

Michael

Reputation: 105

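I found a fix: change the definition of branch to

branch = "Branch(" >> keep(predicate) >> "," >> keep(by_ref(tree)) >> "," >> keep(by_ref(tree)) >> ")";

keep() makes each wrapped sub-expression an independent sub-expression, so once it has matched, xpressive does not backtrack into it. That limits backtracking and, with it, memory usage.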
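For anyone who wants to see the idea outside xpressive: committing to a sub-tree once it has matched, rather than keeping backtracking state for it, is what a hand-written recursive-descent checker does by construction. Below is a minimal sketch using only the standard library. It is not the code from the question and not part of xpressive. The helper names (eat, eat_name, eat_tree, parses_as_tree) are made up for illustration, and the grammar is simplified on purpose: predicates and leaves are assumed to be bare names such as IsDesired or PULL, with no arguments.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Consume the literal `lit` at `pos` if it is there. On failure `pos` is
// left unchanged; on success the parser never returns to earlier input.
bool eat(const std::string& s, std::size_t& pos, const std::string& lit) {
    if (s.compare(pos, lit.size(), lit) != 0) return false;
    pos += lit.size();
    return true;
}

// Consume a non-empty run of [A-Za-z0-9_]. In this simplified grammar it
// stands in for both predicate names and leaf names.
bool eat_name(const std::string& s, std::size_t& pos) {
    const std::size_t start = pos;
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_')) {
        ++pos;
    }
    return pos > start;
}

// tree := "Branch(" name "," tree "," tree ")" | name
// Once "Branch(" has been seen, the parser is committed to that alternative.
bool eat_tree(const std::string& s, std::size_t& pos) {
    if (eat(s, pos, "Branch(")) {
        return eat_name(s, pos) && eat(s, pos, ",")
            && eat_tree(s, pos) && eat(s, pos, ",")
            && eat_tree(s, pos) && eat(s, pos, ")");
    }
    return eat_name(s, pos);  // leaf
}

// True if the whole string is exactly one tree.
bool parses_as_tree(const std::string& s) {
    std::size_t pos = 0;
    return eat_tree(s, pos) && pos == s.size();
}

// Small self-check on toy inputs.
void self_check() {
    assert(parses_as_tree("PULL"));
    assert(parses_as_tree("Branch(IsDesired,PULL,DO_NOTHING)"));
    assert(!parses_as_tree("Branch(IsDesired,PULL)"));
}
```

In this sketch the recursion depth follows the nesting depth of Branch(...) rather than the length of the input, and nothing is remembered about a sub-tree after it has been consumed. Handling the real predicates with their numeric arguments would need additional helpers that are not shown here.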

Upvotes: 4
