Reputation: 79188
This question is about how to generate the symbolic state space for a symbolic model checker. First I give some background that leads me to want to do this for MDDs, and then I explain the question in more detail.
This lecture by Edmund M. Clarke (one of the founders of Model Checking) introduces Symbolic Model Checking. It says "industrial strength" model checkers use a Boolean encoding of states (Binary Decision Diagrams, or BDDs) to handle the state-explosion problem. However, even that only raises the number of tractable states by a limited amount compared to regular (explicit-state) model checking. I skipped over regular model checking, which builds the state-transition graph of the program directly, because I can see immediately that this would be impractical.
I have seen a few papers describing better-than-BDD qualities, such as handling more states [1] (avoiding?! the state-space explosion problem), a general speedup [2] (bounded model checking), using state-matching techniques to limit the state-space search (beyond bounded model checking) [3], and using MDDs, which perform several orders of magnitude faster than existing BDDs [4][5].
BDDs raised the number of states that can be handled from roughly 10^6 to roughly 10^20. That paper also says:
Unfortunately, symbolic techniques are known not to work well for asynchronous systems, such as communication protocols, which particularly suffer from state-space explosion.
So it seems that MDDs or even EDDs are the better choice for model checking. There are also edge-valued BDDs (EVBDDs). My question, then, is how to construct the symbolic state space for MDDs (or whichever variant is best). This document introduces the approach, but doesn't seem to explain how to actually generate the state space. They state:
We use quasi–reduced, ordered, non–negative edge–valued, multi–valued decision diagrams.
I'm wondering if someone could explain, at a high level, the state-space generation algorithm for MDDs, and what data structures are involved in the system, such as the properties of the node object (like a C struct). I am thinking that if I can see the overall picture of the data structures and roughly how the algorithm works, that would be enough for me to implement it.
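For concreteness, here is my current guess at the node structure, in Python rather than C, but a C struct would look analogous. This is purely my reading of "quasi-reduced, ordered, multi-valued" (I'm ignoring the "non-negative edge-valued" part, which as I understand it would add an integer offset to each outgoing edge), so please correct it if it's wrong:

    # My guess at a quasi-reduced MDD node; all names are mine, not from the papers.
    # A node at level k tests state variable x_k, which ranges over {0, ..., n_k - 1},
    # so it has n_k outgoing edges instead of the two edges a BDD node has.
    from dataclasses import dataclass
    from typing import Tuple, Union

    @dataclass(frozen=True)
    class MddNode:
        lvl: int                                       # which state variable this node tests
        children: Tuple[Union["MddNode", bool], ...]   # children[i] = node reached when x_lvl = i
                                                       # (True/False stand in for the 1/0 terminals)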
Also, I am not sure what needs to be done with the initial program and specification, so if the explanation could briefly describe how to generate any intermediate objects, that would be helpful. I'm adding this because in one paper they have the program in Petri-net form, which they then convert to an MDD, and I'm not sure how I would convert a program into a Petri net, or whether that's even necessary. Basically: how do you go from source code to an MDD, at a high level?
I think this image shows the algorithm for state-space generation, but I've been having difficulty understanding its details: specifically, the data structures involved, and where the "state" comes from, i.e. whether these are Boolean predicates from some other model or something else.
This one seems close as well:
Here is more from the same paper:
(1) In this paper we show how boolean decision procedures, like Stålmarck's Method [16] or the Davis & Putnam Procedure [7], can replace BDDs. This new technique avoids the space blow up of BDDs, generates counterexamples much faster, and sometimes speeds up the verification.
(2) The main result of the paper is an improvement of O(N) in the time complexity of checking time-bounded until-formulas, where N is the number of states in the CTMC under consideration.
(3) We build upon our previous work where we proposed a combination of symbolic execution and model checking for analyzing programs with complex inputs [14,19]. In that work we put a bound on the input size and (or) the search depth of the model checker. Here we look beyond bounded model checking and we study state matching techniques to limit the state space search. We propose a technique for checking when a symbolic state is subsumed by another symbolic state.
(4) We present a novel algorithm for generating state spaces of asynchronous systems using Multi-valued Decision Diagrams. In contrast to related work, we encode the next-state function of a system not as a single Boolean function, but as cross-products of integer functions. This permits the application of various iteration strategies to build a system's state space. In particular, we introduce a new elegant strategy, called saturation, and implement it in the tool SMART. On top of usually performing several orders of magnitude faster than existing BDD-based state-space generators, our algorithm's required peak memory is often close to the final memory needed for storing the overall state space.
(5) Symbolic techniques based on Binary Decision Diagrams (BDDs) are widely employed for reasoning about temporal properties of hardware circuits and synchronous controllers. However, they often perform poorly when dealing with the huge state spaces underlying systems based on interleaving semantics, such as communications protocols and distributed software, which are composed of independently acting subsystems that communicate via shared events... This article shows that the efficiency of state–space exploration techniques using decision diagrams can be drastically improved by exploiting the interleaving semantics underlying many event–based and component–based system models. A new algorithm for symbolically generating state spaces is presented that (i) encodes a model’s state vectors with Multi–valued Decision Diagrams (MDDs) rather than flattening them into BDDs.
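To check my reading of excerpt (4)'s "cross-products of integer functions", here is a toy sketch of what I think that means for a single Petri-net-like event; the encoding and all names below are mine, not the paper's:

    # One asynchronous event ("move a token from place A to place B") described by
    # one small integer function per state variable (place), instead of one big
    # Boolean relation over the whole state vector. Places the event does not
    # touch get the identity, which is what lets the encoding skip them.

    def move_from_A(a):        # local effect on place A: needs a token, consumes it
        return [a - 1] if a >= 1 else []          # empty list = event disabled here

    def move_to_B(b):          # local effect on place B: gains a token (capacity 3, say)
        return [b + 1] if b + 1 <= 3 else []

    def unchanged(v):          # identity for untouched places
        return [v]

    EVENT_MOVE = {"A": move_from_A, "B": move_to_B, "C": unchanged}

    def successors(state, event):
        # next-state function of the event = cross-product of its local functions
        results = [{}]
        for place, local in event.items():
            results = [dict(partial, **{place: new_value})
                       for partial in results
                       for new_value in local(state[place])]
        return results

    print(successors({"A": 1, "B": 0, "C": 2}, EVENT_MOVE))  # [{'A': 0, 'B': 1, 'C': 2}]
    print(successors({"A": 0, "B": 0, "C": 2}, EVENT_MOVE))  # [] : disabled, A has no token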
Getting closer with this:
The reachable state space X_reach can be characterized as the smallest solution of the fixpoint equation X = X_init ∪ T(X). Algorithm Bfs implements exactly this fixpoint computation, where sets and relations are stored using L-level and 2L-level MDDs, respectively, i.e., node p encodes the set X_p having characteristic function v_p satisfying v_p(i_L, ..., i_1) = 1 ⇔ (i_L, ..., i_1) ∈ X_p.
The union of sets is simply implemented by applying the Or operator to their characteristic functions, and the computation of the states reachable in one step is implemented by using function RelProd (of course, the MDD version of these functions must be employed if MDDs are used instead of BDDs). Since it performs a breadth-first symbolic search, algorithm Bfs halts in exactly as many iterations as the maximum distance of any reachable state from the initial states.
mdd Bfs(mdd Xinit) is
    local mdd p;
    p ← Xinit;
    repeat
        p ← Or(p, RelProd(p, T));
    until p does not change;
    return p;
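To check that I'm reading Bfs correctly, here is (my own sketch of) the same fixpoint written with plain Python sets instead of MDDs; as I understand it, the MDD version just swaps set union for Or and the explicit one-step image for RelProd:

    def bfs_reachable(x_init, transitions):
        # Smallest set containing x_init and closed under the transition relation,
        # where 'transitions' is an explicit set of (s, s') pairs.
        reached = set(x_init)
        while True:
            image = {t for (s, t) in transitions if s in reached}   # RelProd analogue
            new = reached | image                                   # Or analogue
            if new == reached:         # fixpoint: nothing was added this iteration
                return reached
            reached = new

    # tiny example: 0 -> 1 -> 2, plus a transition 3 -> 2 from an unreachable state
    print(bfs_reachable({0}, {(0, 1), (1, 2), (3, 2)}))   # {0, 1, 2}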
bdd Or(bdd a, bdd b) is
    local bdd r, r0, r1;
    local level k;
    if a = 0 or b = 1 then return b;
    if b = 0 or a = 1 then return a;
    if a = b then return a;
    if Cache contains entry ⟨ORcode, {a, b} : r⟩ then return r;
    if a.lvl < b.lvl then
        k ← b.lvl;
        r0 ← Or(a, b[0]);
        r1 ← Or(a, b[1]);
    else if a.lvl > b.lvl then
        k ← a.lvl;
        r0 ← Or(a[0], b);
        r1 ← Or(a[1], b);
    else    • a.lvl = b.lvl
        k ← a.lvl;
        r0 ← Or(a[0], b[0]);
        r1 ← Or(a[1], b[1]);
    r ← UniqueTableInsert(k, r0, r1);
    enter ⟨ORcode, {a, b} : r⟩ in Cache;
    return r;
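And here is my attempt (again with my own names, not the paper's API) at turning the Or listing into runnable Python, mainly to pin down what the node "struct", the unique table, and the operation cache are. I've written it for plain reduced ordered BDDs; the quasi-reduced MDD version would store one child per value of the level's variable instead of two:

    from dataclasses import dataclass

    # Terminals are plain string tags so they are distinct from internal nodes.
    FALSE, TRUE = "0", "1"

    @dataclass(frozen=True)
    class Node:
        lvl: int        # level of the variable this node tests (terminals count as level 0)
        low: object     # child followed when the variable is 0, i.e. the pseudocode's a[0]
        high: object    # child followed when the variable is 1, i.e. the pseudocode's a[1]

    def level(n):
        return 0 if n in (FALSE, TRUE) else n.lvl

    unique_table = {}   # (lvl, low, high) -> Node : guarantees each node exists only once
    op_cache = {}       # ("OR", {a, b})   -> r    : memoizes Or over shared subgraphs

    def unique_table_insert(lvl, low, high):
        if low is high:                    # reduction rule: the test is redundant
            return low
        key = (lvl, low, high)
        if key not in unique_table:
            unique_table[key] = Node(lvl, low, high)
        return unique_table[key]

    def bdd_or(a, b):
        # terminal cases, exactly as in the pseudocode above
        if a is FALSE or b is TRUE:
            return b
        if b is FALSE or a is TRUE:
            return a
        if a is b:
            return a
        key = ("OR", frozenset((a, b)))
        if key in op_cache:
            return op_cache[key]
        if level(a) < level(b):            # a does not test b's top variable
            k, r0, r1 = level(b), bdd_or(a, b.low), bdd_or(a, b.high)
        elif level(a) > level(b):
            k, r0, r1 = level(a), bdd_or(a.low, b), bdd_or(a.high, b)
        else:                              # same top variable: recurse on both children
            k, r0, r1 = level(a), bdd_or(a.low, b.low), bdd_or(a.high, b.high)
        r = unique_table_insert(k, r0, r1)
        op_cache[key] = r
        return r

    # usage: the functions "x1" and "x2" (levels 1 and 2) and their union x1 OR x2
    x1 = unique_table_insert(1, FALSE, TRUE)
    x2 = unique_table_insert(2, FALSE, TRUE)
    print(bdd_or(x1, x2))   # Node(lvl=2, low=Node(lvl=1, ...), high='1')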
bdd RelProd(bdd x, bdd2 t) is    • quasi-reduced version
    local bdd r, r0, r1;
    if x = 0 or t = 0 then return 0;
    if x = 1 and t = 1 then return 1;
    if Cache contains entry ⟨RELPRODcode, x, t : r⟩ then return r;
    r0 ← Or(RelProd(x[0], t[0][0]), RelProd(x[1], t[1][0]));
    r1 ← Or(RelProd(x[0], t[0][1]), RelProd(x[1], t[1][1]));
    r ← UniqueTableInsert(x.lvl, r0, r1);
    enter ⟨RELPRODcode, x, t : r⟩ in Cache;
    return r;

mdd Saturation(mdd Xinit) is
    return Saturate(L, Xinit);

mdd Saturate(level k, mdd p) is
    local mdd r, r_0, ..., r_{n_k−1};
    if p = 0 then return 0;
    if p = 1 then return 1;
    if Cache contains entry ⟨SATcode, p : r⟩ then return r;
    for i = 0 to n_k − 1 do
        r_i ← Saturate(k − 1, p[i]);
    repeat
        choose e ∈ E_k, i, j ∈ X_k, such that r_i ≠ 0 and T_e[i][j] ≠ 0;
        r_j ← Or(r_j, RelProdSat(k − 1, r_i, T_e[i][j]));
    until r_0, ..., r_{n_k−1} do not change;
    r ← UniqueTableInsert(k, r_0, ..., r_{n_k−1});
    enter ⟨SATcode, p : r⟩ in Cache;
    return r;

mdd RelProdSat(level k, mdd q, mdd2 f) is
    local mdd r, r_0, ..., r_{n_k−1};
    if q = 0 or f = 0 then return 0;
    if Cache contains entry ⟨RELPRODSATcode, q, f : r⟩ then return r;
    for each i, j ∈ X_k such that q[i] ≠ 0 and f[i][j] ≠ 0 do
        r_j ← Or(r_j, RelProdSat(k − 1, q[i], f[i][j]));
    r ← Saturate(k, UniqueTableInsert(k, r_0, ..., r_{n_k−1}));
    enter ⟨RELPRODSATcode, q, f : r⟩ in Cache;
    return r;
Upvotes: 2
Views: 380
Reputation: 2075
To answer concisely: it is not straightforward to encode an arbitrary transition relation in DD form. As you noticed, for Petri nets it's pretty easy; the general case is another matter (assignments, arbitrary expressions, indexing), plus the issues due to a program generally needing variable-length states and requiring recursion/stack state modeling.
The original propositions all involve encoding the transition relation R as a subset of S×S, so that a transition from s to s' is possible iff the pair (s, s') is in R. A special product operation between this DD and the DD for a set of states is performed, yielding the successors in one step. But the papers by Ciardo et al. you're reading are more advanced and typically use MxD/identity-reduced forms, so that variables not impacted by a transition (don't cares) can be skipped in the encoding. Still, they end up with a DD that has two variables (before and after) for each state variable, so these are still subsets of S×S.
So starting from a program, you'd usually want to get rid of recursion/the stack, bound the number of variables (so you can work with a fixed-size array that is a bit too large for most states), and make all the variables discrete (e.g. integers).
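As a deliberately tiny illustration (my own naming, not any tool's input format): a loop like while (x < 3) x++; over a bounded integer can be flattened into a state vector (pc, x) and an explicit transition relation over S×S, which is the kind of object a DD library can then encode:

    # Program: "L0: while (x < 3) x = x + 1;  L1: halt", with x bounded to 0..3.
    # State vector = (pc, x); the transition relation R is a subset of S x S.
    PC_LOOP, PC_HALT = 0, 1
    X_MAX = 3

    states = [(pc, x) for pc in (PC_LOOP, PC_HALT) for x in range(X_MAX + 1)]

    R = set()
    for (pc, x) in states:
        if pc == PC_LOOP and x < X_MAX:
            R.add(((pc, x), (PC_LOOP, x + 1)))   # loop body: x = x + 1
        elif pc == PC_LOOP:
            R.add(((pc, x), (PC_HALT, x)))       # loop condition fails: fall through

    print(sorted(R))
    # [((0, 0), (0, 1)), ((0, 1), (0, 2)), ((0, 2), (0, 3)), ((0, 3), (1, 3))]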
If you can get a model in this form, but still have complex operations such as arithmetic and assignments (i.e. you cannot write your problem as a Petri net), the most advanced packages are probably
Upvotes: 0