Reputation: 79188
This question is about how to generate the symbolic state space for a symbolic model checker. First I give some background that leads me to want to do this for MDDs, and then I explain the question in more detail.
This lecture by Edmund M. Clarke (one of the founders of Model Checking) introduces Symbolic Model Checking. It says "industrial strength" model checkers use a Boolean encoding of states (Binary Decision Diagrams, or BDDs) to handle the state-explosion problem. However, even that only raises the number of tractable states by a limited amount compared to regular (explicit-state) model checking. I skipped over regular model checking, which builds the state-transition graph of the program directly, because I can see immediately that this would be impractical.
I have seen a few papers describing better-than-BDD qualities, such as handling more states [1] (avoiding?! the state-space explosion problem), a general speedup [2] (bounded model checking), using state-matching techniques to limit the state-space search (beyond bounded model checking) [3], and using MDDs, which perform several orders of magnitude faster than existing BDDs [4][5].
BDDs raised the number of states that can be handled from roughly 10^6 to roughly 10^20. That paper also says:
Unfortunately, symbolic techniques are known not to work well for asynchronous systems, such as communication protocols, which particularly suffer from state-space explosion.
So it seems that MDDs or even EDDs are the better choice for model checking. There are also edge-valued BDDs (EVBDDs). My question, then, is how to construct the symbolic state space for MDDs (or whichever variant is best). This document introduces the approach, but doesn't seem to explain how to actually generate the state space. They state:
We use quasi–reduced, ordered, non–negative edge–valued, multi–valued decision diagrams.
I'm wondering if someone could explain, at a high level, the state-space generation algorithm for MDDs, and what data structures are involved in the system, such as the properties of the node object (like a C struct). I am thinking that if I can see the overall picture of the data structures and roughly how the algorithm works, that would be enough for me to implement it.
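For concreteness, here is my current guess at the node structure, in Python rather than C, but a C struct would look analogous. This is purely my reading of "quasi-reduced, ordered, multi-valued" (I'm ignoring the "non-negative edge-valued" part, which as I understand it would add an integer offset to each outgoing edge), so please correct it if it's wrong:

    # My guess at a quasi-reduced MDD node; all names are mine, not from the papers.
    # A node at level k tests state variable x_k, which ranges over {0, ..., n_k - 1},
    # so it has n_k outgoing edges instead of the two edges a BDD node has.
    from dataclasses import dataclass
    from typing import Tuple, Union

    @dataclass(frozen=True)
    class MddNode:
        lvl: int                                       # which state variable this node tests
        children: Tuple[Union["MddNode", bool], ...]   # children[i] = node reached when x_lvl = i
                                                       # (True/False stand in for the 1/0 terminals)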
Also, I am not sure what needs to be done with the initial program and specification, so if the explanation could briefly describe how to generate any intermediate objects, that would be helpful. I'm adding this because in one paper they have the program in Petri-net form, which they then convert to an MDD, and I'm not sure how I would convert a program into a Petri net, or whether that's even necessary. Basically: how do you go from source code to an MDD, at a high level?
I think this image shows the algorithm for state-space generation, but I've been having difficulty understanding its details: specifically, the data structures involved, and where the "state" comes from, i.e. whether these are Boolean predicates from some other model or something else.
This one seems close as well:
Here is more from the same paper:
(1) In this paper we show how boolean decision procedures, like Stålmarck's Method [16] or the Davis & Putnam Procedure [7], can replace BDDs. This new technique avoids the space blow up of BDDs, generates counterexamples much faster, and sometimes speeds up the verification.
(2) The main result of the paper is an improvement of O(N) in the time complexity of checking time-bounded until-formulas, where N is the number of states in the CTMC under consideration.
(3) We build upon our previous work where we proposed a combination of symbolic execution and model checking for analyzing programs with complex inputs [14,19]. In that work we put a bound on the input size and (or) the search depth of the model checker. Here we look beyond bounded model checking and we study state matching techniques to limit the state space search. We propose a technique for checking when a symbolic state is subsumed by another symbolic state.
(4) We present a novel algorithm for generating state spaces of asynchronous systems using Multi-valued Decision Diagrams. In contrast to related work, we encode the next-state function of a system not as a single Boolean function, but as cross-products of integer functions. This permits the application of various iteration strategies to build a system's state space. In particular, we introduce a new elegant strategy, called saturation, and implement it in the tool SMART. On top of usually performing several orders of magnitude faster than existing BDD-based state-space generators, our algorithm's required peak memory is often close to the final memory needed for storing the overall state space.
(5) Symbolic techniques based on Binary Decision Diagrams (BDDs) are widely employed for reasoning about temporal properties of hardware circuits and synchronous controllers. However, they often perform poorly when dealing with the huge state spaces underlying systems based on interleaving semantics, such as communications protocols and distributed software, which are composed of independently acting subsystems that communicate via shared events... This article shows that the efficiency of state–space exploration techniques using decision diagrams can be drastically improved by exploiting the interleaving semantics underlying many event–based and component–based system models. A new algorithm for symbolically generating state spaces is presented that (i) encodes a model’s state vectors with Multi–valued Decision Diagrams (MDDs) rather than flattening them into BDDs.
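To check my reading of excerpt (4)'s "cross-products of integer functions", here is a toy sketch of what I think that means for a single Petri-net-like event; the encoding and all names below are mine, not the paper's:

    # One asynchronous event ("move a token from place A to place B") described by
    # one small integer function per state variable (place), instead of one big
    # Boolean relation over the whole state vector. Places the event does not
    # touch get the identity, which is what lets the encoding skip them.

    def move_from_A(a):        # local effect on place A: needs a token, consumes it
        return [a - 1] if a >= 1 else []          # empty list = event disabled here

    def move_to_B(b):          # local effect on place B: gains a token (capacity 3, say)
        return [b + 1] if b + 1 <= 3 else []

    def unchanged(v):          # identity for untouched places
        return [v]

    EVENT_MOVE = {"A": move_from_A, "B": move_to_B, "C": unchanged}

    def successors(state, event):
        # next-state function of the event = cross-product of its local functions
        results = [{}]
        for place, local in event.items():
            results = [dict(partial, **{place: new_value})
                       for partial in results
                       for new_value in local(state[place])]
        return results

    print(successors({"A": 1, "B": 0, "C": 2}, EVENT_MOVE))  # [{'A': 0, 'B': 1, 'C': 2}]
    print(successors({"A": 0, "B": 0, "C": 2}, EVENT_MOVE))  # [] : disabled, A has no token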
Getting closer with this:
The reachable state space X_reach can be characterized as the smallest solution of the fixpoint equation X = X_init ∪ T(X). Algorithm Bfs implements exactly this fixpoint computation, where sets and relations are stored using L-level and 2L-level MDDs, respectively, i.e., node p encodes the set X_p having characteristic function v_p satisfying v_p(i_L, ..., i_1) = 1 ⇔ (i_L, ..., i_1) ∈ X_p.
The union of sets is simply implemented by applying the Or operator to their characteristic functions, and the computation of the states reachable in one step is implemented by using function RelProd (of course, the MDD version of these functions must be employed if MDDs are used instead of BDDs). Since it performs a breadth-first symbolic search, algorithm Bfs halts in exactly as many iterations as the maximum distance of any reachable state from the initial states.
mdd Bfs(mdd Xinit) is
    local mdd p;
    p ← Xinit;
    repeat
        p ← Or(p, RelProd(p, T));
    until p does not change;
    return p;
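To check that I'm reading Bfs correctly, here is (my own sketch of) the same fixpoint written with plain Python sets instead of MDDs; as I understand it, the MDD version just swaps set union for Or and the explicit one-step image for RelProd:

    def bfs_reachable(x_init, transitions):
        # Smallest set containing x_init and closed under the transition relation,
        # where 'transitions' is an explicit set of (s, s') pairs.
        reached = set(x_init)
        while True:
            image = {t for (s, t) in transitions if s in reached}   # RelProd analogue
            new = reached | image                                   # Or analogue
            if new == reached:         # fixpoint: nothing was added this iteration
                return reached
            reached = new

    # tiny example: 0 -> 1 -> 2, plus a transition 3 -> 2 from an unreachable state
    print(bfs_reachable({0}, {(0, 1), (1, 2), (3, 2)}))   # {0, 1, 2}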
bdd Or(bdd a, bdd b) is
    local bdd r, r0, r1;
    local level k;
    if a = 0 or b = 1 then return b;
    if b = 0 or a = 1 then return a;
    if a = b then return a;
    if Cache contains entry ⟨ORcode, {a, b} : r⟩ then return r;
    if a.lvl < b.lvl then
        k ← b.lvl;
        r0 ← Or(a, b[0]);
        r1 ← Or(a, b[1]);
    else if a.lvl > b.lvl then
        k ← a.lvl;
        r0 ← Or(a[0], b);
        r1 ← Or(a[1], b);
    else    • a.lvl = b.lvl
        k ← a.lvl;
        r0 ← Or(a[0], b[0]);
        r1 ← Or(a[1], b[1]);
    r ← UniqueTableInsert(k, r0, r1);
    enter ⟨ORcode, {a, b} : r⟩ in Cache;
    return r;
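And here is my attempt (again with my own names, not the paper's API) at turning the Or listing into runnable Python, mainly to pin down what the node "struct", the unique table, and the operation cache are. I've written it for plain reduced ordered BDDs; the quasi-reduced MDD version would store one child per value of the level's variable instead of two:

    from dataclasses import dataclass

    # Terminals are plain string tags so they are distinct from internal nodes.
    FALSE, TRUE = "0", "1"

    @dataclass(frozen=True)
    class Node:
        lvl: int        # level of the variable this node tests (terminals count as level 0)
        low: object     # child followed when the variable is 0, i.e. the pseudocode's a[0]
        high: object    # child followed when the variable is 1, i.e. the pseudocode's a[1]

    def level(n):
        return 0 if n in (FALSE, TRUE) else n.lvl

    unique_table = {}   # (lvl, low, high) -> Node : guarantees each node exists only once
    op_cache = {}       # ("OR", {a, b})   -> r    : memoizes Or over shared subgraphs

    def unique_table_insert(lvl, low, high):
        if low is high:                    # reduction rule: the test is redundant
            return low
        key = (lvl, low, high)
        if key not in unique_table:
            unique_table[key] = Node(lvl, low, high)
        return unique_table[key]

    def bdd_or(a, b):
        # terminal cases, exactly as in the pseudocode above
        if a is FALSE or b is TRUE:
            return b
        if b is FALSE or a is TRUE:
            return a
        if a is b:
            return a
        key = ("OR", frozenset((a, b)))
        if key in op_cache:
            return op_cache[key]
        if level(a) < level(b):            # a does not test b's top variable
            k, r0, r1 = level(b), bdd_or(a, b.low), bdd_or(a, b.high)
        elif level(a) > level(b):
            k, r0, r1 = level(a), bdd_or(a.low, b), bdd_or(a.high, b)
        else:                              # same top variable: recurse on both children
            k, r0, r1 = level(a), bdd_or(a.low, b.low), bdd_or(a.high, b.high)
        r = unique_table_insert(k, r0, r1)
        op_cache[key] = r
        return r

    # usage: the functions "x1" and "x2" (levels 1 and 2) and their union x1 OR x2
    x1 = unique_table_insert(1, FALSE, TRUE)
    x2 = unique_table_insert(2, FALSE, TRUE)
    print(bdd_or(x1, x2))   # Node(lvl=2, low=Node(lvl=1, ...), high='1')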
bdd RelProd(bdd x, bdd2 t) is    • quasi-reduced version
    local bdd r, r0, r1;
    if x = 0 or t = 0 then return 0;
    if x = 1 and t = 1 then return 1;
    if Cache contains entry ⟨RELPRODcode, x, t : r⟩ then return r;
    r0 ← Or(RelProd(x[0], t[0][0]), RelProd(x[1], t[1][0]));
    r1 ← Or(RelProd(x[0], t[0][1]), RelProd(x[1], t[1][1]));
    r ← UniqueTableInsert(x.lvl, r0, r1);
    enter ⟨RELPRODcode, x, t : r⟩ in Cache;
    return r;

mdd Saturation(mdd Xinit) is
    return Saturate(L, Xinit);

mdd Saturate(level k, mdd p) is
    local mdd r, r_0, ..., r_{n_k−1};
    if p = 0 then return 0;
    if p = 1 then return 1;
    if Cache contains entry ⟨SATcode, p : r⟩ then return r;
    for i = 0 to n_k − 1 do
        r_i ← Saturate(k − 1, p[i]);
    repeat
        choose e ∈ E_k, i, j ∈ X_k, such that r_i ≠ 0 and T_e[i][j] ≠ 0;
        r_j ← Or(r_j, RelProdSat(k − 1, r_i, T_e[i][j]));
    until r_0, ..., r_{n_k−1} do not change;
    r ← UniqueTableInsert(k, r_0, ..., r_{n_k−1});
    enter ⟨SATcode, p : r⟩ in Cache;
    return r;

mdd RelProdSat(level k, mdd q, mdd2 f) is
    local mdd r, r_0, ..., r_{n_k−1};
    if q = 0 or f = 0 then return 0;
    if Cache contains entry ⟨RELPRODSATcode, q, f : r⟩ then return r;
    for each i, j ∈ X_k such that q[i] ≠ 0 and f[i][j] ≠ 0 do
        r_j ← Or(r_j, RelProdSat(k − 1, q[i], f[i][j]));
    r ← Saturate(k, UniqueTableInsert(k, r_0, ..., r_{n_k−1}));
    enter ⟨RELPRODSATcode, q, f : r⟩ in Cache;
    return r;
Upvotes: 2
Views: 380
Reputation: 2075
To answer concisely: it is not straightforward to encode an arbitrary transition relation in DD form. As you noticed, for Petri nets it's pretty easy; the general case is another matter (assignments, arbitrary expressions, indexing), plus the issues due to a program generally needing variable-length states and requiring recursion/stack state modeling.
The original propositions all involve encoding the transition relation R as a subset of S×S, so that a transition from s to s' is possible iff the pair (s, s') is in R. A special product operation between this DD and the DD for a set of states is performed, yielding the successors in one step. But the papers by Ciardo et al. you're reading are more advanced and typically use MxD/identity-reduced forms, so that variables not impacted by a transition (don't cares) can be skipped in the encoding. Still, they end up with a DD that has two variables (before and after) for each state variable, so these are still subsets of S×S.
So starting from a program, you'd usually want to get rid of recursion/the stack, bound the number of variables (so you can work with a fixed-size array that is a bit too large for most states), and make all the variables discrete (e.g. integers).
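As a deliberately tiny illustration (my own naming, not any tool's input format): a loop like while (x < 3) x++; over a bounded integer can be flattened into a state vector (pc, x) and an explicit transition relation over S×S, which is the kind of object a DD library can then encode:

    # Program: "L0: while (x < 3) x = x + 1;  L1: halt", with x bounded to 0..3.
    # State vector = (pc, x); the transition relation R is a subset of S x S.
    PC_LOOP, PC_HALT = 0, 1
    X_MAX = 3

    states = [(pc, x) for pc in (PC_LOOP, PC_HALT) for x in range(X_MAX + 1)]

    R = set()
    for (pc, x) in states:
        if pc == PC_LOOP and x < X_MAX:
            R.add(((pc, x), (PC_LOOP, x + 1)))   # loop body: x = x + 1
        elif pc == PC_LOOP:
            R.add(((pc, x), (PC_HALT, x)))       # loop condition fails: fall through

    print(sorted(R))
    # [((0, 0), (0, 1)), ((0, 1), (0, 2)), ((0, 2), (0, 3)), ((0, 3), (1, 3))]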
If you can get a model in this form, but still have complex operations such as arithmetic and assignments (i.e. you cannot write your problem as a Petri net), the most advanced packages are probably
Upvotes: 0