Rauf
Rauf

Reputation: 27

Formalization of Reachability in z3py

I am trying to formalize reachability property, such that I can find a satisfiable solution if there exists, a path of length 4 (which means if there are 3 intermediate nodes through which, 2 desired nodes are connected). Note: I am restricting it to only find a path of length 4. Following is what I did:

from z3 import*
# link between two nodes
L= Function('L', IntSort(), IntSort(),BoolSort())

# path between two nodes
p=Function('p', IntSort(), IntSort(), BoolSort())

u=Int('u')
x=Int('x')
y=Int('y')
z=Int('z')
i=Int('i')
j=Int('j')

s=Solver()

s.add(x>=0, x<=5, y>=0,y<=5, u>=0,u<=5,i>=0,i<=5, j>=0,j<=5,z>=0,z<=5)

# no self link or path
si1= ForAll(i,  And (L(i,i)==False,p(i,i)==False))

si2=ForAll([x,y], p(x,y)==p(y,x))

# To fix source and destination
si3= ForAll([u,x,y], And(L(u,x)==False, L(y,u)==False) ) 

# To fix the length of path
si4=ForAll([x,y], Exists([i,j,z] , And( L(x,i)==True, L(i,j)==True,L(j,z)==True,L(z,y)==True) )   )

si5=ForAll([x,y,z], Implies (And (p(x,y)==True, L(x,z)==False,L(y,z)==False), L(x,y)==True))

#s.add(L(1,2)==True,L(2,3)==True, L(3,4)==True,L(4,5)==True)

s.add( Implies (p(x,y)==True, Exists([x,y], And(si1==True,si2==True,si3==True,si4==True,si5==True))) )

#s.add(p(x,y)==True)

result=s.check()
model = s.model()

print result, model ,model[p].else_value()

It returns me SAT, but now the problem is that, If I un-comment/add s.add(p(x,y)==True) It returns me unsat, rather it should return me a SAT result. Because there exist a path between x and y. Even if I manually mention (which is a naive approach)

s.add(L(1,2)==True,L(2,3)==True, L(3,4)==True,L(4,5)==True)

it still returns me x=0 and y=3, where as this is not a satisfiable solution, the first satisfiable solution should be x=0 and y=4 as per

si4=ForAll([x,y], Exists([i,j,z] , And( L(x,i)==True, L(i,j)==True,L(j,z)==True,L(z,y)==True) )   )

Where am I wrong?

Upvotes: 0

Views: 331

Answers (2)

Rauf
Rauf

Reputation: 27

I was able to update my formalization as follows, now I get a satisfiable solution but still I am unable find distinct values for intermediate nodes: i,j,z

from z3 import*

L= Function('L', IntSort(), IntSort(),BoolSort())
p=Function('p', IntSort(), IntSort(), BoolSort())

u=Int('u')
x=Int('x')
y=Int('y')
z=Int('z')
i=Int('i')
j=Int('j')

s=Solver()

s.add(And(x>=0, x<=5, y>=0,y<=5, u>=0,u<=5,i>=0,i<=5, j>=0,j<=5,z>=0,z<=5))

#si_1
s.add(ForAll(x,  And( L(x,x)==False,p(x,x)==False )))

#si_2
s.add(ForAll([x,y], And(L(x,y)==L(y,x), p(x,y)==p(y,x))))


#si_3
s.add(ForAll([x,y], Implies(p(x,y)==True,  Exists([i,j,z], And( L(x,i)==True, L(i,j)==True,L(j,z)==True,L(z,y)==True)))))


#si_4
s.add(ForAll([x,y,z], Implies (And (p(x,y)==True, L(x,z)==False,L(y,z)==False), L(x,y)==True) ))


s.add(p(x,y)==True)




result=s.check()
model = s.model()

print result, model

As per the constraint si_3:

    s.add(ForAll([x,y], Implies(p(x,y)==True,  Exists([i,j,z], And( L(x,i)==True, L(i,j)==True,L(j,z)==True,L(z,y)==True)))))

I should get distinct values for i, j, and z as si_1 ensures that L(x,x) would be false:

s.add(ForAll(x,  And( L(x,x)==False,p(x,x)==False )))

What should I change so that I can get distinct values of i, j, z and they should not be same...

Upvotes: 0

Tim
Tim

Reputation: 1098

I'll try to explain why this transitions from sat to unsat but ignore how to encode reachability. Briefly, the problem is that si3 and si5 entail that p(x,y) is false for all x and y. When you uncomment/assert "p(x,y)==True", this asserts both si3 and si5 and leads to a contradiction.

In more detail, lets assert "p(x,y)==True". It must be the case that s1 through s5 hold from:

s.add( Implies (p(x,y)==True, Exists([x,y], And(si1==True,si2==True,si3==True,si4==True,si5==True))) )
; |- Exists([x,y], And(si1==True,si2==True,si3==True,si4==True,si5==True))
; |- And(si1==True,si2==True,si3==True,si4==True,si5==True) as the existential bindings for x and y are not instantiated in any subformula.

We can break up si3 to deduce that it is equivalent to all L(x,y) are false.

si3= ForAll([u,x,y], And(L(u,x)==False, L(y,u)==False) ) 
; moving the forall across the and we get
And (ForAll([u,x,y], L(u,x)==False), ForAll([u,x,y], L(y,u)==False) ) 
; dropping unused variables
And(ForAll([u,x], L(u,x)==False), ForAll([u,y], L(y,u)==False) ) 
; which are the same statement up to variable renaming. So
ForAll([x,y], L(x,y)==False)

Using s3, we can replace all instances of L(-,-) in s5 to get:

s5=ForAll([x,y,z], Implies (And (p(x,y)==True, L(x,z)==False,L(y,z)==False), L(x,y)==True))
; replacing L(-,-) with False and reducing False==False and False==True gives:
ForAll([x,y,z], Implies (And (p(x,y)==True, True, True), False))
; which is equivalent to
ForAll([x,y], p(x,y)==False)

This contradicts p(x,y)==True. (Note this is a different binding of x and y than the ones in the quantifiers.)

One satisfying model is "u = x = y = z = i = j = 0" and "p(0,0)=False". Once "p(x,y)" is false, s1 through s5 can be ignored for satisfying the formula. (I am being a bit loose by saying "model" here, but hopefully the intuition is clear.)

As a general suggestion, I would suggest renaming variables that are bound by quantifiers to remove ambiguity. There are 3 different copies of "x" and "y" hanging around in the following line.

s.add( Implies (p(x,y)==True, Exists([x,y], And(si1==True,si2==True,si3==True,si4==True,si5==True))) )

There is the top level unbound x and y in "p(x,y)", the version bound by the "Exists([x,y],...)" (which don't have any instance), and the versions bound in si1 through si5. Maybe you can keep track of this, but I find it hard.

Upvotes: 0

Related Questions