user2559578
user2559578

Reputation: 153

Counting occurrences of edge color in prolog

I have a graph defined like this:

edge(a1,c1,7,blue).
edge(a1,d1,3,blue).
edge(a1,l1,8,red).
edge(b1,c1,11,blue).
edge(d1,e1,2,red).
....

and so on.

I know how to write code to check if there's a path and to calculate weight and Shortest path. But what I want to do is to output 2 integers: one for no.of blue and one for no.of red edges in the path from X to Y.

In essence, I want something like this:

 colour(X,Y,C).

that either returns:

"blue" if all the edges in the path are blue(or Red for all red)

"false" if there are mixed edges.

or something like:

blue:5 red:2 if there are 7 edges

I'm new to Prolog and it's very confusing how to create a "counter" type variable.

Upvotes: 2

Views: 426

Answers (3)

mat
mat

Reputation: 40768

I would like to post a few benchmarks that may help to compare the two nice solutions posted by @s0nata. This is an additional answer mostly because it is too long for a comment.

Prolog is a nice language to devise and run benchmarks, also because a large set of benchmarks can be readily described by a single query.

Ad hoc benchmarks

For example, let us benchmark option 1, which uses findall/3, using the following query:

?- length(_, E),
   N #= 2^E,
   portray_clause(E),
   forall(between(1, N, _), assertz(edge(x, y, 1, blue))),
   time(opt1),
   false.

Note that we are using assertz/1 to create ever more facts for edge/4. This is quite inelegant, and it means that we must restart Prolog between such runs or clear the database again in order to start from a clean slate. I leave making this more elegant as an exercise.

In any case, this results in the following output:

0.
'blue: '1
% 3,454 inferences, 0.001 CPU in 0.009 seconds (16% CPU, 2364134 Lips)
1.
'blue: '3
% 22 inferences, 0.000 CPU in 0.000 seconds (76% CPU, 785714 Lips)
2.
'blue: '7
% 26 inferences, 0.000 CPU in 0.000 seconds (74% CPU, 1181818 Lips)
3.
'blue: '15
% 34 inferences, 0.000 CPU in 0.000 seconds (77% CPU, 1478261 Lips)
4.
'blue: '31
% 50 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 1851852 Lips)

... lines omitted ...

21.
'blue: '4194303
% 4,194,322 inferences, 1.301 CPU in 1.439 seconds (90% CPU, 3224179 Lips)

For comparison, we get with option 2:

?- length(_, E),
   N #= 2^E,
   portray_clause(E),
   forall(between(1, N, _), assertz(edge(x, y, 1, blue))),
   time(opt2),
   false.
0.
'blue: '1
% 3,448 inferences, 0.001 CPU in 0.009 seconds (16% CPU, 2347175 Lips)
1.
'blue: '3
% 20 inferences, 0.000 CPU in 0.000 seconds (81% CPU, 769231 Lips)
2.
'blue: '7
% 32 inferences, 0.000 CPU in 0.000 seconds (80% CPU, 1142857 Lips)
3.
'blue: '15
% 56 inferences, 0.000 CPU in 0.000 seconds (84% CPU, 888889 Lips)
4.
'blue: '31
% 104 inferences, 0.000 CPU in 0.000 seconds (85% CPU, 1350649 Lips)

... lines omitted ... 

21.
'blue: '4194303
% 12,582,920 inferences, 10.100 CPU in 10.305 seconds (98% CPU, 1245782 Lips)

More beautiful comparison

To reason more elegantly about such benchmarks, we will now slightly change opt1/0 and opt2/0 such that these predicates do not mess up our own benchmark output. That is, we will remove all side effects from these predicates, and thus also make testing these predicates a lot easier by making the actual counter available as an argument. We can write for example:

opt1(L) :-
    findall(_,edge(_,_,_,blue),ListBlue),
    length(ListBlue, L).

opt2(NBlue) :-
    retractall(counter(_,_)),
    asserta(counter(blue,0)),
    count_edges(blue, NBlue).

In addition, let us introduce the following auxiliary definition to obtain the wall time it takes to execute a goal. This may include slight perturbations due to other tasks, garbage collection etc., but importantly takes into account how much actual time (in seconds) it takes to execute the goal:

wall_time(Goal, T) :-
        statistics(walltime, [T0|_]),
        Goal,
        statistics(walltime, [T1|_]),
        T is (T1 - T0)/1000.

Let us now post for example:

?- format("~tE~t~5+~tNum~t~15+~tT1~10+~tT2~10+~n"),
   length(_, E),
   N #= 2^E,
   forall(between(1, N, _), assertz(edge(x, y, 1, blue))),
   wall_time(opt1(Num), T1),
   wall_time(opt2(Num), T2),
   format("~t~w~5+~t~w~15+~t~2f~10+~t~2f~10+~n", [E,Num,T1,T2]),
   false.

This results in the following table:

   E        Num              T1        T2
     0              1      0.00      0.00
     1              3      0.00      0.00
     2              7      0.00      0.00
     3             15      0.00      0.00
     4             31      0.00      0.00
     5             63      0.00      0.00
     6            127      0.00      0.00
     7            255      0.00      0.00
     8            511      0.00      0.00
     9           1023      0.00      0.00
    10           2047      0.00      0.01
    11           4095      0.00      0.01
    12           8191      0.00      0.03
    13          16383      0.01      0.04
    14          32767      0.02      0.08
    15          65535      0.02      0.16
    16         131071      0.04      0.32
    17         262143      0.09      0.64
    18         524287      0.18      1.31
    19        1048575      0.37      2.55
    20        2097151      0.74      5.14
    21        4194303      1.44     10.29

Please also take these actual benchmarks into account when comparing the two solutions. In Prolog, very often, impurities also cause a lot of inefficiency.

Note how we also use this opportunity to verify that both options yield the same result!

Upvotes: 3

s0nata
s0nata

Reputation: 214

For your edge/4 predicate there might be two more ways to count the number of edges of the same color:

1) using predicates for finding all solutions, like findall/3:

opt1 :-
    findall(_,edge(_,_,_,blue),ListBlue),   % collect all blue edges in ListBlue
    length(ListBlue,L),                     % find out the length of ListBlue
    display('blue: '),display(L),nl.        % print output

The _ symbol stands for an "anonymous variable", a variable for which bindings you do not care.

The disadvantage of this solution is that it is not resource-wise: first you pass through all edge/4 facts (inevitable cost), then you use additional memory to store the ListBlue list and then you have to traverse this list.

2) using side effects (see below) and dynamic predicates (predicates whore clauses can be added/deleted on run-time) in a failure-driven loop:

Side effects are some extra things that you can do when transitioning from one program state to another, like printing text (side effect) during an operation of assigning a value to a variable (direct effect).

:- dynamic counter/2.    % a dynamic predicate to be used as a mutable counter

opt2 :-
    retractall(counter(_,_)), % \_ delete any old value of a counter from previous
    asserta(counter(blue,0)), % /  runs and set it to 0 for the blue edges
    %
    count_edges(blue,NBlue),
    display('blue: '),display(NBlue),nl.

 count_edges(Col,_N) :-
    edge(_,_,_,Col),          % locate an edge of desired color Col
    counter(Col,N),           % \
    N1 is N + 1,              %  \_ update the counter value
    retract(counter(Col,N)),  %  /
    asserta(counter(Col,N1)), % /
    fail.                     % fail to force backtracking
count_edges(Col,N) :-
    counter(Col,N).           % read the current counter value

Forced backtracking allows for exploring all possible solutions to the edge(_,_,_,Col) query thus finding all blue edges.

I've tried the code online and it worked.

Upvotes: 5

CapelliC
CapelliC

Reputation: 60024

with the DB fragment you show, and library(aggregate):

?- aggregate(count,X^Y^W^edge(X,Y,W,C),N).
C = blue,
N = 3 ;
C = red,
N = 2.

Upvotes: 0

Related Questions