Reputation:
I started to solve this problem in the imperative way and it works (DFS with traditional three coloring techniques). However, it takes me triple time to figure out how to do it Haskell and I failed! Suppose I represent graph as list (or map) of a node with its adjacency nodes.
type Node = Int
type Graph = [(Node, [Node])]
Note the above representation can be directed or undirected. I also pass the seen set and finished set as arguments(since no side effects are preferred in functional) when doing exploration to detect back track edge. However, I just can't do it in Haskell! I know there may be using State monad, but that thing hasn't come through my mind quite well neither. I am curious to know how could anyone guide me how to do it in "beautiful" Haskell style?
Upvotes: 14
Views: 4051
Reputation: 3020
There is the naive way of attempting it, that looks like this:
route :: Graph -> Label -> Label -> Bool
route g dest from | from == dest = True
route g dest from = any (route g dest) (neighbours g from)
But that fails at looping graphs. (I'm also assuming you have neighbours defined)
So, what to do but pass the list of already seen nodes through.
route2 :: Graph -> Label -> Label -> [Label] -> Bool
route2 g dest from seen
| dest == from = True
| otherwise = any (\x -> route2 g dest x (from:seen)) (neighbours g from)
But if you were running it on the graph here:
You would get a trace that looked something like this (excuse the scheme, I've shamelessly stolen these pictures from my cs class. fr is find-route, and fr-l is a version of it that takes a list. The second parameter is the accumulator)
As you can see, it ends up visiting the nodes K and H twice. This is bad, lets see why it's doing that.
Since it doesn't pass any information back up from the calls recursive in any
, it can't see what it did in branches that failed, only what was on the path to the current node.
Now to fix that, there are two paths we can take. My class took a continuation passing approach that is rather novel, so I will show it first, before the state monad version.
routeC :: Graph -> Label -> Label -> [Label] -> ([Label] -> Bool) -> Bool
routeC g dest from seen k
| dest == from = True
| from `elem` seen = k (from:seen)
| otherwise = routeCl g dest (neighbours g from) (from:seen) k
routeCl :: Graph -> Label -> [Label] -> [Label] -> ([Label] -> Bool) -> Bool
routeCl g dest [] seen k = k seen
routeCl g dest (x:xs) seen k =
routeC g dest x seen (\newSeen -> routeCl g dest xs newSeen k)
This uses a pair of functions, instead of any. routeC
just checks to see if we're arrived at the destination, or if we've looped, otherwise it just calls routeCL with the neighbours of the current node.
If we have looped, then instead of just returning False
, we call the continuation, but with the nodes that we've currently seen (including the current one).
routeCL
takes a list of nodes, and if the list is empty, runs the continuation, otherwise it does something interesting. It runs routeC
on the first node, and passes it a continuation that will run routeCl
on the rest of the list, with the NEW list of seen nodes. So it will be able to see into the history of the failed branches.
(As an additional thing, we can generalize this a bit further, and fully transform it into continuation passing style. I've generalized any as well, instead of using the pair of functions. This is optional, and the type signature is scarier than the code.)
anyK :: (a -> s -> (s -> r) -> (s -> r) -> r) ->
[a] -> s -> (s -> r) -> (s -> r) -> r
anyK p [] s tK fK = fK s
anyK p (x:xs) s tK fK = p x s tK (\s' -> anyK p xs s' tK fK)
routeK2 :: Graph -> Label -> Label -> ([Label] -> r) -> ([Label] -> r) -> r
routeK2 g dest from' trueK falseK = route from' [] trueK falseK
where route from seen tK fK
| from == dest = tK seen
| from `elem` seen = fK seen
| otherwise = anyK route (neighbours g from) (from:seen) tK fK
Same thing, but with more information being passed in.
Now, for what you've been waiting for, the State Monad version.
routeS :: Graph -> Label -> Label -> State [Label] Bool
routeS g dest from | dest == from = return True
routeS g dest from = do
seen <- get
if from `elem` seen then return False else do
put (from:seen)
anyM (routeS g dest) (neighbours g from)
But doesn't that last line look a lot like what we started with, just with some extra plumbing? Compare:
any (route g dest) (neighbours g from) -- Simple version
anyM (routeS g dest) (neighbours g from) -- State Version
anyK route (neighbours g from) (from:seen) tK fK -- CPS version
At the core, all three are doing the same thing. The monad in the state version just nicely handles the plumbing of the seen nodes for us. And the CPS version shows us exactly what the flow of control will be like, in a much more explicit fashion than the state monad.
Oh, but anyM
doesn't seem to be in the standard library. Here's what it looks like:
anyM :: (Monad m) => (a -> m Bool) -> [a] -> m Bool
anyM p [] = return False
anyM p (x:xs) = do
y <- p x
if y
then return True
else anyM p xs
Upvotes: 5
Reputation: 26167
First of all, there's a data type for storing Graphs in Haskell; it's called Data.Graph.Graph
in the containers
package. It uses a Data.Array
instead of a list, but is otherwise identical to your representation.
type Graph = Array Int [Int]
This representation leads to much more efficient graphs, while also using much less memory. I use this library like so:
import Data.Graph (Graph)
import qualified Data.Graph as Graph
import Data.Array
You presumably know the minimum and maximum nodes in your graph; if not, this function calculates them for you and creates a Graph
:
makeGraph :: [(Node, [Node])] -> Graph
makeGraph list =
array (minimum nodes, maximum nodes) list
where
nodes = map fst list
To see if a node is part of a cycle, one must check whether the nodes reachable from one node, excluding the node itself, contain that node. One can use the reachable
function to get the nodes that are reachable from a given node (including that node). Since a Graph
is an Array
, one can use assocs
to get back the list that it was built from, with type [(Node, [Node])]
. We use these three facts to build two functions:
-- | Calculates all the nodes that are part of cycles in a graph.
cyclicNodes :: Graph -> [Node]
cyclicNodes graph =
map fst . filter isCyclicAssoc . assocs $ graph
where
isCyclicAssoc = uncurry $ reachableFromAny graph
-- | In the specified graph, can the specified node be reached, starting out
-- from any of the specified vertices?
reachableFromAny :: Graph -> Node -> [Node] -> Bool
reachableFromAny graph node =
elem node . concatMap (Graph.reachable graph)
If you are interested in how the reachable
function works, I could go through all of it here, but it's fairly straight-forward to understand when you look at the code.
These functions are very efficient, but they could be vastly improved depending on how you want cycles to be represented in the end. You can for example use the stronglyConnComp
function in Data.Graph
to get a more streamlined representation.
Note that I'm abusing the fact that Node ~ Graph.Vertex ~ Int
in this case, so if your Node
s change type, you need to use appropriate conversion functions in Data.Graph
, like
graphFromEdges
, to get a Graph
and associated conversion functions.
The fgl
library is another alternative that also provides a complete suite of graph-related functionality that is extremely optimized.
Upvotes: 13
Reputation: 153162
I'd probably just cabal install fgl
and use the built-in DFS functions like components and similar.
Upvotes: 1