JAVA Graph/DFS implementation

Question

I have a small dilemma I would like to be advised with -

I'm implementing a graph (directed) and I want to make it extra generic - that is Graph where T is the data in the the node(vertex). To add a vertex to the graph will be - add(T t). The graph will wrap T to a vertex that will hold T inside.

Next I would like to run DFS on the graph - now here comes my dilemma - Should I keep the "visited" mark in the vertex (as a member) or initiate some map while running the DFS (map of vertex -> status)?

Keeping it in the vertex is less generic (the vertex shouldn't be familiar with the DFS algo and implementation). But creating a map (vertex -> status) is very space consuming.

What do you think?

Thanks a lot!

Sebastian · Accepted Answer

If you need to run algorithms, especially the more complex ones, you will quickly find that you will have to associate all kinds of data with your vertices. Having a generic way to store data with the graph items is a good idea and of course the access time for reading and writing that data should be O(1), ideally. Simple implementations could be to use HashMap, which have O(1) acess time for most cases, but the factor is relatively high.

For the yFiles Graph Drawing Library they added a mechanism where the data is actually stored at the elements themselves, but you can allocate as many data slots as you like. This is similar to managing an Object[] with each element and using the index into the data array as the "map". If your graph does not change, another strategy is to store the index of the elements in the graph with the elements themselves (just the integer) and then using that index to index into an array, where for each "data map" you have basically one array the size of the number of elements. Both techniques scale very well and provide the best possible access times, unless your data is really sparse (only a fraction of the elements actually need to store the data).

The "Object[] at Elements" approach:

In your vertex and edge class, add a field of type Object[] that is package private.
Implement a Map interface that provides T getData(Vertex) and void setData(Vertex, T)
One implementation of that interface could be backed by a HashMap but the one I was suggesting actually only stores an integer index that is used to index into the Object[] arrays at the vertices.
In your graph class add a method createMap that keeps track of the used and free indices and creates a new instance of the above class whose getter and setter implementations use the package private field of the Vertex class to actually access the data

The "One Array" approach:

Add a package private integer field to your Vertex class
Keep the integer fields in sync with the order of the vertices in your graph - the first Vertex has index 0, etc.
In the alternative map implementation, you initially allocate one T[] that has the size of the number of vertices.
In the getter and setter implementations you take the index of the Vertex and use that to access the values in the array.

For the DFS algorithm I would choose the "one array"-approach as you could use a byte[] (or if "Visited" is all that is required you could even use a BitSet) for space efficiency and you are likely to populate the data for all vertices in DFS if your graph is connected. This should perform a lot better than a HashMap based approach and does not require the boxing and unboxing for storing the data in the Object[].

JAVA Graph/DFS implementation

Answers (1)

Related Questions