Reputation: 1397
I have a small dilemma I would like to be advised with -
I'm implementing a graph (directed) and I want to make it extra generic - that is Graph where T is the data in the the node(vertex). To add a vertex to the graph will be - add(T t). The graph will wrap T to a vertex that will hold T inside.
Next I would like to run DFS on the graph - now here comes my dilemma - Should I keep the "visited" mark in the vertex (as a member) or initiate some map while running the DFS (map of vertex -> status)?
Keeping it in the vertex is less generic (the vertex shouldn't be familiar with the DFS algo and implementation). But creating a map (vertex -> status) is very space consuming.
What do you think?
Thanks a lot!
Upvotes: 3
Views: 867
Reputation: 8005
If you need to run algorithms, especially the more complex ones, you will quickly find that you will have to associate all kinds of data with your vertices. Having a generic way to store data with the graph items is a good idea and of course the access time for reading and writing that data should be O(1), ideally. Simple implementations could be to use HashMap, which have O(1) acess time for most cases, but the factor is relatively high.
For the yFiles Graph Drawing Library they added a mechanism where the data is actually stored at the elements themselves, but you can allocate as many data slots as you like. This is similar to managing an Object[]
with each element and using the index into the data array as the "map". If your graph does not change, another strategy is to store the index of the elements in the graph with the elements themselves (just the integer) and then using that index to index into an array, where for each "data map" you have basically one array the size of the number of elements. Both techniques scale very well and provide the best possible access times, unless your data is really sparse (only a fraction of the elements actually need to store the data).
The "Object[]
at Elements" approach:
Object[]
that is package private. Map
interface that provides T getData(Vertex)
and void setData(Vertex, T
)HashMap<Vertex,T>
but the one I was suggesting actually only stores an integer index
that is used to index into the Object[]
arrays at the vertices. createMap
that keeps track of the used and free indices and creates a new instance of the above class whose getter and setter implementations use the package private field of the Vertex class to actually access the dataThe "One Array" approach:
0
, etc.T[]
that has the size of the
number of vertices. For the DFS algorithm I would choose the "one array"-approach as you could use a byte[] (or if "Visited" is all that is required you could even use a BitSet
) for space efficiency and you are likely to populate the data for all vertices in DFS if your graph is connected. This should perform a lot better than a HashMap based approach and does not require the boxing and unboxing for storing the data in the Object[]
.
Upvotes: 3