Reputation: 1777
So I'm starting a project on directed graphs and topological sorting. I'm looking for the most efficient way to parse an input file of courses in the form:
COURSE_1 COURSE_2
where COURSE_1 is a prerequisite to COURSE_2
NONE COURSE_3
where COURSE_3 has no prerequisites.
All vertex labels will be strings, obviously. I have created a Graph
class with data members to store vertices, edges, and vertex labels. Later, I will be adding methods to topologically sort and find the shortest path from one vertex to another. Given these future plans, my questions here are would it be better to use an adjacency list? or matrix? Also what would be the most efficient way to populate the graph from the input files? My initial thought was using an adjacency list. Since the size of the graph isn't known at compile time my idea
std::vector<std::list<std::string>> adjacencyList;
Also, I thought of creating a helper function to parse the input file. Something along the lines of
void populateGraph(std::string filename, Graph* graph)
Am I totally going in the wrong direction here?
Upvotes: 1
Views: 1453
Reputation: 317
You're on the right track with a lot of things.
If you can assume that all the inputs you will encounter are just NONE
and COURSE_X
and you can assume that all Xes form a continuous interval of ints starting from 1 (or 0) and span to the number of vertices, then you could just treat the vertices as numbers internally. If that is not the case, you can assign each vertex label a number (using std::unordered_map for example) and have this abstract structure anyway.
Now, if you choose to stick with this model, it is convenient to use, because your whole graph can be represented as std::vector<std::list<int>>
. You could substitute int
with some struct type if you want to store more info about the edge, like labels, weights, etc. Whenever you want to access a specific node's adjacency list, you just access the vector's cell under the node's id.
Obviously this is an adjacency list based solution. It's good for sparse graphs generally speaking. Think of it this way: if you're using a matrix for a sparse graph, a very big part of that allocated memory is not going to be used. Using adjacency graph eliminates this, but it takes away the constant access time to any arbitrary edge. This means that it can take linear time to check if a given edge exists. That being said, I wouldn't expect you to use that check in a topological sort. If you chose to use a matrix however, you would still need to map the vertices to numbers, to that part stays.
Last but not least, you can use pointers instead of integer ids. Basically you wouldn't look up vertices in the vector using an id, but you'd directly access a vertice through a stored pointer. You'd most likely still need to have the string -> pointer mapping, at least for the graph creation.
Upvotes: 1
Reputation: 1762
Concerning the choice between the adjacency list and the matrix, it depends on the nature of your graph and what you intend to do with it.
See this thread for example.
Also, did you have a look at what boost library offers for graphs ? There is also lemon as an alternative. Could be worth checking if what you intend to implement already exists.
Upvotes: 0