Reputation: 54074
In case I decide to use the "adjacency matrix" representation to represent a graph, how would I know what the size of the matrix should be?
All of the code examples I have seen, assume that the size is given to the Graph
object to initialize the matrix, but it is not clear to me where did this size came from.
I mean the I assume that to build, a graph the data of the graph are loaded e.g. from a file, and the best approach would be to create the graph as you read the file (right so far?).
This is very easy using the alternative representation which is lists.
But in the case of matrix
we don't know the number of vertices until the file is actually loaded. I don't believe that we first read the file (which could contain millions of vertices) and then build the graph. It seems to me like double processing.
So what is the usual/best way?
Note: I am aware of benefits of using lists instead, but since most docs say that on dense graphs, using a matrix is acceptable (but not for sparse) I am trying to understand how a program using the matrix
representation is implemented properly.
Upvotes: 0
Views: 1595
Reputation: 46862
the usual/best way is given in the comments: you reallocate and copy. if you scale the size each time (eg double) the amortized (average) cost comes out as O(n), as far as i remember. this is how, for example, python has lists that grow to be any size you need. it's completely standard/common.
[i'm not really posting for points here; just frustrated that the answers in comments are what you need, but that you may not be paying attention to them or understanding them]
Upvotes: 1
Reputation: 2804
You can estimate the size well if the input format is under your control. In the example below, all of the values have a fixed length (6 chars) with 0 padding to the left. This means that each line will be 22 bytes in size. When you divide the size of the file by 22, you get the approximation of the matrix size. If the graph is complete, the file size is equal to nxn
, where n is the number of vertices.
Example:
000001 000020 000030
000800 000001 000040
etc...
The other solution is to double the size of the matrix each time the size is exceeded. The amortized computational cost will be small. On the other hand, memory could be a problem.
Upvotes: 1
Reputation: 3859
If you don't have a count of vertices to begin with I believe you can only init the matrix with a boilerplate number. Let's say the input file only contains lines of the form x y z
which means there's an edge from x
to y
with an (optional) cost z
. Notice that the number of lines in the input file is the edge count of the graph, assuming the file contains nothing else. Since you already assume it is a dense graph, also assume that it is a complete graph which has number of edges equal to n(n-1)/2
where n
is the number of vertices. With this information you can come up with a close enough approximation of the number of vertices. For example, if the edge count from the input file is 1000000
then solving the equation
n*(n-1)/2 = 1000000
we arrive at n
approximately equal to 1415
. Initializing the matrix to 1415*1415
array should cover you nicely.
Of course all of this is based on the fact that you have a fast enough way to read the number of lines in the input file.
Upvotes: 1