Reputation: 45
I am using networkx to calculate eigenvector centrality. The problem is that it takes way too long (it has already been running for about 6 hours). Is there a faster way to get the results?
There are around 200,000 nodes in the graph and 60,000,000 edges.
Upvotes: 3
Views: 2269
Reputation: 33147
By looking at the source code, you can see that networkx.algorithms.centrality.eigenvector
uses the power method to find the leading eigenvector.
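For reference, the power method just multiplies a vector by the adjacency matrix over and over and renormalises it, so every iteration is a full pass over all 60M edges; a minimal sketch (not networkx's exact implementation):
import numpy as np

def power_iteration(A, iters=100):
    # Start from a uniform vector and repeatedly apply A.
    x = np.ones(A.shape[0]) / A.shape[0]
    for _ in range(iters):
        x = A @ x                  # one pass over every edge
        x /= np.linalg.norm(x)     # renormalise so the vector stays bounded
    return x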
If you stick with networkx, use this, as Joel noted:
centrality = nx.eigenvector_centrality_numpy(G)
Alternatively, you can use scipy.sparse.linalg.eigs,
which uses ARPACK, and request that only one eigenvector be returned.
Toy example:
import networkx as nx
from scipy.sparse.linalg import eigs

# Sparse 200,000 x 200,000 adjacency matrix of G; eigs needs a float dtype.
# (In older networkx the function is called to_scipy_sparse_matrix.)
X = nx.to_scipy_sparse_array(G, dtype=float)
# k=1 requests only the leading eigenpair; which='LR' selects the
# eigenvalue with the largest real part.
vals, vecs = eigs(X, k=1, which='LR')
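If you go this route, you still need to map the eigenvector entries back to node labels; a minimal sketch, assuming the node order of G matches the matrix rows (the default for nx.to_scipy_sparse_array):
import numpy as np

vec = np.abs(np.real(vecs[:, 0]))  # leading eigenvector; fix the sign via abs
vec /= np.linalg.norm(vec)         # unit norm, roughly what eigenvector_centrality_numpy returns
centrality = dict(zip(G.nodes(), vec))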
In any case, 200,000 by 200,000 is big, and depending on the sparsity and the nature of the matrix, the algorithm may take a while. You will also need a lot of CPU and RAM.
Additional tip for networkx.algorithms.centrality.eigenvector:
If you stick with networkx, try relaxing the tolerance:
eigenvector_centrality(G, max_iter=100, tol=1e-06, nstart=None, weight=None)
Try setting tol=1e-04 or even tol=1e-03.
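For example, assuming G is the graph from the question:
import networkx as nx

# A looser tol means fewer power iterations; raising max_iter guards
# against hitting the iteration limit before convergence.
centrality = nx.eigenvector_centrality(G, max_iter=1000, tol=1e-04)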
Upvotes: 2
Reputation: 23827
Try using eigenvector_centrality_numpy. From the documentation:
This algorithm uses the SciPy sparse eigenvalue solver (ARPACK) to find the largest eigenvalue/eigenvector pair.
So this will do serafeim's calculation, with a tiny bit of additional processing.
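In practice it is a one-line swap; for example, to inspect the ten most central nodes afterwards (assuming G is the graph from the question):
import networkx as nx

centrality = nx.eigenvector_centrality_numpy(G)
top10 = sorted(centrality, key=centrality.get, reverse=True)[:10]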
Upvotes: 1