JRun
JRun

Reputation: 679

python igraph Error at fast_community.c:553

First, thanks for reading and possibly responding to this.

Now, the question: I am on python 2.7, and I am getting this error when attempting to find communities in my graph using the fastgreedy algorithm:

    ---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
<ipython-input-180-3b8456851658> in <module>()
----> 1 dendrogram = g_summary.community_fastgreedy(weights=edge_frequency.values())

/usr/local/lib/python2.7/site-packages/igraph/__init__.pyc in community_fastgreedy(self, weights)
    959           in very large networks. Phys Rev E 70, 066111 (2004).
    960         """
--> 961         merges, qs = GraphBase.community_fastgreedy(self, weights)
    962 
    963         # qs may be shorter than |V|-1 if we are left with a few separated

InternalError: Error at fast_community.c:553: fast-greedy community finding works only on graphs without multiple edges, Invalid value

This is how I created my graph:

import igraph as ig
vertices = words  #about 600 words from a number of news articles: ['palestine', 'israel', 'hamas, 'nasa', 'mercury', 'water', ...]

gen = ig.UniqueIdGenerator()
[gen[word] for word in vertices]  #generate word-to-integer mapping as each edge has to be between integer ids (words)

edges = []
for ind in xrange(articles.shape[0]):  #  articles is a pandas dataframe; each row corresponds to an article; one column is 'top_words' which includes the top few words of each article.  The above list *words* is the unique union set of top_words for all articles.
    words_i = articles['top_words'].values[ind]  # for one article, this looks like ['palestine','israel','hamas']
    edges.extend([(gen[x[0]],gen[x[1]]) for x in combinations(words_i,2)])  #basically there is an edge for each pair of top_words in a given article. For the example article above, we get edges between israel-palestine, israel-hamas, palestine-hamas.

unique_edges = list(set(edges))
unique_edge_frequency = {}
for e in unique_edges:
    unique_edge_frequency[e] = edges.count(e)    

g = ig.Graph(vertex_attrs={"label": vertices}, edges=unique_edges, directed=False)
g.es['width'] = np.asarray([unique_edge_frequency[e] for e in unique_edge_frequency.keys()])*1.0/max(unique_edge_frequency.values())  

And this is what throws the error:

dendrogram = g.community_fastgreedy(weights=g.es['width'])

What am I doing wrong?

Upvotes: 0

Views: 2187

Answers (1)

Tam&#225;s
Tam&#225;s

Reputation: 48071

Your graph contains multiple edges (i.e. more than one edge between the same pair of nodes). The fast greedy community detection won't work on such graphs; you have to collapse the multiple edges into single ones with g.simplify().

It also seems like you are trying to set the "width" attibute of the edges based on how many edges there are between the same pair of vertices. Instead of constructing unique_edges and then unique_edge_frequency, you can simply do this:

g = Graph(edges, directed=False)
g.es["width"] = 1
g.simplify(combine_edges={ "width": "sum" })

This will simply create a graph with multiple edges first, then assign a width of 1 to each edge, and finally collapse the multiple edges into single ones while summing up their widths.

Upvotes: 3

Related Questions