Reputation: 412
I am working on the DBLP dataset (which contains the metadata of over 1.8 million publications, written by over 1 million authors, in several thousand journals and conference proceedings series). It has the following columns:
['id', 'title', 'authors', 'year', 'pub_venue', 'ref_id', 'ref_num', 'abstract']
I have to apply a community detection algorithm to this dataset, and my requirement is to get overlapping communities. For this I created a graph in igraph from the data above, where each id is a vertex and the ids in ref_id are used to create the edges. I tried several of the community detection algorithms available in igraph, but I am not getting the desired result.
I am currently using:
community_multilevel()
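The graph construction described above could be sketched roughly as follows. This is only an illustration under assumptions: each record is taken to be a dict whose ref_id field is a list of cited ids, and the helper name build_citation_edges and the toy records are hypothetical, not part of the original code.

```python
def build_citation_edges(records):
    """Map paper ids to consecutive vertex indices and collect citation edges."""
    # Assign each paper id a 0-based vertex index, in input order.
    index = {rec["id"]: i for i, rec in enumerate(records)}
    edges = []
    for rec in records:
        # Treat a missing or empty ref_id as "no references".
        for ref in rec.get("ref_id") or []:
            # Skip references to papers that are not in the dataset.
            if ref in index:
                edges.append((index[rec["id"]], index[ref]))
    return index, edges


# Hypothetical toy records; the real rows would come from the DBLP data.
records = [
    {"id": "p1", "ref_id": ["p2", "p3"]},
    {"id": "p2", "ref_id": ["p3"]},
    {"id": "p3", "ref_id": []},
    {"id": "p4", "ref_id": ["p9"]},  # p9 is outside the dataset
]

index, edges = build_citation_edges(records)
print(len(index), edges)
```

The resulting pair can then be handed to igraph, for example as igraph.Graph(n=len(index), edges=edges). Note that a paper such as p4, whose references all fall outside the dataset, ends up as an isolated vertex.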
The clustering I get back from this algorithm is only the partition with the best modularity. How can I get the clusters at different levels, or a dendrogram?
Edit: I used community_multilevel(return_levels=True). The dataset produces a sparse graph, so I expected to get denser communities at the higher levels, but the number of communities is nearly the same at each level and is not reduced much. I need something similar to partition_at_level in networkx.
Total number of vertices: 1632441
cl = g.community_multilevel(return_levels=True)
print(len(cl[0]), len(cl[1]), len(cl[2]), len(cl[3]))
Output: 1207787 1164960 1162115 1161959
Upvotes: 0
Views: 1324
Reputation: 48101
Please read the documentation of community_multilevel: it has a return_levels argument; setting it to True will return a list of disjoint community structures, one for each relevant resolution level identified by the algorithm.
Note that this won't be a true "overlapping" community structure, though, as each level identified by the algorithm will have disjoint communities.
Upvotes: 1