Using iGraph in Python for community detection and writing the community number of each node to CSV

CurtLH · Aug 12, 2014

I have a network that I would like to analyze using the edge_betweenness community detection algorithm in iGraph. I'm familiar with NetworkX, but I am trying to learn iGraph because it offers community detection methods that NetworkX does not.

My ultimate goal is to run edge_betweenness community detection, find the optimal number of communities, and write a CSV with the community membership of each node in the graph.

Below is my code as it currently stands. Any help figuring out community membership is greatly appreciated.

input data ('network.txt'):

1 2
2 3
2 7
3 1
4 2
4 6
5 4
5 6
7 4
7 8
8 9
9 7
10 7
10 8
10 9

iGraph code

import igraph

# load data into a graph
g = igraph.Graph.Read_Ncol('network.txt')

# plot graph
igraph.plot(g)

# identify communities
communities = g.community_edge_betweenness()

# not really sure what to do next
num_communities = communities.optimal_count
communities.as_clustering(num_communities)

What do I need to do to find the optimal number of communities and write the community that each node in the graph belongs to into a list?

Answer

Tamás · Aug 12, 2014

You are on the right track. The optimal number of communities (where "optimal" means "the number of communities that maximizes the modularity score") can be retrieved with communities.optimal_count, and the community structure can be converted into a flat, disjoint clustering with communities.as_clustering(num_communities). The number of communities can be omitted when it equals communities.optimal_count, because as_clustering() falls back to that value by default. The result is a VertexClustering object whose membership property gives you the cluster index of each vertex in the graph.
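To make the "maximizes the modularity score" criterion concrete, here is a minimal pure-Python sketch that scores two hand-picked clusterings of the network from the question. It assumes the edge list is treated as undirected and uses the standard Newman modularity formula; the modularity helper and both candidate clusterings are hypothetical illustrations, not igraph output, and igraph's own score may differ if the graph is loaded as directed.

```python
from collections import Counter

# Edge list from the question's network.txt, treated as undirected here
# (an assumption made for this sketch).
EDGES = [
    (1, 2), (2, 3), (2, 7), (3, 1), (4, 2), (4, 6), (5, 4), (5, 6),
    (7, 4), (7, 8), (8, 9), (9, 7), (10, 7), (10, 8), (10, 9),
]


def modularity(edges, membership):
    """Undirected Newman modularity of a flat clustering.

    membership maps each node to a community label.
    """
    m = len(edges)
    internal = Counter()    # edges with both endpoints in the same community
    degree_sum = Counter()  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two hand-picked candidate clusterings (hypothetical, not igraph output).
one_block = {node: 0 for node in range(1, 11)}
three_blocks = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 2, 8: 2, 9: 2, 10: 2}

q_one = modularity(EDGES, one_block)
q_three = modularity(EDGES, three_blocks)
print(q_one, q_three)
```

Putting every node in a single community scores 0, while the three-block split scores about 0.44, so a modularity-based criterion would prefer the latter of these two candidates.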

For the sake of clarity, I'm renaming your communities variable to dendrogram, because the edge betweenness community detection algorithm actually produces a dendrogram:

# calculate the dendrogram (graph is the Graph object, called g in the question)
dendrogram = graph.community_edge_betweenness()
# convert it into a flat clustering
clusters = dendrogram.as_clustering()
# get the membership vector
membership = clusters.membership

Now we can start writing the membership vector along with the node names into a CSV file:

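import csv

# "w" with newline="" is the Python 3 way to open a file for csv.writer
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # one row per vertex: node name, community index
    for name, cluster_id in zip(graph.vs["name"], membership):
        writer.writerow([name, cluster_id])

The code above targets Python 3. If you are using Python 2, use izip from itertools instead of zip and open the file with open("output.csv", "wb").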
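If you later want to check the result, one possible way is to read the rows back and group the node names by community. The sketch below writes to an in-memory buffer instead of output.csv so that it runs on its own; the names and membership lists are made-up placeholders standing in for graph.vs["name"] and clusters.membership, and the header row is an optional addition that the code above does not write.

```python
import csv
import io
from collections import defaultdict

# Placeholder data standing in for graph.vs["name"] and clusters.membership;
# these values are made up for illustration, not actual igraph output.
names = ["1", "2", "3", "4"]
membership = [0, 0, 1, 1]

# Write to an in-memory buffer so the sketch runs without touching disk.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["node", "community"])  # optional header row (an addition)
for name, cluster_id in zip(names, membership):
    writer.writerow([name, cluster_id])

# Read the rows back and group node names by community index.
buffer.seek(0)
groups = defaultdict(list)
for row in csv.DictReader(buffer):
    groups[int(row["community"])].append(row["node"])

print(dict(groups))
```

Grouping by community this way makes it easy to eyeball whether the clusters match what the plot suggests.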