Finding separate graphs within a graph object in networkx

LittleBobbyTables picture LittleBobbyTables · Feb 12, 2014 · Viewed 8.4k times · Source

I have an enormous graph dataset - let's say it is like this, but on a much bigger level:

1 -> 2
3 -> 4

1,2,3,4 are nodes and the arrows are directed edges. Let's say that they are all in a single graph object:

import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([1,2,3,4])
G.add_edge(1,2)
G.add_edge(3,4)

Given an object like this, which has two mini graphs within a graph, how can we pull out each mini graph? I feel like there must be some word for this? My end result would look like:

for mini_graph in G:
    print mini_graph.nodes()

...
[1,2]
[3,4]

Answer

Bonlenfum picture Bonlenfum · Feb 13, 2014

If the parts of the graph are truly disjoint (as per your small example), then consider extracting the subgraphs with connected_component_subgraphs().

This only works on an undirected graph, so if you are using a directed graph then you'll need to convert to undirected first.

import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([1,2,3,4])
G.add_edge(1,2)
G.add_edge(3,4)

# make an undirected copy of the digraph
UG = G.to_undirected()

# extract subgraphs
sub_graphs = nx.connected_component_subgraphs(UG)

for i, sg in enumerate(sub_graphs):
    print "subgraph {} has {} nodes".format(i, sg.number_of_nodes())
    print "\tNodes:", sg.nodes(data=True)
    print "\tEdges:", sg.edges()

which yields:

subgraph 1 has 2 nodes
    Nodes: [(1, {}), (2, {})]
    Edges: [(1, 2)]
subgraph 1 has 2 nodes
    Nodes: [(3, {}), (4, {})]
    Edges: [(3, 4)]

and you could use the subgraph node labels to operate on your data in the initial graph,

sg.nodes()[0] in G
>>>  True

Reading the answer linked by EdChum, it appears that weakly_connected_component_subgraphs() operates on a directed graph but treats it as undirected, so saving the copy might be crucial. However, the docs on this and the related function weakly_connected_components() are a bit thin at present.