I have an enormous graph dataset. Let's say it looks like this, but on a much larger scale:
1 -> 2
3 -> 4
1,2,3,4 are nodes and the arrows are directed edges. Let's say that they are all in a single graph object:
import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([1,2,3,4])
G.add_edge(1,2)
G.add_edge(3,4)
Given an object like this, which has two mini graphs within a graph, how can we pull out each mini graph? I feel like there must be a word for this. My end result would look like:
for mini_graph in G:
    print mini_graph.nodes()
...
[1,2]
[3,4]
If the parts of the graph are truly disjoint (as in your small example), then consider extracting the subgraphs with connected_component_subgraphs().
This only works on an undirected graph, so if you are using a directed graph then you'll need to convert to undirected first.
import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([1,2,3,4])
G.add_edge(1,2)
G.add_edge(3,4)
# make an undirected copy of the digraph
UG = G.to_undirected()
# extract subgraphs
sub_graphs = nx.connected_component_subgraphs(UG)
for i, sg in enumerate(sub_graphs):
    print "subgraph {} has {} nodes".format(i, sg.number_of_nodes())
    print "\tNodes:", sg.nodes(data=True)
    print "\tEdges:", sg.edges()
which yields:
subgraph 0 has 2 nodes
    Nodes: [(1, {}), (2, {})]
    Edges: [(1, 2)]
subgraph 1 has 2 nodes
    Nodes: [(3, {}), (4, {})]
    Edges: [(3, 4)]
You can then use the subgraph node labels to operate on your data in the initial graph:
>>> sg.nodes()[0] in G
True
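For anyone on a newer setup: as far as I know, connected_component_subgraphs() was deprecated and then removed from NetworkX (2.4 onward), and the code above uses Python 2 print statements. Here is a minimal sketch of the same idea for Python 3, assuming a recent NetworkX. It uses connected_components() to get the node sets from the undirected copy and then takes subgraphs of the original directed graph, so the edge directions are kept. The variable names (components, sub_graphs) are just my own choices.

```python
import networkx as nx

G = nx.DiGraph()
G.add_nodes_from([1, 2, 3, 4])
G.add_edge(1, 2)
G.add_edge(3, 4)

# Undirected copy, used only to find which nodes belong together
UG = G.to_undirected()

# connected_components() yields one set of node labels per component
components = [sorted(c) for c in nx.connected_components(UG)]

# Build each mini graph from the ORIGINAL directed graph so that edge
# directions and attributes survive; .copy() makes it independent of G
sub_graphs = [G.subgraph(c).copy() for c in components]

for i, sg in enumerate(sub_graphs):
    print("subgraph {} has {} nodes".format(i, sg.number_of_nodes()))
    print("\tNodes:", sorted(sg.nodes()))
    print("\tEdges:", sorted(sg.edges()))
```

If you only need to read from the pieces, you can probably drop the .copy() and work with the subgraph views instead, which avoids duplicating a very large graph.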
Reading the answer linked by EdChum, it appears that weakly_connected_component_subgraphs() operates on a directed graph but treats it as undirected, so saving the copy might be crucial. However, the docs on this and the related function weakly_connected_components() are a bit thin at present.
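To illustrate the weakly connected route, here is a rough sketch for Python 3 and a recent NetworkX. It relies only on weakly_connected_components(), which takes the DiGraph directly and yields sets of nodes, and then builds each mini graph with G.subgraph(). I have not used weakly_connected_component_subgraphs() here because, as far as I know, it is no longer available in newer releases. The name weak_subgraphs is just a placeholder of mine.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([(1, 2), (3, 4)])

# weakly_connected_components() accepts the DiGraph as-is and ignores
# edge direction when grouping nodes, so no undirected copy is needed
weak_subgraphs = [
    G.subgraph(nodes).copy() for nodes in nx.weakly_connected_components(G)
]

# Each mini graph is still a DiGraph with the original edge directions
for mini_graph in weak_subgraphs:
    print(sorted(mini_graph.nodes()))
```

This is close to the loop the question asks for, with the list of subgraphs standing in for iterating over G itself.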