Python connected components

fege · Apr 24, 2012

I'm writing a function get_connected_components for a class Graph:

def get_connected_components(self):
    path=[]
    for i in self.graph.keys():
        q=self.graph[i]
        while q:
            print(q)
            v=q.pop(0)
            if not v in path:
                path=path+[v]
    return path

My graph is:

{0: [(0, 1), (0, 2), (0, 3)], 1: [], 2: [(2, 1)], 3: [(3, 4), (3, 5)], \
4: [(4, 3), (4, 5)], 5: [(5, 3), (5, 4), (5, 7)], 6: [(6, 8)], 7: [], \
8: [(8, 9)], 9: []}

where the keys are the nodes and the values are the edges. My function gives me this connected component:

[(0, 1), (0, 2), (0, 3), (2, 1), (3, 4), (3, 5), (4, 3), (4, 5), (5, 3), \
(5, 4), (5, 7), (6, 8), (8, 9)]

But I would like to get two different connected components, like:

[[(0, 1), (0, 2), (0, 3), (2, 1), (3, 4), (3, 5), (4, 3), (4, 5), \
(5, 3), (5, 4), (5, 7)],[(6, 8), (8, 9)]]

I don't understand where I made the mistake. Can anyone help me?

Answer

pillmuncher · Dec 12, 2012

I like this algorithm:

def connected_components(neighbors):
    # neighbors maps each node to the set of nodes adjacent to it.
    seen = set()
    def component(node):
        # Lazily yield every node reachable from the starting node.
        nodes = {node}
        while nodes:
            node = nodes.pop()
            seen.add(node)
            nodes |= neighbors[node] - seen
            yield node
    for node in neighbors:
        if node not in seen:
            yield component(node)

Not only is it short and elegant, it is also fast: each node is processed only once. Use it like so (Python 2.7):

from collections import defaultdict

old_graph = {
    0: [(0, 1), (0, 2), (0, 3)],
    1: [],
    2: [(2, 1)],
    3: [(3, 4), (3, 5)],
    4: [(4, 3), (4, 5)],
    5: [(5, 3), (5, 4), (5, 7)],
    6: [(6, 8)],
    7: [],
    8: [(8, 9)],
    9: []}

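# Collect every edge tuple, then build an undirected adjacency map.
edges = {v for k, vs in old_graph.items() for v in vs}
graph = defaultdict(set)

for v1, v2 in edges:
    graph[v1].add(v2)
    graph[v2].add(v1)

# For each component, keep the original edges that touch any of its nodes.
components = []
for component in connected_components(graph):
    c = set(component)  # consume the generator before asking for the next one
    components.append([edge for edges in old_graph.values()
                            for edge in edges
                            if c.intersection(edge)])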

print(components)

The result is:

[[(0, 1), (0, 2), (0, 3), (2, 1), (3, 4), (3, 5), (4, 3), (4, 5), (5, 3), (5, 4), (5, 7)],
 [(6, 8), (8, 9)]]
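
One caveat with the generator version: each component generator is lazy and shares `seen` with the outer loop, so every component has to be consumed (as `set(component)` does above) before the next one is requested. Calling `list(connected_components(graph))` first would produce one generator per node rather than one per component. Below is a minimal eager sketch that avoids this, assuming `neighbors` maps each node to a set of adjacent nodes. The name `connected_components_eager` and the small `example` graph are made up for illustration:

```python
def connected_components_eager(neighbors):
    """Return a list of node sets, one per connected component."""
    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        # Collect everything reachable from `start` before moving on.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            # .get() avoids inserting keys if `neighbors` is a defaultdict.
            stack.extend(neighbors.get(node, set()) - seen)
        components.append(component)
    return components


# Hypothetical undirected example: 0-1-2 form one component, 3-4 another.
example = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}}
result = connected_components_eager(example)
print(result)
```

Because the whole component is built before the loop continues, the returned list can be stored and iterated in any order.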

Thanks, aparpara, for spotting the bug.
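
As for the function in the question: it appends every edge from every key to a single `path` list and never checks which nodes are reachable from which, so it can only ever return one flat list. In addition, `q.pop(0)` empties the lists stored in `self.graph` along the way. A different way to group the edges directly is sketched here with a disjoint-set (union-find) structure rather than the traversal above. The names `components_from_edge_lists`, `find`, and `question_graph` are illustrative, and the input is assumed to be the question's dict of node to edge-tuple lists, with edges treated as undirected:

```python
def components_from_edge_lists(edge_lists):
    """Group edges by connected component, treating edges as undirected."""
    parent = {node: node for node in edge_lists}

    def find(node):
        # Follow parent links to the representative, halving paths as we go.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Union the endpoints of every edge.
    for edges in edge_lists.values():
        for a, b in edges:
            parent[find(a)] = find(b)

    # Bucket each edge under its component's representative.
    grouped = {}
    for edges in edge_lists.values():
        for edge in edges:
            grouped.setdefault(find(edge[0]), []).append(edge)
    return list(grouped.values())


question_graph = {
    0: [(0, 1), (0, 2), (0, 3)], 1: [], 2: [(2, 1)],
    3: [(3, 4), (3, 5)], 4: [(4, 3), (4, 5)],
    5: [(5, 3), (5, 4), (5, 7)], 6: [(6, 8)], 7: [], 8: [(8, 9)], 9: []}
for component in components_from_edge_lists(question_graph):
    print(component)
```

This sketch does not modify the input dict. Nodes that have no edges at all do not appear in the output, since it groups edges rather than nodes.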