Weighted Quick-Union with Path Compression algorithm

Aleksei Chepovoi picture Aleksei Chepovoi · Oct 2, 2012 · Viewed 13.4k times · Source

There is a "Weighted Quick-Union with Path Compression" algorithm.

The code:

public class WeightedQU 
{
    private int[] id;
    private int[] iz;

    public WeightedQU(int N)
    {
        id = new int[N];
        iz = new int[N];
        for(int i = 0; i < id.length; i++)
        {
            iz[i] = i;
            id[i] = i;
        }
    }

    public int root(int i)
    {
        while(i != id[i])
        {
            id[i] = id[id[i]];   // this line represents "path compression"
            i = id[i];
        }
        return i;
    }

    public boolean connected(int p, int q)
    {
        return root(p) == root(q);
    }

    public void union(int p, int q)   // here iz[] is used to "weighting"
    {
        int i = root(p);
        int j = root(q);
        if(iz[i] < iz[j])
        {
            id[i] = j;
            iz[j] += iz[i];
        }
        else
        {
            id[j] = i;
            iz[i] += iz[j];
        }
    }
}

Questions:

  1. How does the path compression work? id[i] = id[id[i]] means that we reach only the second ancester of our node, not the root.

  2. iz[] contains integers from 0 to N-1. How does iz[] help us know the number of elements in the set?

Can someone clarify this for me?

Answer

Andrew Tomazos picture Andrew Tomazos · Oct 2, 2012

First understand that id is a forest. id[i] is the parent of i. If id[i] == i it means that i is a root.

For some root i (where id[i] == i) then iz[i] is the number of elements in the tree rooted at i.

public int root(int i)
{
    while(i != id[i])
    {
        id[i] = id[id[i]];   // this line represents "path compression"
        i = id[i];
    }
    return i;
}

How does the path compression work? id[i] = id[id[i]] means that we reach only the second ancester of our node, not the root.

As we are ascending the tree to find the root we move nodes from their parents to their grandparents. This partially flattens the tree. Notice that this operation doesn't change which tree the node is a member of, this is all we are interested in. This is the path compression technique.

(You did notice the loop right? while(i == id[i]) terminates once i is a root node)

iz[] contains integers from 0 to N-1. How does iz[] help us know the number of elements in the set?

There is a transcription error in the code:

for(int i = 0; i < id.length; i++)
{
    iz[i] = i; // WRONG
    id[i] = i;
}

This is the correct version:

for(int i = 0; i < id.length; i++)
{
    iz[i] = 1; // RIGHT
    id[i] = i;
}

iz[i] is the number of elements for a tree rooted at i (or if i is not a root then iz[i] is undefined). So it should be initialized to 1, not i. Initially each element is a seperate "singleton" tree of size 1.