Boolean matrix multiplication algorithm

user4452139 picture user4452139 · Apr 12, 2015 · Viewed 8k times · Source

This is my first question on stackoverflow. I've been solving some exercises from "Algorithm design" by Goodrich, Tamassia. However, I'm quite clueless about this problem. Unusre where to start from and how to proceed. Any advice would be great. Here's the problem:

Boolean matrices are matrices such that each entry is 0 or 1, and matrix multiplication is performed by using AND for * and OR for +. Suppose we are given two NxN random Boolean matrices A and B, so that the probability that any entry in either is 1, is 1/k. Show that if k is a constant, then there is an algorithm for multiplying A and B whose expected running time is O(n^2). What if k is n?

Answer

samgak picture samgak · Apr 13, 2015

Matrix multiplication using the standard iterative approach is O(n3), because you have to iterate over n rows and n columns, and for each element do a vector multiply of one of the rows and one of the columns, which takes n multiplies and n-1 additions.

Psuedo code to multiply matrix a by matrix b and store in matrix c:

for(i = 0; i < n; i++)
{
    for(j = 0; j < n; j++)
    {
        int sum = 0;
        for(m = 0; m < n; m++)
        {
            sum += a[i][m] * b[m][j];
        }
        c[i][j] = sum;
    }
}

For a boolean matrix, as specified in the problem, AND is used in place of multiplication and OR in place of addition, so it becomes this:

for(i = 0; i < n; i++)
{
    for(j = 0; j < n; j++)
    {
        boolean value = false;
        for(m = 0; m < n; m++)
        {
            value ||= a[i][m] && b[m][j];
            if(value)
                break; // early out
        }
        c[i][j] = value;
    }
}

The thing to notice here is that once our boolean "sum" is true, we can stop calculating and early out of the innermost loop, because ORing any subsequent values with true is going to produce true, so we can immediately know that the final result will be true.

For any constant k, the number of operations before we can do this early out (assuming the values are random) is going to depend on k and will not increase with n. At each iteration there will be a (1/k)2 chance that the loop will terminate, because we need two 1s for that to happen and the chance of each entry being a 1 is 1/k. The number of iterations before terminating will follow a Geometric distribution where p is (1/k)2, and the expected number of "trials" (iterations) before "success" (breaking out of the loop) doesn't depend on n (except as an upper bound for the number of trials) so the innermost loop runs in constant time (on average) for a given k, making the overall algorithm O(n2). The Geometric distribution formula should give you some insight about what happens if k = n. Note that in the formula given on Wikipedia k is the number of trials.