Largest rectangular sub matrix with the same number

TimeToCodeTheRoad picture TimeToCodeTheRoad · Oct 14, 2011 · Viewed 9.9k times · Source

I am trying to come up with a dynamic programming algorithm that finds the largest sub matrix within a matrix that consists of the same number:

example:

{5 5 8}
{5 5 7}
{3 4 1}

Answer : 4 elements due to the matrix

   5 5 
   5 5   

Answer

TMS picture TMS · Oct 14, 2011

This is a question I already answered here (and here, modified version). In both cases the algorithm was applied to binary case (zeros and ones), but the modification for arbitrary numbers is quite easy (but sorry, I keep the images for the binary version of the problem). You can do this very efficiently by two pass linear O(n) time algorithm - n being number of elements. However, this is not a dynamic programming - I think using dynamic programming here would be clumsy and inefficient in the end, because of the difficulties with problem decomposition, as the OP mentioned - unless its a homework - but in that case you can try to impress by this algorithm :-) as there's obviously no faster solution than O(n).

Algorithm (pictures depict binary case):

Say you want to find largest rectangle of free (white) elements.

enter image description here

Here follows the two pass linear O(n) time algorithm (n being number of elemets):

1) in a first pass, go by columns, from bottom to top, and for each element, denote the number of consecutive elements available up to this one:

enter image description here

repeat, until:

enter image description here

Pictures depict the binary case. In case of arbitrary numbers you hold 2 matrices - first with the original numbers and second with the auxiliary numbers that are filled in the image above. You have to check the original matrix and if you find a number different from the previous one, you just start the numbering (in the auxiliary matrix) again from 1.

2) in a second pass you go by rows, holding data structure of potential rectangles, i.e. the rectangles containing current position somewhere at the top edge. See the following picture (current position is red, 3 potential rectangles - purple - height 1, green - height 2 and yellow - height 3):

enter image description here

For each rectangle we keep its height k and its left edge. In other words we keep track of the sums of consecutive numbers that were >= k (i.e. potential rectangles of height k). This data structure can be represented by an array with double linked list linking occupied items, and the array size would be limited by the matrix height.

Pseudocode of 2nd pass (non-binary version with arbitrary numbers):

var m[] // original matrix
var aux[] // auxiliary matrix filled in the 1st pass
var rect[] // array of potential rectangles, indexed by their height
           // the occupied items are also linked in double linked list, 
           // ordered by height

foreach row = 1..N // go by rows
    foreach col = 1..M
        if (col > 1 AND m[row, col] != m[row, col - 1]) // new number
            close_potential_rectangles_higher_than(0);  // close all rectangles

        height = aux[row, col] // maximal height possible at current position

        if (!rect[height]) { // rectangle with height does not exist
            create rect[height]    // open new rectangle
            if (rect[height].next) // rectangle with nearest higher height
                                   // if it exists, start from its left edge
                rect[height].left_col = rect[height].next.left_col
            else
                rect[height].left_col = col; 
        }

        close_potential_rectangles_higher_than(height)
    end for // end row
    close_potential_rectangles_higher_than(0);
        // end of row -> close all rect., supposing col is M+1 now!

end for // end matrix

The function for closing rectangles:

function close_potential_rectangles_higher_than(height)
    close_r = rectangle with highest height (last item in dll)
    while (close_r.height > height) { // higher? close it
        area = close_r.height * (col - close_r.left_col)
        if (area > max_area) { // we have maximal rectangle!
            max_area = area
            max_topleft = [row, close_r.left_col]
            max_bottomright = [row + height - 1, col - 1]
        }
        close_r = close_r.prev
        // remove the rectangle close_r from the double linked list
    }
end function

This way you can also get all maximum rectangles. So in the end you get:

enter image description here

And what the complexity will be? You see that the function close_potential_rectangles_higher_than is O(1) per closed rectangle. Because for each field we create 1 potential rectangle at the maximum, the total number of potential rectangles ever present in particular row is never higher than the length of the row. Therefore, complexity of this function is O(1) amortized!

So the whole complexity is O(n) where n is number of matrix elements.