Maximum Rectangular Area in Histogram

The description of the maximum rectangular area in histogram problem is fairly simple. Given an array of the height of continuous bars with equal width, and the goal is to find the maximum rectangular area inside them. For example, the following figure shows the maximum rectangular area in pink of height array [1, 2, 3, 4, 5], which is 9.

Let's breakdown the problem. To form such a rectangular area, the following constraint must be satisfied.

All bars inside the area is at least as high as the rectangular.

Which suggests that no matter how we choose the rectangular, all involved bars must be continuous and the best height we could reach is limited to the lowest one.

As shown above, the maximum height of the rectangular area is limited to the lowest involved bar. (The height of the lowest one is mark in pink).

And as you may see, the width of the rectangular is not always the wider the better. Because of the wider range, chances are that we may encounter limitation of the height. But also, the wider range it is, it's more possible that a larger rectangular area could be formed.

Let's try a little bit smarter approach that may meet the possibility and avoid the limitation. Let's start from the bar with index 0, remember its height, currently, the maximum height is the height of bar 0. And then we forward to next one. If the next bar is at least as high as currently maximum height, we could continue the process. Otherwise, the maximum height will be limited to the next one. The following picture demonstrates a possible instance.

Thus part of our goal is to try to find out non-descent sequences of the height array. And it could be explained for a very simple reason, the area is calculated by multiplying width and height. Non-descent sequences would make the rectangular as wider as possible. And meanwhile, it also promises the height.

Well then, suppose we started from bar 0 and now encountered a bar which has a lower height than previous maximum height. How do we calculate the maximum area of current subsequence? Let's draw some graphs first to analyze.

We may look back from bar 5, and that would be bar 4 which heights 5. Now, given that the subsequence is a non-descent one, [2, 2, 3, 4, 5], and bar 4 is at the end. Thus the best it could do is $Area=5\times 1=5$.

And the next we come to bar 3 with height 4. The non-descent characteristics would guarantee that all following bars is at least as high as bar 3. And there is 1 more bar followed bar 3. So the area would be the least height promised (which is 4 now) times the number of bars counted from current one (2, bar 3 and bar 4), and it would be $Area=4\times 2$.

Step to bar 2. As aforementioned characteristics, it promises the height is 3 at least. And there are bar 3, bar 4 followed it. Then the area is calculated as the least height, 3, times the number of bars, 3. $Area=3\times 3=9$.

Similarly, as for bar 1, the area could be formed is calculated by the promised height, 2, times the number of bars followed, 4. $Area=2\times 4=8$.

Finally, bar 0. $Area=2\times 5=10$.

At this point, we have processed the subsequence, and the maximum area is 10 by now, and it is formed from bar 0 to bar 4, with height equals to 2.

Let's write this part in code!

int largestRectangleArea(vector<int> heights) {
    int max_area = 0;
    int area = 0;

    int index = 0;
    int current_maximum_height = -1;
    for (;index < heights.size();index++) {
        if (heights[index] < current_maximum_height) {
            break;
        } else { // (heights[index] >= current_maximum_height)
            current_maximum_height = heights[index];
        }
    }

    // we have encountered a bar which has a lower height
    // let's look back
    for (int lookback = index - 1;lookback >= 0;lookback--) {
        // the area would be the least height promised times
        // the number of bars counted from current one
        area = heights[lookback] * (index - lookback);

        // if there is any rectangular has a larger area
        // then that would be our new answer
        if (area > max_area) {
            max_area = area;
        }
    }

    return max_area;
}

But what's next? We still leave bar 5 untouched. Though we could see that the maximum rectangular area would be 6 if bar 5 is involved in, computers won't be able to see that. And in some cases, the non-descent subsequence is not a necessary of forming the rectangular with maximum area. A tiny example below would verify this point.

Back to our previous case. Now we have done the subsequence from bar 0 to bar 4, and currently encounter bar 5, which has a lower height compared to previous maximum height in a row.

This suggests that, if bar 5 is involved in forming the rectangle with either its previous bars or followed bars, the maximum height would be as high as bar 5 at best.

At this point, however, if we look back one by one, the time complexity of this algorithm would be $O(n^2)$ in worst case! The algorithm needs to be smarter.

Let's review the very moment. We can clearly see that there is no bars lower than bar 5, which implies that we could directly calculate from bar 0.

And above figure shows another similar case. In this case, bar 0 and bar 1 is lower than bar 5, and bar 2 is the same as bar 5. This gives us 2 possible rectangles, rendered in green and blue below.

Any lightbulb? Yes, we need to somehow remember the corresponding bars. Why? Because since bar 3 and bar 4 is higher than bar 5, then it would be naturally included in either rectangles. The height of bar 3 and bar 4 won't a limitation. Thus we could straightly start at bar 2 and then look back.

Besides, if you render every possible rectangle followed the algorithm we discussed by far with the case above, you will find that when we first stopped and looked back, we calculated the rectangle from bar 0 to bar 4, with height equals to 2. And then, we calculated a slightly larger rectangle which range from bar 0 to bar 5, with the same height.

And it's also true for bar 2 to bar 4 with height equals to 3, and bar 3 to bar 5 with the same height.

We could save some calculations, right? Because bar 5 is higher than bar 0 and bar 1, and equals to bar 3, then the smaller rectangle is guaranteed to be part of a larger rectangle!

How to avoid the redundant calculations? And how to remember the corresponding bars? Well, from the two examples above, you may find that the key is height.

The bars are lower than or equals to bar 5 don't need to be calculated when we first look back. Because the smaller rectangle will be guaranteed to be part of a larger rectangle.
We go straightly to the bar that comes with the same height as bar 5. Because bar 3 and bar 4 are higher than bar 5, so they're guaranteed to be included.

Now, you may have a feeling that, when we encountered bar 5, which heights 3, then we only need bar 4 and bar 3 for the process of looking back. After that, bar 4 and bar 3 will be ignored, and we then process bar 2, bar 1 and bar 0 with bar 5.

The indexes of bars we used in calculations looks like in a reversed manner! But let's have a break and make some changes according to our analysis of look back.

int largestRectangleArea(vector<int> heights) {
    int max_area = 0;
    int area = 0;

    int index = 0;
    int current_maximum_height = -1;
    for (;index < heights.size();index++) {
        if (heights[index] < current_maximum_height) {
            break;
        } else { // (heights[index] >= current_maximum_height)
            current_maximum_height = heights[index];
        }
    }

    // we have encountered a bar which has a lower height
    // let's look back
    for (int lookback = index - 1;lookback >= 0;lookback--) {
        // we only consider the bars that are higher than current one
        if (heights[lookback] <= heights[index]) {
            break;
        }

        // the area would be the least height promised times
        // the number of bars counted from current one
        area = heights[lookback] * (index - lookback);

        // if there is any rectangular has a larger area
        // then that would be our new answer
        if (area > max_area) {
            max_area = area;
        }
    }

    return max_area;
}

And what if the case given goes as below?

We encountered bar 5 which breaks the non-descent characteristics. And suppose we have done the first look back, which gives 2 rectangles (rendered in pink and blue). Shall we start the look back from bar 5? Or we continue the process of finding non-descent subsequence from bar 5?

If your answer is the former, then we would have the following rectangles.

And then step to bar 6, you will immediately find that these rectangles are guaranteed in larger ones.

So we could come to a conclusion, we always keep finding the non-descent subsequence, until the next bar is lower than current subsequence's maximum height, or reaches the end. Once a subsequence is found, we calculate backwards for all bars higher than the one that breaks the non-descent characteristics.

Take the case above as example, the height array is [2, 2, 3, 4, 5, 3, 4]. Follow the algorithm we updated just now, we may write the index of bars, the operations and our non-descent sequence below. The key steps are shown in the figure below.

Take bar 0. Non-descent satisfied. [0]
Take bar 1. Non-descent satisfied. [0, 1]
Take bar 2. Non-descent satisfied. [0, 1, 2]
Take bar 3. Non-descent satisfied. [0, 1, 2, 3]
Take bar 4. Non-descent satisfied. [0, 1, 2, 3, 4]
Take bar 5. Non-descent break. [0, 1, 2, 3, 4]
Look bar 4, with boundary in range (3, 5). Area is 5. [0, 1, 2, 3]
Look bar 3, with boundary in range (2, 5). Area is 8. [0, 1, 2]
Look bar 2, with boundary in range (1, 5). Bar 2 <= bar 5, stop look back. [0, 1, 2]
Take bar 5. Non-descent satisfied. [0, 1, 2, 5]
Take bar 6. Non-descent satisfied. [0, 1, 2, 5, 6]
End. Non-descent break.[0, 1, 2, 5, 6]
Look bar 6, with boundary in range (5, 7). Area is 4. [0, 1, 2, 5]
Look bar 5, with boundary in range (2, 7). Area is 12. [0, 1, 2].
Look bar 2, with boundary in range (1, 7). Area is 15. [0, 1].
Look bar 1, with boundary in range (0, 7). Area is 12. [0].
Look bar 0, with boundary in range (-1, 7). Area is 14. []

If you look at the sequence arrays, you'll find that the way we put in and take out is exactly the same manner as a stack would do!

And we could implement the algorithm in real code now.

#include <iostream>
#include <vector>
#include <stack>

using namespace std;

#define DEBUG // comment this whole line to turn off debug info
#ifdef DEBUG
#define debug_printf(...) do{ printf( __VA_ARGS__ ); } while ( false )
#else
#define debug_printf(...) static_cast<void>(0)
#endif

void debug_stack(stack<int> &s) {
#ifdef DEBUG
    // stack<> debug helper function from
    // https://stackoverflow.com/a/18851292
    if(s.empty())
    {
        cout << endl;
        return;
    }
    int x= s.top();
    s.pop();
    debug_stack(s);
    s.push(x);
    debug_printf("%d ", x);
#endif
}

int maximum_rectangle_area(vector<int> heights) {
    int max_area = 0;
    int area = 0;

    // add a bar that has no height for convenience
    heights.push_back(0);
    stack<int> non_descent_seq;
    int index = 0;
    for (;index < heights.size();) {
        if (non_descent_seq.empty() || heights[non_descent_seq.top()] <= heights[index]) {
            // (heights[index] >= current_maximum_height)
            // non-descent satisfied
            non_descent_seq.push(index);
            index++;

            // debug
            debug_printf("non_descent_seq satisfied: ");
            debug_stack(non_descent_seq);
            debug_printf("\n");
        } else {
            // we have encountered a bar which has a lower height
            // let's look backwards
            while (!non_descent_seq.empty()) {
                int bar_looking_at = non_descent_seq.top();

                if (heights[bar_looking_at] <= heights[index]) {
#ifdef DEBUG
                    int top = non_descent_seq.top();
                    non_descent_seq.pop();
                    debug_printf("Look bar %d, with boundary in range (%d, %d). Bar %d <= bar %d, stop look back.\n", bar_looking_at, non_descent_seq.top(), index, bar_looking_at, index);
                    non_descent_seq.push(top);
#endif
                    break;
                } else {
                    non_descent_seq.pop();
                    if (non_descent_seq.empty()) {
                        // lowest height * number of bars
                        area = heights[bar_looking_at] * index;

                        debug_printf("1. area: %d*%d\n", heights[bar_looking_at], index);
                    } else {
                        // area = promised height * number of bars followed
                        area = heights[bar_looking_at] * (index - non_descent_seq.top() - 1);

                        // debug
                        debug_printf("Look bar %d, with boundary in range (%d, %d)\n", bar_looking_at, non_descent_seq.top(), index);
                        debug_printf("non_descent_seq satisfied: ");
                        debug_stack(non_descent_seq);
                        debug_printf("\n");
                        debug_printf("2. area: %d*%d\n", heights[bar_looking_at], index - non_descent_seq.top() - 1);
                    }
                    debug_printf("\n");
                    if (area > max_area) {
                        max_area = area;
                    }
                }
            }

        }
    }
    return max_area;
}

int main(int argc, char *argv[]) {
    cout << maximum_rectangle_area({2,2,3,4,5,3,4}) << '\n';
}

Cocoa

Maximum Rectangular Area in Histogram

One thought on “Maximum Rectangular Area in Histogram”

Leave a Reply Cancel reply

いまが最高！