• An Interesting Exercise in Dynamic Programming

    Given an array of building heights (with unit width), calculate the area of the largest rectangle that “fits” within these buildings.

    For example, for an array A = [3,2,3,2,2,1]:

     x   x
     x x x x x
     x x x x x x
    [3,2,3,2,2,1]
    

    the correct answer is 10.

    You might want to attempt the question first before proceeding further.


    The brute force method takes O(n^3) time by taking each pair and finding the minimum, which is the “height”, and multiplying it by the “width”.

    However, there’s actually a linear time solution.

    Let us iterate through the building array from left to right.

    For some building i, we want to find out the left and right boundaries of the largest rectangle whose height is the height of building i and which includes building i itself.

    Let’s talk about finding the left boundary first. To do this, we can, for each index i, iterate leftwards and check. This results in a quadratic time solution. However, we can do better than this. The insight to achieving linear time is the fact that, when looking for the boundary of the rectangle, we can “throw away” buildings to the left of i which are higher than building i itself. In effect, we are only looking for the building that will “prevent” the rectangle from extending further. The reason we can do this safely is because, for future calculations (of buildings to the right of building i), these buildings won’t be considered in any case because the (current) building i is shorter than them and would be “bottlenecking” them.

    We can use a stack to do this. For a building i, we push it onto the stack if it’s higher than the building of the stack. If it’s not, we continuously pop buildings off the stack until the building on the top of the stack is shorter than building i. Since each building is pushed on and popped off the stack at most once, this results in an amortized constant time check for each building i. We repeat this linear-time procedure twice, one in each direction of the array, to obtain the left and right rectangle indices for each building in the array.

    At the end, we can calculate the largest rectangle by iteratively taking the difference in matching indices in both the left and right indices table and multiplying it by the height of the building.

    # largest_rectangle :: [Int] -> Int
    def largest_rectangle(arr):
    
        stack = []
    
        left_indices = []
        right_indices = []
    
        for i  in range(len(arr)):
            while stack and arr[stack[-1]] >= arr[i]:
                stack.pop()
    
            # if the stack is empty, it means we've extended the rectangle
            # all the way to the leftmost building
            # the left boundary index is set to -1, which means it includes building 0
    
            left_indices.append(-1 if not stack else stack[-1])
            stack.append(i)
    
        # empty the stack for the right pass
    
        stack = []
    
        for i in range(len(arr) - 1, -1, -1):
            while stack and arr[stack[-1]] >= arr[i]:
                stack.pop()
    
            # if the stack is empty, it means we've extended the rectangle
            # all the way to the rightmost building
            # the right boundary index is set to len(arr), which means it includes building len(arr) - 1
    
            right_indices = [len(arr) if not stack else stack[-1]] + right_indices
            stack.append(i)
    
        max_area = 0
    
        # arr[i] is the height of the current building
        # (right_indices[i] - left_indices[i] - 1) is the width
    
        for i in range(len(arr)):
            max_area = max(max_area, arr[i] * (right_indices[i] - left_indices[i] - 1))
    
        return max_area
    

    Now, a different question:

    Given a 2-dimensional m by n matrix with only 0s and 1s, calculate the area of the largest rectangle that contains only 1s.

    For example, for the matrix below:

    [1,1,0,1,0]
    [1,1,1,0,0]
    [0,1,1,1,1]
    [0,1,1,1,1]
    

    The correct answer is 8.

    You can give it a try too, before proceeding.


    Surprisingly, there is also a linear time solution for this problem.

    The insight to this problem is two-fold. We can actually make use of the technique above, with some preprocessing.

    The preprocessing step consists of calculating, for each A[i][j], the maximum “height” of a downward extension. If A[i][j] is 0, then the height is 0.

    For the matrix above, the preprocessed height table would be:

    [2,4,0,1,0]
    [1,3,3,0,0]
    [0,2,2,2,2]
    [0,1,1,1,1]
    

    This preprocessing can be done in O(m*n), or linear time, if we start iterating from the last row upwards.

    With this table, we can make use of the technique above. By feeding each row of this height table into the largest_rectangle function above, we can calculate the area of the largest rectangle whose top edge touches that row. If we do this for all the rows, we can calculate the largest possible area for the entire matrix.

    You can imagine it like this for the first row:

    [2,4,0,1,0]
     x x   x
     x x
       x
       x
    

    which yields an area of 4 corresponding the 2x2 rectangle in the upper-left corner of the matrix.

    Or the overall correct solution, which is the largest rectangle for the third row:

    [0,2,2,2,2]
       x x x x
       x x x x
    

    The full solution in Python is below:

    # largest_2d_subarray :: [[Int]] -> Int
    def largest_2d_subarray(matrix):
    
        # we initialize the table to the same size as the matrix, containing all 0s
    
        max_height_table = [ [0] * len(matrix[0]) for i in range(len(matrix)) ]
    
        # we start preprocessing from the last row and work our way upwards
    
        for i in range(len(matrix) - 1, -1, -1):
    
            # special case for last row
            # we simply copy the last row of the matrix over to the height table
            # since the height can only be 0 or 1
            if i == len(matrix) - 1:
                max_height_table[len(matrix) - 1] = row
                continue
    
            for j, column in enumerate(matrix[i]):
                if column == 0:
                    continue
                max_height_table[i][j] = max_height_table[i + 1][j] + 1
    
        # we can now feed the preprocessed table into our largest_rectangle function
    
        max_area = 0
    
        for i in range(len(matrix)):
            largest_subarray_area = largest_rectangle(max_height_table[i])
            max_area = max(max_area, largest_subarray_area)
    
        return max_area
    

    As mentioned, it takes linear time for the preprocessing. After that, it takes O(n) time to calculate the largest rectangle whose top is in contact with that row. For m rows, it takes O(m*n) time. Thus, this algorithm runs in overall linear time.

  • 0-1 Knapsack Problem in Python

    The 0-1 Knapsack problem is a variation on the knapsack problem with one condition: there is only one copy of each item. You may choose to pick it or not. Given this condition, it’s possible to iterate through the items and memoize the decisions sequentially.

    At each iteration of i and j:

    1. Check if the current item is larger than the capacity. If it is, ignore the item. In this context, ignoring the item means that for this particular combination of i and j, the maximum value is the same as the previous value of i.
    2. If the current item can fit in the knapsack, then the maximum value for this combination of i and j is the larger of the values resulting from both outcomes. table[i-1][j] represents the value if you choose to ignore the item, and table[i-1][j-weights[i-1]] + values[i-1] represents the value if you choose to put the item in your knapsack.
    
    # Returns the maximum value of the knapsack
    
    # _0_1_knapsack :: [Int] -> [Int] -> Int -> Int
    def _0_1_knapsack(values, weights, capacity):
    
        number_of_items = len(values)
        table = []
    
        for i in range(number_of_items):
            table.append([None] * capacity)
    
        for j in range(capacity):
            table[0][j] = 0
    
        for i in range(1, number_of_items):
            for j in range(capacity):
                if weights[i-1] > j:
                    table[i][j] = table[i-1][j]
                else:
                    table[i][j] = max(table[i-1][j], table[i-1][j-weights[i-1]] + values[i-1])
    
        return table[-1][-1]
    
    values  = [3,5,1,2,4,6,3,2,4]
    weights = [1,2,4,6,1,6,2,3,4]
    capacity = 10
    
    print(_0_1_knapsack(values, weights, capacity)) # 17
    
  • Iterative Tree Traversals in Python

    Tree traversals are most naturally expressed in recursion, but iterative versions are cool too, plus they take only O(1) space.

    Inorder traversal: Visit the left subtree first, then the node, and the right subtree.
    Preorder traversal: Visit the node first, then the left subtree, then the right subtree.
    Postorder traversal: Visit the left subtree, then the right subtree, then the node.

    The concept behind the iterative versions are as follows.

    There are three states a traversal can be in:

    • You’ve just visited the left or right child of a parent node.
    • You’ve just gone back to a parent node from its left child.
    • You’ve just gone back to a parent node from its right child.

    Keeping three pointers: prev to designate the previous node, curr to designate the current node, and _next to designate the next node, we can codify the above conditions like so:

    if not prev or prev.left == curr or prev.right == curr:
      # first condition
    elif curr.left == prev:
      # second condition
    else: # curr.right == prev
      # third condition
    

    With that in mind, I present the three different traversals, whose function signatures take a BTreeNode as the first argument and a function to operate on the tree nodes as the second argument.

    class BTreeNode:
        def __init__(self, data):
            self.data = data
            self.parent = None
            self.left = None
            self.right = None
    
        def __str__(self):
            return str(self.data)
    
    a = BTreeNode(6)
    b = BTreeNode(4)
    c = BTreeNode(5)
    d = BTreeNode(3)
    e = BTreeNode(2)
    f = BTreeNode(1)
    
    a.left = b
    a.right = c
    
    b.parent = a
    b.left = d
    b.right = e
    
    c.parent = a
    c.left = f
    
    d.parent = b
    
    e.parent = b
    
    f.parent = c
    
    def iterativeInOrder(root, func):
        if not root:
            return
    
        prev = None
        curr = root
        _next = None
    
        while curr:
            if not prev or prev.left == curr or prev.right == curr:
                if curr.left:
                    _next = curr.left
                else:
                    func(curr)
                    _next = curr.right if curr.right else curr.parent
    
            elif curr.left == prev:
                func(curr)
                _next = curr.right if curr.right else curr.parent
    
            else:
                _next = curr.parent
    
            prev = curr
            curr = _next
    
    def iterativePreOrder(root, func):
        if not root:
            return
    
        prev = None
        curr = root
        _next = None
    
        while curr:
            if not prev or prev.left == curr or prev.right == curr:
                func(curr)
                if curr.left:
                    _next = curr.left
                else:
                    _next = curr.right if curr.right else curr.parent
    
            elif curr.left == prev:
                _next = curr.right if curr.right else curr.parent
    
            else:
                _next = curr.parent
    
            prev = curr
            curr = _next
    
    def iterativePostOrder(root, func):
        if not root:
            return
    
        prev = None
        curr = root
        _next = None
    
        while curr:
            if not prev or prev.left == curr or prev.right == curr:
                if curr.left:
                    _next = curr.left
                elif curr.right:
                    _next = curr.right
                else:
                    func(curr)
                    _next = curr.parent
    
            elif curr.left == prev:
                if curr.right:
                    _next = curr.right
                else:
                    func(curr)
                    _next = curr.parent
    
            else:
                func(curr)
                _next = curr.parent
    
            prev = curr
            curr = _next
    
    iterativeInOrder(a, print)   # 3 4 2 6 1 5
    iterativePreOrder(a, print)  # 6 4 3 2 5 1
    iterativePostOrder(a, print) # 3 2 4 1 5 6
    
  • Jekyll Variable in SCSS Files

    You can use Jekyll site variables in top-level SCSS files. For example:

    $baseurl: '{{ site.baseurl }}';
    
    @import 'fonts';
    

    These site variables will also visible in partials that are imported after it:

    // _fonts.scss
    
    @font-face {
      src: url($baseurl + "/assets/fonts/font.eot");
      ...
    }
    
  • Custom Liquid Tags

    Coming back to Jekyll from Hugo, I’d grown accustomed to shortcodes, which are awesome to keep your Markdown source as HTML-free as possible. You can emulate shortcodes with custom liquid tags!

    In your _plugins folder, create a new file, and then write a class that inherits from Liquid::Tag. Let’s see an example below of a tag for embedding Youtube videos:

    class YoutubeTag < Liquid::Tag
      def initialize(tag_name, arg, tokens)
        super # mandatory
        # fill in some awesome codez
      end
    
      def render(context)
        <<-MARKUP.strip
        # more awesome codez
        MARKUP
      end
    end
    
    # register the tag
    Liquid::Template.register_tag('youtube', MyTag)
    

    The general thing to take note of here is that whatever appears after the tag name will be stringified and sent to the second argument of initialize, so for example, a custom tag that looks like:

    
    {% my_tag lol this is a tag %}
    
    

    args in the initialize method above will be lol this is a tag (note the trailing whitespace). There’s no further treatment, so you will have to parse the string yourself somehow to identify what is what, and then assign what you need to instance variables:

    def initialize(tag_name, arg, tokens)
      super
      @youtube_id, @width, @height = arg.split
    end
    

    The class must also implement a render method, which returns a string representation of the HTML, so in this case of Youtube embeds:

    def render(context)
      <<-MARKUP.strip
        <iframe id="ytplayer" type="text/html" width="#{@width}" height="#{@height}" src="http://www.youtube.com/embed/#{@youtube_id} frameborder="0"/>
      MARKUP
    end
    

    And voila! You can now use your Youtube tag like so:

    
    {% youtube M7lc1UVf-VE 600 400 %}
    
    

    Great success 👍