Books / Memoization / Chapter 1

Introduction to Memoization

Memoization is a technique for improving the performance of a certain category of algorithms. It works by saving and reusing previously computed results rather than recomputing them.

Memoization is an optimization technique to make algorithms run faster with space-time tradeoff: it uses extra space (memory) to improve run time. Algorithms using memoization store intermediate results in an array or a map often called the memoization table. The previously computed results are retrieved later when they are needed again. This eliminates the cost of recalculation and makes the algorithm run faster.

Memoization Pseudocode

The key idea of Memoization is to remember previously computed results and reuse them.

  • When the function is called with certain arguments:
    • Lookup the memoization table (hashmap) to check if the result already exists
      • If the result exists, return it to caller and exit
      • If the result does not exist, compute it, store it in the memoization table, return it to caller and exit

Memoization Use Case

There’s a certain class of algorithms that work by breaking up a bigger problem into smaller sub-problems. They then calculate solutions to these sub-problems and combine to build up the final solution.

In such algorithms, when sub-problems overlap or reoccur, memoization can be used to store results of sub-problems to avoid calculating solution to any sub-problem twice.

If an algorithm doesn’t have overlapping sub-problems, memoization will not improve its performance.

Memoization in Action

You have probably heard of the Fibonacci numbers. It is sequence of numbers that starts at 0, 1, 1 and then each subsequent number is calculated as the sum of the two preceding numbers. So we get 0, 1, 1, 2 because (1+1=2) and then 0, 1, 1, 2, 3 because (1+2=3) and so on.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …

Let’s write a program to compute the nth Fibonacci number. We know the equation looks like this:

fib(n) = fib(n-1) + fib(n-2) for n > 1

To calculate the nth Fibonacci, we can break it into two sub-problems: fib(n-1) and fib(n-2).

For example, to calculate the 7th Fibonacci number, using the above equation, we get:

fib(7) = fib(6) + fib(5)

How do we get to fib(6) and fib(5)? We can break these up again and use recursion to calculate these:

fib(6) = fib(5) + fib(4)

and,

fib(5) = fib(4) + fib(3)

Fibonacci Numbers Using Recursion

Let’s write a simple algorithm in JavaScript to calculate the Fibonacci numbers using recursion:

function fibonacci(num) {
    if (num <= 2) { 
        return 1
    }
    
    return fibonacci(num-1) + fibonacci(num-2) // recursive call
}

let num = 7

for (let i = 1; i<=num; i++) {
    console.log("fibonacci(" + i + ") = " + fibonacci(i));
}

Our program starts at 1 and keeps calling the fibonacci(num) function in a loop to calculate Fibonacci numbers of 1 through num.

The function fibonacci(num) uses recursion to calculate the nth Fibonacci number:

fibonacci(num-1) + fibonacci(num-2)

Let’s visualize our algorithm for 7th Fibonacci number:

7th Fibonacci number tree diagram

Tree diagram for calculating the 7th Fibonacci number.

In the diagram above, we can make a key observation:

  • We can clearly see overlapping sub-problems:fib(5) and fib(4) are evaluated twice and fib(3) is evaluated 3 times (not shown in the diagram).

In other words, each call to fibonacci(num) gives birth to two more function calls and there’s a lot of repetitive calculations.

For bigger numbers, fibonacci(num) will be dreadfully slow. Computing 40th Fibonacci number will result is:

  • 8 calls to fibonacci(35)
  • 13 calls to fibonacci(34)
  • 21 calls to fibonacci(33)

Our algorithm is clearly not at all efficient. But as we’ve been learning in tutorial, it has repeated sub-problems. Let’s improve its performance using Memoization in the next section.

Fibonacci Numbers Using Recursion with Memoization

As we noted in the last section, our Fibonacci number algorithm is not very efficient. One pattern that noted is that it recalculates results to sub-problems many time. As we have learned, this is where we can apply the Memoization technique to optimize our algorithm and improve its performance.

const fibonacciWithMemoization = function(num) {
  const memoTable = []; // Memoization table for storing sub-problems / intermediate results

  function fibonacci(num) {
    if (num <= 2) {
        return 1
    }

    // check if the num exists in the memoization table
    if (memoTable[num]) {
        return memoTable[num]; // return num
    }

    memoTable[num] = fibonacci(num-1) + fibonacci(num-2) // recursive call
    return memoTable[num];
  }

  return fibonacci(num);
};

let num = 7
for (let i = 1; i<=num; i++) {
    console.log("fibonacciWithMemoization(" + i + ") = " + fibonacciWithMemoization(i));
}

Time Complexity Analysis

Run the two algorithms above to find the 38th Fibonacci number and notice the CPU time. Here are the execution times:

  • Fibonacci without Memoization: 3.24 seconds
  • Fibonacci with Memoization: 0.06 seconds

That’s a huge improvement. We algorithm with simple Memoization ran 98% faster! In fact, using Memoization, we reduced the time complexity from O(2^n) to linear time complexity O(N).

The original Fibonacci algorithm had an exponential time complexity of O(2^n). This is because each node has makes two call. So we start at 1 level, then 2, then 4 and so forth. So it’s 2*2*2... = 2^n. This is what the tree looks like for 5th Fibonacci number.

5th Fibonacci number tree diagram

With Memoization, we are reusing results from previous runs so our tree looks like this:

7th Fibonacci number tree diagram

The ones with underline are just returning memoized results. You can see that the tree is linear and hence the time complexity is O(N).

Conclusion

Memoization is an optimization technique for improving the performance of recursive algorithms without having to change the way the algorithm works. Memoized algorithms store results of smaller sub-problems and ensure that no sub-problem is calculated twice. It is very useful in the context of Dynamic Programming.


Licenses and Attributions


Speak Your Mind