Recursive Algorithms - The Dictionary Search Problem

Suppose we are given a problem to find a word in a dictionary. This is known as a dictionary search. Suppose the word is “yeoman”. You could start at the first page and look for it and then try the second page. then the third, and so on, until finally you reach the page that contains the word. This is called a sequential search. Of course no one in their right mind would do this because everyone knows that dictionaries are sorted alphabetically, and that therefore there is a faster way that takes advantage of the fact that they are sorted. A dictionary by definition has the property that the words are listed in it in alphabetical order, which in computer science means it is a sorted list.

One more efficient solution might be to use binary search, which is described by the following recursive algorithm:

if ( the set of pages is just one page ) {
    scan the page for the word
} else {
    open the dictionary to the page halfway between the first and last page of the set of pages being searched;
    compare the word to the word at the top of the page;

    if ( the word precedes the index word on the page alphabetically ) {
        search the lower half of the set of pages using this same algorithm
    } else {
        search the upper half of the set of pages using this same algorithm

The recursion in this algorithm occurs in lines search the lower half... and lines search the upper half... search the upper half, in which the instructions state that we must use this algorithm again, but on a smaller set of pages. The algorithm basically reduces the problem to one in which it compares the word being sought to a single word, and if it is “smaller”, it looks for the word in the first half, and if it is larger, it looks in the second half. When it does this “looking again”, it repeats this exact logic. This approach to solving a problem by dividing it into smaller identical problems and using the results of conquering the smaller problems to conquer the large one is called divide-and-conquer.

Divide-and-conquer is a problem-solving paradigm, meaning it is a model for solving many different types of problems.

The binary search algorithm will eventually stop because each time it checks to see how many pages are in the set, and if the set contains just one page, it does not do the “recursive part” but instead scans the page, which takes a finite amount of time. It is easier to see this if we write it as a pseudo-code function:

void binary_search(dictionary_type dictionary, word keyword) {
  if (the set of pages in dictionary has size = 1) {
    // This is called the base case
    scan the page
    for keyword;
    if (keyword is on the page)
      print keyword's definition;
      print "not in dictionary";
  } else { // the size > 1
    let middlepage be the page midway between the first and 
                last pages of the set of pages being searched; // e.g middlepage=(first+last)/2
    compare keyword to indexword at the top of middlepage;
    if (keyword < indexword)
      // recursive invocation of function
      binary_search(lower half of dictionary, keyword);
      // recursive invocation of function
      binary_search(upper half of dictionary including middlepage, keyword);
  1. This function calls itself recursively in two places.
  2. When it calls itself recursively, the size of the set of pages passed as an argument is at most one-half the size of the original set.
  3. When the size of the set is 1, the function terminates without making a recursive call. This is called the base case of the recursion.
  4. Since each call either results in a recursive call on a smaller set or it terminates without making a recursive call, the function must eventually terminate in the base case code.
  5. If the keyword is on the page checked in the base case, the algorithm will print its definition, otherwise it will say it is not there. This is not a formal proof that it is correct!

If a recursive function has these two properties:

  • that each of its recursive calls diminishes the problem size, and
  • that the function takes a finite number of steps when the problem size is less than some fixed constant size,

then it is guaranteed to terminate eventually. Later we will see a more general rule for guaranteeing that a recursive function terminates.

Licenses and Attributions

Speak Your Mind