Algorithm Theory I - The Master Theorem

So far we've looked at two types of sorting algorithm: iterative algorithms like bubble sort and recursive algorithms like merge sort.

Recursive algorithms such as merge sort are examples of divide-and-conquer algorithms: they work by breaking a problem down into smaller sub-problems which can be computed individually. This gives merge sort some interesting properties:

  • It can be written as a recursive function
  • This function can be distributed across multiple systems by computing smaller sub-problems individually
  • We could cache (memoise) the results of sub-lists to improve the time efficiency of the algorithm.

Let's look at the implications of this for the time complexity of the algorithm in a bit more detail.

Merge sort splits any given list in half and then merges the resulting sorted lists. In T-notation, we can write this as T(n) = 2T(n/2) + O(n).

Merge sort also has a base case, which is when there's only one element in the list (a list with 1 element is always sorted). This always completes in O(1) time, since it does nothing except return the input.

2T(n/2) is the time taken to run the same algorithm (i.e. merge sort) on each of the two halves of the original list. O(n) is the time taken to merge the two sorted sub-lists back together; since we don't know the exact number of operations this takes, we just write it as O(n).
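For instance, the O(n) merge step can be sketched as a standalone function (the helper name `merge` is ours, not part of the merge sort listing later in these notes):

```python
# A sketch of the O(n) merge step: combine two already-sorted lists by
# repeatedly taking the smaller of the two front elements.
def merge(left, right):
  final = []
  left_i = 0
  right_i = 0
  while left_i < len(left) and right_i < len(right):
    if left[left_i] < right[right_i]:
      final.append(left[left_i])
      left_i += 1
    else:
      final.append(right[right_i])
      right_i += 1
  # One list is exhausted; the remainder of the other is already sorted.
  final.extend(left[left_i:])
  final.extend(right[right_i:])
  return final
```

Each element is appended (or extended) exactly once, which is why the merge runs in O(n) time.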

However, it's not clear how this relates to the time complexity of merge sort, which is O(n log n). We can expand the function for some value of n, say 4:

T(4) = 2T(4/2) + 4
T(4) = T(2) + T(2) + 4
T(4) = (2T(2/2) + 2) + (2T(2/2) + 2) + 4
T(4) = T(1) + T(1) + 2 + T(1) + T(1) + 2 + 4
T(4) = 1 + 1 + 2 + 1 + 1 + 2 + 4
T(4) = 12
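The expansion can also be checked by evaluating the recurrence directly. A quick sketch, assuming the merge term costs exactly n and T(1) = 1:

```python
# Evaluate T(n) = 2T(n/2) + n with T(1) = 1, for n a power of two.
def T(n):
  if n == 1:
    return 1
  return 2 * T(n // 2) + n

print(T(4))  # matches the hand expansion above
```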

But just expanding the function is not very helpful: it takes a lot of working out, and it doesn't let us arrive at the general runtime of merge sort in a rigorous fashion.

By using the Master Method, we can go directly from our original T-notation to Big O notation for recursive functions, just as we can for iterative algorithms like bubble or insertion sort.

The Master Method

It turns out we can give a general equation for any recursive algorithm where the subproblems are of the same form.

This equation is T(n) = aT(n/b) + O(n^d)

There are a few additional variables: a, b, and d.

  • a is the number of subproblems per problem (i.e. the number of times we call our recursive function from itself).
  • b is the factor by which we divide the input. It's often, but not always, the same as a.
  • d is the time complexity of the operations we complete in each problem.

To use the Master Method we must also know the base case of the problem. In merge sort's case this is O(1), as above.

Here's the Python for merge sort. What are the values of a, b, and d? What does this make the formula for T(n)?

def merge_sort(array):
  # Base case: a list with 0 or 1 elements is already sorted.
  if len(array) <= 1:
    return array

  mid = len(array) // 2
  left = merge_sort(array[:mid])
  right = merge_sort(array[mid:])

  left_i = 0
  right_i = 0
  final = []

  # Merge the two sorted halves, smallest element first.
  while left_i < len(left) and right_i < len(right):
    if left[left_i] < right[right_i]:
      final.append(left[left_i])
      left_i += 1
    else:
      final.append(right[right_i])
      right_i += 1

  # Copy across whatever remains of the unfinished half.
  while left_i < len(left):
    final.append(left[left_i])
    left_i += 1

  while right_i < len(right):
    final.append(right[right_i])
    right_i += 1

  return final

The values are:

a = 2
b = 2
d = 1
T(n) = 2T(n/2) + O(n^1)


Give a formula expressing the time complexity of each of the following Python algorithms in T notation.

def A(input):

  A(input[: len(input) // 2])
  A(input[len(input) // 2 :])

  for i in input:
    for j in input:
      i + j

  return A

def B(input):
  third = len(input) // 3

  B(input[: third])
  B(input[third : third * 2])

  return sum(input)

def C(input, target):
  if len(input) == 1:
    return target == input[0]

  mid = len(input) // 2
  left = input[:mid]
  right = input[mid:]

  if C(left, target) or C(right, target): return True

  return False

The Master Theorem

The Master Theorem involves a number of observations about the nature of recursive algorithms. It has 3 cases in which it applies and a null case in which it does not.

All three cases are to do with the relationship between the time complexity of the computation for a problem and the time complexity of its subproblems. You'll notice that each case is drawn directly from the original formula: T(n) = aT(n/b)+O(n^d)

Case 1: When the time complexity of subproblems is equal to the time complexity of the main problem.

a = b^d =>
T(n) = O(n^d logb(n))

Case 2: When the time complexity of subproblems is less than the time complexity of the main problem.

a < b^d => 
T(n) = O(n^d)

Case 3: When the time complexity of subproblems is greater than the time complexity of the main problem.

a > b^d => 
T(n) = O(n^logb(a))

Let's apply this to merge sort:

a = 2
b = 2
d = 1
T(n) = 2T(n/2) + O(n^1)

b^d = 2

a = b^d, therefore Case 1 applies.

Therefore the time complexity of Merge Sort is:

T(n) = O(n^d log n)
T(n) = O(n^1 log n)
T(n) = O(n log n)

Which is exactly what we found by looking at the algorithm intuitively in the last class!
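The three cases can also be sketched as a small Python helper (the function name `master_case` is ours, for illustration only):

```python
# Apply the Master Theorem: given a, b and d, compare a with b^d
# to pick the case, and return the complexity as a string.
def master_case(a, b, d):
  if a == b ** d:
    return f"O(n^{d} log n)"    # Case 1: a = b^d
  if a < b ** d:
    return f"O(n^{d})"          # Case 2: a < b^d
  return f"O(n^log{b}({a}))"    # Case 3: a > b^d

print(master_case(2, 2, 1))  # merge sort: Case 1
```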


For each of the functions above, write out the runtime of that algorithm in Big O notation.

Additional Resources

For a full proof of the Master Theorem and a more complex statement of its terms see Stanford CS161 Lecture 3 - The Master Theorem

Answers to Exercises

function A:
a = 2
b = 2
d = 2
T(n) = 2T(n/2) + O(n^2)
b^d = 4
a < b^d, so case 2 applies.
T(n) = O(n^d)
T(n) = O(n^2)

function B:
a = 2
b = 3
d = 1
T(n) = 2T(n/3) + O(n)
b^d = 3
a < b^d, so case 2 applies.
T(n) = O(n)

function C:
a = 2
b = 2
d = 0
T(n) = 2T(n/2) + O(1)
b^d = 1
a > b^d, so case 3 applies.
T(n) = O(n^logb(a))
T(n) = O(n^log2(2))
T(n) = O(n^1)
T(n) = O(n)
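That case 3 result for function C can be sanity-checked by evaluating its recurrence directly, assuming the per-call work costs exactly 1:

```python
# Evaluate T(n) = 2T(n/2) + 1 with T(1) = 1, for n a power of two.
def T(n):
  if n == 1:
    return 1
  return 2 * T(n // 2) + 1

# For n a power of two this solves to exactly 2n - 1, i.e. O(n).
for j in range(10):
  n = 2 ** j
  assert T(n) == 2 * n - 1
```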