2.3-1 Using Figure 2.4 as a model, illustrate the operation of merge sort on the array A=<3,41,52,26,38,57,9,49>

Solution:

The array is halved down to single elements, and the halves are then merged pairwise back up:

<3> <41> <52> <26> <38> <57> <9> <49>
<3,41> <26,52> <38,57> <9,49>
<3,26,41,52> <9,38,49,57>
<3,9,26,38,41,49,52,57>
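As a cross-check, here is a minimal Python sketch (not the book's pseudocode) that prints each merged run while sorting the exercise's array. Note that it prints merges in recursion order, left subtree first, rather than level by level as Figure 2.4 draws them.

```python
def merge_sort(a):
    """Top-down merge sort; prints each merged run as it is produced."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]   # one side is empty; copy the remainder
    print(merged)
    return merged

merge_sort([3, 41, 52, 26, 38, 57, 9, 49])
```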

2.3-2 Rewrite the MERGE procedure so that it does not use sentinels, instead stopping once either array L or R has had all its elements copied back to A and then copying the remainder of the other array back into A.

Solution:

    MERGE(A, p, q, r)
        n1 = q - p + 1
        n2 = r - q
        create arrays L[1..n1] and R[1..n2]
        for i = 1 to n1
            L[i] = A[p + i - 1]
        for j = 1 to n2
            R[j] = A[q + j]
        i = 1
        j = 1
        for k = p to r
            if i > n1
                A[k] = R[j]
                j = j + 1
            elseif j > n2
                A[k] = L[i]
                i = i + 1
            elseif L[i] ≤ R[j]
                A[k] = L[i]
                i = i + 1
            else
                A[k] = R[j]
                j = j + 1
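A direct Python translation of the sentinel-free procedure (a sketch; indices are 0-based here, unlike the 1-based pseudocode) shows the two "exhausted side" branches doing the remainder copy:

```python
def merge(A, p, q, r):
    """Sentinel-free MERGE: merges sorted A[p..q] and A[q+1..r] in place.

    p, q, r are 0-based indices into A.
    """
    L = A[p:q + 1]
    R = A[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        if i >= len(L):          # L exhausted: copy the remainder of R
            A[k] = R[j]; j += 1
        elif j >= len(R):        # R exhausted: copy the remainder of L
            A[k] = L[i]; i += 1
        elif L[i] <= R[j]:
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1

A = [2, 4, 5, 7, 1, 2, 3, 6]
merge(A, 0, 3, 7)
```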

2.3-3 Use mathematical induction to show that when n is an exact power of 2, the solution of the recurrence

$$T(n) = \begin{cases}
2 & \text{if } n = 2, \\
2T(n/2) + n & \text{if } n = 2^k, \text{for } k > 1.
\end{cases}$$

is $$T(n) = n\lg{n}$$

Solution:

First, let's establish a function on which we're going to perform induction:


$$F(k) = T(2^k)$$

We want to prove that:

$$F(k) = 2^k \lg{2^k}$$

**Base.** It is simple enough:

$$F(1) = T(2) = 2 = 2\lg2 = 2^1\lg{2^1}$$

**Step.** We assume that:

$$F(k) = 2^k \lg{2^k}$$

We prove it for k+1:

$$\begin{align}
F(k + 1) & = T(2^{k+1})= 2T(\frac{2^{k+1}}{2}) + 2^{k+1} = \\
& = 2T(2^k) + 2^{k+1} = 2 \cdot 2^k \lg{2^k} + 2^{k+1} = \\
& = 2^{k+1}(\lg{2^k} + 1) = 2^{k+1}(\lg{2^k} + \lg2) = \\
& = 2^{k+1}\lg{2^{k+1}}
\end{align}$$
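The induction can also be checked numerically. A small Python sketch evaluates the recurrence directly and compares it against n lg n for the first few powers of 2:

```python
def T(n):
    """The recurrence: T(2) = 2, T(n) = 2*T(n/2) + n for n = 2^k, k > 1."""
    return 2 if n == 2 else 2 * T(n // 2) + n

for k in range(1, 11):
    n = 2 ** k
    assert T(n) == n * k  # n * lg(n), since lg(2^k) = k
```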

2.3-4 We can express insertion sort as a recursive procedure as follows. In order to sort A[1..n], we recursively sort A[1..n−1] and then insert A[n] into the sorted array A[1..n−1]. Write a recurrence for the running time of this recursive version of insertion sort.

Solution:

There are two steps in this recursive sorting algorithm:

- Sort the sub-array A[1..n−1]
- Insert A[n] into the sorted sub-array from step 1 in its proper position

For n>1, step 1 again calls for the recursion for n−1 and step 2 runs in Θ(n) time.

So, we can write the recurrence as:
$$T(n) = \begin{cases}
\Theta(1) & \text{if } n = 1, \\
T(n - 1) + \Theta(n) & \text{if } n > 1
\end{cases}$$
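A direct Python transcription of this recursive scheme (a sketch; the exercise gives no pseudocode for it) makes the two steps in the recurrence visible:

```python
def recursive_insertion_sort(A, n=None):
    """Sort A[0..n-1]: recursively sort the first n-1 elements,
    then insert A[n-1] into the sorted prefix."""
    if n is None:
        n = len(A)
    if n <= 1:                           # base case: T(1) = Theta(1)
        return
    recursive_insertion_sort(A, n - 1)   # step 1: T(n - 1)
    key = A[n - 1]                       # step 2: Theta(n) worst-case insertion
    i = n - 2
    while i >= 0 and A[i] > key:
        A[i + 1] = A[i]
        i -= 1
    A[i + 1] = key

A = [5, 2, 4, 6, 1, 3]
recursive_insertion_sort(A)
```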


Solution to the Recurrence: Let us assume that for n=1, $T(n)=c_1$, where $c_1$ is some constant. And on average for n>1, inserting an element in its proper position in a sorted array requires shifting half of the elements, i.e. $c_2n/2+c_3$ time ($c_2n/2$ for shifting the elements and $c_3$ for inserting the element).

So, we can rewrite the recurrence as:

$$T(n) = \begin{cases}
c_1 & \text{if } n = 1, \\
T(n - 1) + c_2(n - 1)/2 + c_3 & \text{if } n > 1
\end{cases}$$

So for any general n>1,

$$\begin{align}
T(n) & = T(n - 1) + c_2(n - 1)/2 + c_3 \\
& = T(n - 2) + c_2(n - 2)/2 + c_3 + \{c_2(n - 1)/2 + c_3\} \\
& = T(1) + \cdots + \{c_2(n - 2)/2 + c_3\} + \{c_2(n - 1)/2 + c_3\} \\
& = c_1 + \frac{c_2}{2} \{1 + 2 + \cdots + (n - 1)\} + c_3(n - 1) \\
& = c_1 + \frac{c_2}{2} \cdot \frac{n(n - 1)}{2} + c_3(n - 1) \\
& = \Theta(n^2)
\end{align}$$
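The closed form can be sanity-checked by evaluating the recurrence with exact rational arithmetic; the constants below are arbitrary sample values chosen just for illustration:

```python
from fractions import Fraction

c1, c2, c3 = Fraction(1), Fraction(3), Fraction(2)  # arbitrary sample constants

def T(n):
    """Evaluate T(n) = T(n-1) + c2*(n-1)/2 + c3, T(1) = c1, exactly."""
    return c1 if n == 1 else T(n - 1) + c2 * (n - 1) / 2 + c3

for n in range(1, 50):
    closed_form = c1 + c2 / 2 * Fraction(n * (n - 1), 2) + c3 * (n - 1)
    assert T(n) == closed_form
```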


2.3-5 Referring back to the searching problem (see Exercise 2.1-3), observe that if the sequence A is sorted, we can check the midpoint of the sequence against v and eliminate half of the sequence from further consideration. The binary search algorithm repeats this procedure, halving the size of the remaining portion of the sequence each time. Write pseudocode, either iterative or recursive, for binary search. Argue that the worst-case running time of binary search is Θ(lgn).

Solution:

Pseudo-code:

    ITERATIVE-BINARY-SEARCH(A, v)
        low = 1
        high = A.length
        while low ≤ high
            mid = ⌊(low + high) / 2⌋
            if v == A[mid]
                return mid
            elseif v > A[mid]
                low = mid + 1
            else
                high = mid - 1
        return NIL

    RECURSIVE-BINARY-SEARCH(A, v, low, high)
        if low > high
            return NIL
        mid = ⌊(low + high) / 2⌋
        if v == A[mid]
            return mid
        elseif v > A[mid]
            return RECURSIVE-BINARY-SEARCH(A, v, mid + 1, high)
        else
            return RECURSIVE-BINARY-SEARCH(A, v, low, mid - 1)

Intuitively, in the worst case, i.e. when v is not present in A at all, we keep halving the remaining portion until it is empty and then return NIL. So the worst-case running time is the number of times the input size can be divided by 2, i.e. lgn.

And using a recurrence, we can say the running time is T(n)=T(n/2)+Θ(1)=Θ(lgn).
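The iterative pseudocode translates almost line for line into Python (a sketch, 0-indexed, returning None in place of NIL):

```python
def binary_search(a, v):
    """Iterative binary search on a sorted list; returns an index of v or None.

    Each iteration halves the interval [low, high], so the loop runs
    at most about lg(n) + 1 times.
    """
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2
        if a[mid] == v:
            return mid
        elif v > a[mid]:
            low = mid + 1
        else:
            high = mid - 1
    return None
```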

2.3-6 Observe that the while loop of lines 5–7 of the INSERTION-SORT procedure in Section 2.1 uses a linear search to scan (backward) through the sorted subarray A[1..j−1]. Can we use a binary search (see Exercise 2.3-5) instead to improve the overall worst-case running time of insertion sort to Θ(nlgn)?

Solution:

Insertion sort has a loop like this

    while i > 0 and A[i] > key
        A[i + 1] = A[i]
        i = i - 1

This loop serves two purposes:

- A linear search to scan (backward) through the sorted sub-array to find the proper position for key.
- Shift the elements greater than key towards the end to insert key in the proper position.

Binary search can replace the first purpose, reducing the number of comparisons needed to find the proper position from Θ(j) to Θ(lg j). But it cannot avoid the second: the elements greater than key must still be shifted, which takes Θ(j) time per insertion in the worst case. So the overall worst-case running time remains Θ(n^2).

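To make this concrete, here is a sketch of "binary insertion sort" in Python using the standard-library bisect module: the position is found in Θ(lg j) comparisons, but the slice assignment still shifts up to j elements, so the overall worst case does not improve.

```python
from bisect import bisect_right

def binary_insertion_sort(A):
    """Insertion sort with binary search for the insertion point.

    bisect_right is O(lg j), but the slice assignment below shifts
    up to j elements, so the worst case is still Theta(n^2).
    """
    for j in range(1, len(A)):
        key = A[j]
        pos = bisect_right(A, key, 0, j)  # binary search in sorted prefix A[0..j-1]
        A[pos + 1:j + 1] = A[pos:j]       # the Theta(j) shift binary search cannot avoid
        A[pos] = key

A = [31, 41, 59, 26, 41, 58]
binary_insertion_sort(A)
```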
2.3-7 Describe a Θ(nlgn)-time algorithm that, given a set S of n integers and another integer x, determines whether or not there exist two elements in S whose sum is exactly x.

Solution:

First we sort S. Afterwards, we iterate it and for each element i we perform a binary search to see if there is an element equal to x−i. If one is found, the algorithm returns true. Otherwise, it returns false.

We iterate over n elements and perform a binary search for each one in an n-sized array, which leads to Θ(nlgn) time. Sorting first (with merge sort) introduces another Θ(nlgn) term, which changes the constant factors but not the asymptotic running time.


    PAIR-EXISTS(S, x)
        A = MERGE-SORT(S)
        for i = 1 to A.length
            if BINARY-SEARCH(A, x - A[i]) != NIL
                return true
        return false
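A Python sketch of the same idea, with sorted and bisect standing in for MERGE-SORT and BINARY-SEARCH. One refinement over the pseudocode above: the index check prevents an element from being paired with itself when x − A[i] equals A[i].

```python
import bisect

def pair_exists(s, x):
    """Return True if two distinct elements of s sum to exactly x.

    Sort in Theta(n lg n), then do n binary searches of Theta(lg n) each.
    """
    a = sorted(s)
    for i, v in enumerate(a):
        target = x - v
        j = bisect.bisect_left(a, target)
        if j < len(a) and a[j] == target and j != i:
            return True
        # if target == v, a duplicate sitting right after index i also qualifies
        if j == i and j + 1 < len(a) and a[j + 1] == target:
            return True
    return False
```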