Books / Sorting Algorithms in Computer Science / Heap Sort
Heap Sort
Overview of Heap Sort
The idea underlying heapsort is to convert the array to be sorted into a maximum heap in O(n) steps, and to use this heap to sort the data efficiently. A maximum heap (or max-heap) is a heap in which the heap order property is reversed - each node is larger than its children, not smaller. Having converted the array into a maximum heap, the algorithm repeatedly swaps the maximum element with the one at the end of the working array, decrements the size of the working array by one element afterward. This way, the next swap is with the element that immediately precedes the one just swapped.
Heap Sort Algorithm
The heap sort algorithm builds the maximum heap, and then pretends to delete the maximum element by swapping it with the element in the last position, and decrementing the variable that defines where the heap ends in the array, effectively treating the largest element as not being part of the heap anymore.
The Heapsort algorithm involves preparing the list by first turning it into a max heap. The algorithm then repeatedly swaps the first value of the list with the last value, decreasing the range of values considered in the heap operation by one, and sifting the new first value into its position in the heap. This repeats until the range of considered values is one value in length.
The steps are:
- Call the buildMaxHeap() function on the list. Also referred to as heapify(), this builds a heap from a list in O(n) operations.
- Swap the first element of the list with the final element. Decrease the considered range of the list by one.
- Call the siftDown() function on the list to sift the new first element to its appropriate index in the heap.
- Go to step (2) unless the considered range of the list is one element.
The buildMaxHeap()
operation is run once, and is O(n) in performance. The siftDown()
function is O(log n), and is called n times. Therefore, the performance of this algorithm is O(n + n log n) = O(n log n).
procedure heapsort(a, count) is input: an unordered array a of length count (Build the heap in array a so that largest value is at the root) heapify(a, count) (The following loop maintains the invariants that a[0:end] is a heap and every element beyond end is greater than everything before it (so a[end:count] is in sorted order)) end ← count - 1 while end > 0 do (a[0] is the root and largest value. The swap moves it in front of the sorted elements.) swap(a[end], a[0]) (the heap size is reduced by one) end ← end - 1 (the swap ruined the heap property, so restore it) siftDown(a, 0, end)
The sorting routine uses two subroutines, heapify
and siftDown
. The former is the common in-place heap construction routine, while the latter is a common subroutine for implementing heapify
.
(Put elements of 'a' in heap order, in-place) procedure heapify(a, count) is (start is assigned the index in 'a' of the last parent node) (the last element in a 0-based array is at index count-1; find the parent of that element) start ← iParent(count-1) while start ≥ 0 do (sift down the node at index 'start' to the proper place such that all nodes below the start index are in heap order) siftDown(a, start, count - 1) (go to the next parent node) start ← start - 1 (after sifting down the root all nodes/elements are in heap order) (Repair the heap whose root element is at index 'start', assuming the heaps rooted at its children are valid) procedure siftDown(a, start, end) is root ← start while iLeftChild(root) ≤ end do (While the root has at least one child) child ← iLeftChild(root) (Left child of root) swap ← root (Keeps track of child to swap with) if a[swap] < a[child] then swap ← child (If there is a right child and that child is greater) if child+1 ≤ end and a[swap] < a[child+1] then swap ← child + 1 if swap = root then (The root holds the largest element. Since we assume the heaps rooted at the children are valid, this means that we are done.) return else swap(a[root], a[swap]) root ← swap (repeat to continue sifting down the child now)
Analysis of Heap Sort (Time and Space Complexity)
Experiments have shown that heapsort is consistent and averages only slightly fewer comparisons than its worst case.
- Worst case time complexity: O(N Log N)
- Average performance: O(N Log N)
- Space complexity: O(1)
More details are below.
Building the heap takes O(n) steps (to be precise, 2n steps. In the second stage of the algorithm, the swap takes constant time, but the siftDown
step, in the worst case, will require a number of comparisons and moves in proportion to the current height of the heap. The \(j^{\text {th }}\) iteration uses at most \(2 \cdot \log (N-j+1)\) comparisons. We have
It can be proved that \(\log (N !)=\Theta(N \log N){ }^{2}\), so heapsort in the worst case has a running time that is \(\Theta(N \log N)\). A more accurate analysis will show that the most number of comparisons is \(2 N \log N-O(N)\).
Common Heap Sort Questions
1. Is heap sort in-place sorting algorithm?
Yes. Heap sort is an in-place sorting algorithm requiring O(1) space complexity.
2. Is heap sort adaptive?
No. In adaptive algorithms, the complexity (running time, performance) changes based on whether or not the input array is sorted. They perform better when the input is sorted. Heap sort is not adaptive, but a variation called adaptive heap sort exists.
3. Is heap sort stable?
No. Stable means that if two elements have the same value, they remain in the same position. Heap sort is not stable. Because actions in the heap might modify the relative order of value, heap sort is not stable.
4. Is heap sort divide and conquer?
No. A divide and conquer algorithm continuously breaks down a bigger problem into sub-problems, until they become simple enough to be solved directly. Heap sort is not divide and conquer because it doesn’t break down the sub-divide the input array into smaller arrays.