算法导论Lecture 5:线性时间排序
How fast can we sort? (Depends on the sorting model: what you can do with the elements)
?
Comparison sorts: only use comparisons to determine relative order of elements:
?
quicksort - Theta(nlgn) randomized version
heapsort - Theta(nlgn)
merge sort - Theta(nlgn)
insertion sort - Theta(n^2)
?
Any comparison sorting algorithms require at least?Omega(nlgn) comparisons in the worst case: proof by decision tree model, height h?of the decision tree corresponds to the worst case time, each of the n! permutations of the input appears as some leaf, n! < 2^h, thus h = Omega(nlgn).
?
Heap sort and merge sort are asymptotically optimal comparison sorts (worst-case also have upper bound nlgn).
?
Counting sort
?
Assume each of the n input elements is an integer in the range 0 to k. If k=O(n), the sort runs in Theta(n) time.
?
Basic idea: for each input element x determine number of elements less than x.
?
Input: A[1..n], each A[i] is in the range 0 to k.
Output: B[1..n] = sorted A
Temp storage: C[0..k] for counting, k is the range.
?
COUNTING-SORT(A, B, k)for i := 0 to k // O(k) do C[i] := 0for j := 1 to length[A] //O(n) do C[A[i]] := C[A[j]] + 1for i := 1 to k // O(k) do C[i] := C[i] + C[i - 1]for j := length[A] downto 1 //O(n) do B[C[A[j]]] := A[j] C[A[j]] := C[A[j]] - 1
?
So the running time is O(n+k), if k = O(n) then O(n) time. Counting sort is only effective to integers with small range, meaning that k is small is a requirement.
?
Another important property of counting sort: Stability, which is used in the following algorithm: radix sort.
?
Stable sort preserves the relative order of equal elements.
?
[Q] Which other sorts are stable? http://en.wikipedia.org/wiki/Stable_sort#Stability
?
?
Radix?sort [Herman Hollerith -1890], effective to large range integers. Idea is to sort on the least significant digits of those integers with a stable sorting algorithm, say, the counting sort.
?
[Ex].??????? ? ?? 3 2 9?????????????????????? 7 2 0?????????????????????? 7 2 0??????????????????????3 2 9?
4 5 7?????????????????????? 3 5 5?????????????????????? 3 2 9??????????????????????3 5 5?
6 5 7?????????????????????? 4 3 6?????????????????????? 4 3 6??????????????????????4 3?6?
8 3 9????? ?? =>???????? 4 5 7???????? =>???????? 8 3 9???????? =>?????????4 5 74 3 6?????????????????????? 6 5 7?????????????????????? 3 5 5??????????????????????6 5 7?
7 2 0?????????????????????? 3 2 9?????????????????????? 4 5 7??????????????????????7 2 0?
3 5 5?????????????????????? 8 3 9?????????????????????? 6 5 7??????????????????????8 3 9?
?
Proof the correctness: use induction on digit t.
?
1. assume by induction the array is sorted on low-order t-1 digits.
2. sort on digit t:
??? - if two elements have same digits on the position t, after sorting?they have
??????the same order?? because of stability.
??? - if no same digits, they are just?sorted.
So the induction hypothesis is true.
?
In practice, integers or other stuffs in computer world are binary bits, so group them as digits, for example, 32 bits integers can be grouped as four bits digits or bytes, etc.
?
Analysis:?
?
- if using counting sort as the stability requirement, we need O(k+n) for each round (digit).
- suppose sorting n integers, each is b bits, so the range is [0, 2^b - 1].
- split integer into b/r "digits", each r bits, thus each digit ranges in [0, 2^r - 1], k?= 2^b.
??The counting sort runs b/r rounds.
?
Time: O(b/r * (n + k))
???? = O(b/r * (n + 2^r))
?
so r is choosed?properly to minimize the time: r = lgn? =>
Time: O(bn/lgn).