#include <stdio.h>

void quick_sort(int *a, int n) {
    int i, j, p, t;
    if (n < 2)
        return;
    p = a[n / 2];                       /* pivot: middle element */
    for (i = 0, j = n - 1;; i++, j--) {
        while (a[i] < p)                /* scan right for an element >= pivot */
            i++;
        while (p < a[j])                /* scan left for an element <= pivot */
            j--;
        if (i >= j)                     /* pointers crossed: partition done */
            break;
        t = a[i];                       /* swap the out-of-place pair */
        a[i] = a[j];
        a[j] = t;
    }
    quick_sort(a, i);                   /* recurse on the two partitions */
    quick_sort(a + i, n - i);
}
It is not hard to fix the problem by counting how many elements are equal to the pivot value: just return early when all the numbers in the partition turn out to be the same.
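Here is a minimal sketch of that idea (my own variant, not necessarily the original fix; the name quick_sort2 is hypothetical): an extra O(n) pre-scan counts the copies of the pivot, and the recursion stops as soon as a partition holds only one value.

void quick_sort2(int *a, int n) {
    int i, j, p, t, same = 0;
    if (n < 2)
        return;
    p = a[n / 2];
    for (i = 0; i < n; i++)             /* count copies of the pivot */
        if (a[i] == p)
            same++;
    if (same == n)                      /* every value equals the pivot: */
        return;                         /* already sorted, stop here     */
    for (i = 0, j = n - 1;; i++, j--) { /* same Hoare partition as above */
        while (a[i] < p)
            i++;
        while (p < a[j])
            j--;
        if (i >= j)
            break;
        t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
    quick_sort2(a, i);
    quick_sort2(a + i, n - i);
}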
When the data set is huge and the partitions get smaller after a few calls of quick_sort, there is a high probability that all the values in one small partition are the same. Some situations are even worse, like sorting all the people in the USA by age. If the input is an array of records in a database system, doing useless memory accesses/copies at n log(n) scale is really costly!
Human ages are values from 1 to 100 or so. Blindly using the quick_sort algorithm just eats CPU and memory for no good reason. In many cases the keys to be sorted have only a few unique values, like age. In those cases, instead of an n log(n) quicksort, a hybrid hash/count sort can do a much better job in one or a few scans: first extract the unique keys from the input, sort/hash them, then do a final scan that counts or rearranges the inputs. Depending on the situation, this can be optimized down to a single scan, as in the sketch below.
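For a dense key like age, the "extract and sort the unique keys" step collapses into a plain counting array, so the count-sort half can be sketched as follows (the person record, the 0..130 age range, and the function name are my assumptions for illustration):

#define MAX_AGE 131                     /* assumed key range: ages 0..130 */

typedef struct {                        /* hypothetical database record   */
    int  age;                           /* the sort key                   */
    char name[32];
} person;

/* Stable count sort on a small-range key: one scan to build the
 * histogram, one scan to place each record in its output slot.   */
void count_sort_by_age(const person *in, person *out, int n) {
    int count[MAX_AGE] = {0};
    int start[MAX_AGE];
    int i, k, pos = 0;

    for (i = 0; i < n; i++)             /* scan 1: count each key value   */
        count[in[i].age]++;

    for (k = 0; k < MAX_AGE; k++) {     /* prefix sums: first slot per key */
        start[k] = pos;
        pos += count[k];
    }

    for (i = 0; i < n; i++)             /* scan 2: stable rearrangement   */
        out[start[in[i].age]++] = in[i];
}

If the keys only need to be counted (say, a histogram of ages), the second scan disappears entirely, which is the single-scan optimization mentioned above.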