Hello everyone. In this lesson of the Data Structures and Algorithms course, we will make a summary of Chapter 8, internal sorting, mainly about the time cost of these sorting algorithms.

First, we review the time cost of the simple sorting algorithms. Then we analyse and compare the sorting algorithms, both theoretically and experimentally. Finally, we learn about the lower bound for sorting.

The essence of sorting is that, given a random input sequence to be sorted, we swap the records among them that are in reverse order. When all the records in reverse order have been dealt with, the algorithm is finished.

The simple algorithms work by handling each record and its neighbour one by one, so they are inefficient. Generally speaking, their time cost is that of a two-layer loop over the array, which means Θ(n^2).

Now let us look at the time cost of these simple methods. They are all Θ(n^2) in the average and worst cases. For direct insertion sort, the best case is Θ(n), so it is a good algorithm, and it is stable. In many practical sorting applications, direct insertion sort is used as a subroutine to handle short sequences.

The performance of bubble sort is very poor. Although its best case, when the sequence is almost in order, is also Θ(n), it has a very large constant factor; we will see later in the experiments that its performance is very bad.

For direct selection sort, even the best case is Θ(n^2), so it is not a good algorithm, and it is not stable.

Then we look at the relatively better algorithms. Quicksort, mergesort and heapsort all take Θ(n log n) time in the average case. For binsort and radix sort, the time performance is also good, since the cost looks almost linear and is related to the problem size n. The factor d is the number of parts we divide each key into, so the cost is Θ(d(n + r)). Since the radix r is much smaller than n, we can regard the cost as d·n, and d is very small. However, as we will see later, radix sort is not inherently better than the Θ(n log n) methods; they are of the same order of magnitude.
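Since direct insertion sort keeps coming back as the building block for short sequences, here is a minimal sketch of it in C++ (the function name and the int element type are my own illustration, not the course's code):

```cpp
#include <cstddef>
#include <vector>

// Direct insertion sort: best case Θ(n) on already-sorted input,
// average and worst case Θ(n^2); stable, because equal keys never
// move past each other.
void insertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];                       // the record to be inserted
        std::size_t j = i;
        while (j > 0 && a[j - 1] > key) {     // strict '>' keeps equal keys in order
            a[j] = a[j - 1];                  // shift larger records one place right
            --j;
        }
        a[j] = key;                           // drop the record into its position
    }
}
```

The strict comparison a[j - 1] > key is what makes it stable, a point we return to in the questions at the end of the lesson.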
Now consider the auxiliary space cost of these sorting algorithms. Mergesort, binsort and radix sort all use relatively large space: basically, they need an auxiliary array as large as the array to be sorted, and binsort and radix sort additionally need space for the bins. For quicksort, the auxiliary space is not small either. Although it is a recursive implementation and we do not allocate a new array, the recursion stack also needs space, and in the worst case we must reserve Θ(n) stack space; Θ(log n) is only the average cost.

About stability: among these relatively better algorithms, mergesort, binsort and radix sort are stable; the others are not.

Now we make a brief summary of these algorithms. For very small n, or when the data is almost in order, direct insertion sort is a very good algorithm; it is therefore often combined with the better algorithms in practical applications. For shellsort, we should in particular avoid the increment sequence obtained by repeatedly dividing by 2; dividing by 3 works better. Considered comprehensively, quicksort is certainly the best algorithm, so it is the most famous one, and we use it in many cases.

Now we use a set of data to test these algorithms. The test environment uses hardware and software that are already a little old. We start by producing the test data with a random function, which lets us produce a random array, and from that we can produce all kinds of arrays according to our needs.

Now we look at the timing test. First we read the clock on this computer, then we run the sorting algorithm; after sorting we read the clock again, and the difference of the two readings is taken as the time cost of the sorting process.

We also have a framework for timing the sorting algorithms. At the beginning we read the clock, then we sort the array. You can see that we divide the whole process into several groups. The reason is that, for a short sequence, the difference of the two clock readings may be almost 0, so we cannot get the result we want: the time read before sorting and the time read after sorting are the same, and we cannot see any difference.
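A minimal sketch of that timing framework, assuming std::chrono and std::rand as stand-ins for the (older) clock and random functions used in the course:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Produce one random test array of n records.
std::vector<int> randomArray(std::size_t n) {
    std::vector<int> a(n);
    for (int& x : a) x = std::rand();
    return a;
}

// Average time, in seconds, that sortFn needs for one array of size n.
// Several groups of the same scale are sorted, so that for small n the
// clock difference does not round down to zero.
template <typename SortFn>
double timeSort(SortFn sortFn, std::size_t n, int groups) {
    std::vector<std::vector<int>> data;
    for (int g = 0; g < groups; ++g) data.push_back(randomArray(n));

    auto start = std::chrono::steady_clock::now();    // read the clock
    for (std::vector<int>& a : data) sortFn(a);       // sort every group
    auto end = std::chrono::steady_clock::now();      // read the clock again

    std::chrono::duration<double> elapsed = end - start;
    return elapsed.count() / groups;                  // divide by the number of groups
}
```

For example, timeSort(insertionSort, 1000, 100) would report the average time for one group of 1 000 records, using the insertionSort sketched earlier.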
For such cases we test the small-scale data repeatedly: we take the difference of the two clock readings and divide it by the number of groups of that scale, and that gives the measured result.

OK, so that is our testing framework, and it is very simple. First we record a time point, then we sort several groups of data of the same scale. After sorting we read the clock again and divide the difference of the two readings by the number of groups; this gives the time cost of our sorting method for this scale of data.

Now let us look at the experimental results. For data with a scale of ten records to be sorted, or one hundred, or one thousand, that is, cases of such a small scale, we use the method just described.

We can see from the results that, for small scales, direct insertion sort performs best. For data that is almost in order, or in the special case where the data is already in order, its performance is very good. That is why it is often combined with other methods and used to handle small-scale data.

Over the whole set of tests, bubble sort and selection sort are relatively worse. Interestingly, for the already-ordered case, selection sort is worse than bubble sort: bubble sort only needs one scan when the input is in order, whereas selection sort keeps going round after round, so it is very bad for the ordered case. But for general random cases, bubble sort is much worse than selection sort and, for small scales, much worse than direct insertion sort as well.

Comparing the shellsort variants whose increments are obtained by dividing by 2 and by 3, the increments divided by 2 are obviously much worse.

Quicksort performs very well; it is obviously a good algorithm. Moreover, we optimised quicksort: for small-scale subsequences we use direct insertion sort instead of recursion, and the performance becomes even better. In our set of test data, for subsequences whose length is less than 28 we do not recurse but handle them with direct insertion sort, and that gives the best performance.
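A minimal sketch of this optimisation, assuming the cutoff of 28 found in the course's experiment (the best value may differ on other machines) and using a middle-element pivot purely for illustration:

```cpp
#include <utility>
#include <vector>

const int CUTOFF = 28;   // assumed threshold from the experiment

// Direct insertion sort on a[left..right]; used for short subsequences.
void insertionSortRange(std::vector<int>& a, int left, int right) {
    for (int i = left + 1; i <= right; ++i) {
        int key = a[i], j = i;
        while (j > left && a[j - 1] > key) { a[j] = a[j - 1]; --j; }
        a[j] = key;
    }
}

// Quicksort that stops recursing below CUTOFF and finishes those
// subsequences with insertion sort.
void hybridQuickSort(std::vector<int>& a, int left, int right) {
    if (right - left + 1 <= CUTOFF) {
        insertionSortRange(a, left, right);       // small subsequence: no recursion
        return;
    }
    int pivot = a[left + (right - left) / 2];     // middle element as pivot (one common choice)
    int i = left, j = right;
    while (i <= j) {                              // partition around the pivot value
        while (a[i] < pivot) ++i;
        while (a[j] > pivot) --j;
        if (i <= j) { std::swap(a[i], a[j]); ++i; --j; }
    }
    if (left < j) hybridQuickSort(a, left, j);
    if (i < right) hybridQuickSort(a, i, right);
}
```

Calling hybridQuickSort(a, 0, a.size() - 1) on a non-empty array sorts the whole sequence.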
It is similar for mergesort: we likewise avoid recursion for small-scale subsequences.

Now we look at the performance of radix sort. We can see that the performance differs with different radixes. On our 64-bit computer, when we sort such long integers, using 32 as the radix gives the best performance.

Now we analyse the lower bound of the sorting problem. As we know, a sorting algorithm has to scan the data at least once, so it spends Ω(n) time just looking at the data. Can we actually reach this lower bound of Ω(n)? In fact we cannot. We already know that the upper bound is O(n log n); now we look at the lower bound, which we can establish with a decision tree: the lower bound is Ω(n log n).

We can use a decision tree to examine the sorting problem. For a sorting problem on, say, three strictly different data items, the number of possible sorted results is the number of their permutations. A comparison-based sorting algorithm first compares A and B, and there are two outcomes: A is smaller than B, or B is smaller than A. So the root splits into two child nodes, and the comparison continues. If A is smaller than B, we next ask whether A is also smaller than C. If A is smaller than C, we are left with a certain group of possible results. Then we compare B and C, and finally we reach a definite result at a leaf node. The analysis is similar for the other nodes.

We can now see that each leaf node corresponds to one possible ordering of the data items; all the possible results appear exactly at the leaf nodes. The tree enumerates the permutations, so the number of leaf nodes is n!, the factorial of n. By the properties of binary trees, the depth of such a tree is at least log(n!), whose order of magnitude is n log n. So in this decision tree, even at its smallest possible depth, the worst case needs at least log(n!) comparisons, which means Ω(n log n).
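To see why log(n!) is of order n log n, here is a short standard bound (not from the lecture slides):

```latex
\log_2(n!) \;=\; \sum_{i=1}^{n}\log_2 i \;\le\; n\log_2 n,
\qquad
\log_2(n!) \;\ge\; \sum_{i=\lceil n/2\rceil}^{n}\log_2 i
           \;\ge\; \frac{n}{2}\log_2\frac{n}{2} \;=\; \Omega(n\log n).
```

Together these give log(n!) = Θ(n log n), which is the depth bound used above.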
So we arrive at the following conclusion: in the worst case, any comparison-based sorting algorithm needs at least Ω(n log n) time; that is, its lower bound cannot be smaller than this. And since we know the upper bound is O(n log n), we can say that the time cost of the sorting problem is Θ(n log n).

There is a special case: radix sort. We already know its time cost is Θ(d(n + r)), where d is the number of parts we divide each key into, n is the number of data items to be sorted, and r is the number of bins, that is, our radix. Since r is much smaller than n, the cost looks like d·n. As for d, it is usually a small number, and some people regard it as a constant. Actually it is not a constant; it is related to n. If we want to represent n different key values with radix r, then to distinguish such data we need about log_r(n) digits. Therefore d is in Ω(log n), and the time cost of radix sort is also of order n log n; it is not linear.

Finally, I leave you some questions to think about. First, all of our internal sorting has been done on arrays. Can we instead use new to build a linked structure of pointers and implement some of the related sorting algorithms on it? After implementing them, what is the difference in performance?

Besides, we know the stability of these sorting algorithms. For the stable algorithms, can we change them slightly, without making the change too big, so that they are no longer stable? In most cases this comes down to where we use "greater than" versus "not smaller than", or "smaller than" versus "not greater than".

Another question: for the unstable algorithms, can we modify them to make them stable? And is it really necessary? Why do we always, or at least in most cases, use the unstable versions? For example, selection sort is such a simple algorithm, yet it is unstable. So we may ask: what about making it stable? But if we look at the resulting algorithm, it becomes unnecessarily complicated, so we do not use it.
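As a small, self-contained illustration of the first stability question (my own example, not the course's code): in insertion sort, the single comparison in the shift loop decides stability, and switching it from a strict '>' to '>=' is exactly the kind of slight change that destroys it.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

struct Rec { int key; int id; };   // id = original position, kept only for observation

// Insertion sort; strict = true uses '>', strict = false uses '>=' in the shift test.
void insertionSortRecs(std::vector<Rec>& a, bool strict) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        Rec r = a[i];
        std::size_t j = i;
        while (j > 0 && (strict ? a[j - 1].key > r.key : a[j - 1].key >= r.key)) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = r;
    }
}

int main() {
    std::vector<Rec> v1{{2, 0}, {1, 1}, {2, 2}, {1, 3}}, v2 = v1;
    insertionSortRecs(v1, true);    // stable: equal keys keep their original order
    insertionSortRecs(v2, false);   // '>=' shifts equal keys past each other
    for (const Rec& r : v1) std::cout << r.key << ':' << r.id << ' ';
    std::cout << '\n';
    for (const Rec& r : v2) std::cout << r.key << ':' << r.id << ' ';
    std::cout << '\n';
}
```

Running it, the stable version prints the equal keys with their ids in the original order, while the '>=' variant reverses them.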
We can also survey the algorithms and functions in STL and find out how they are combined: in what cases direct insertion sort is chosen, in what cases quicksort, and in what cases heapsort. That is very interesting. Thank you.
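For reference, a minimal introsort-style sketch of the kind of combination that last question points at (my own illustration under assumed cutoffs, not the STL source): quicksort does the main work, a recursion-depth limit falls back to heapsort to avoid quicksort's worst case, and short subranges go to direct insertion sort.

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

const int SHORT_CUTOFF = 16;   // assumed threshold for "short" subranges

void insertionSortRange(std::vector<int>& a, int left, int right) {
    for (int i = left + 1; i <= right; ++i) {
        int key = a[i], j = i;
        while (j > left && a[j - 1] > key) { a[j] = a[j - 1]; --j; }
        a[j] = key;
    }
}

void introSortImpl(std::vector<int>& a, int left, int right, int depthLimit) {
    if (right - left + 1 <= SHORT_CUTOFF) {       // short range: insertion sort
        insertionSortRange(a, left, right);
        return;
    }
    if (depthLimit == 0) {                        // recursion too deep: heapsort
        std::make_heap(a.begin() + left, a.begin() + right + 1);
        std::sort_heap(a.begin() + left, a.begin() + right + 1);
        return;
    }
    int pivot = a[left + (right - left) / 2];     // otherwise: one quicksort step
    int i = left, j = right;
    while (i <= j) {
        while (a[i] < pivot) ++i;
        while (a[j] > pivot) --j;
        if (i <= j) { std::swap(a[i], a[j]); ++i; --j; }
    }
    if (left < j) introSortImpl(a, left, j, depthLimit - 1);
    if (i < right) introSortImpl(a, i, right, depthLimit - 1);
}

void introSort(std::vector<int>& a) {
    if (a.empty()) return;
    int depthLimit = 2 * static_cast<int>(std::log2(a.size()));
    introSortImpl(a, 0, static_cast<int>(a.size()) - 1, depthLimit);
}
```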