1 00:00:01,296 --> 00:00:02,404 大家好 2 00:00:02,901 --> 00:00:03,979 数据结构与算法 3 00:00:04,066 --> 00:00:07,081 这一讲我们继续学习第四章字符串 4 00:00:08,006 --> 00:00:11,308 今天我们介绍字符串的存储结构 5 00:00:11,395 --> 00:00:13,225 以及基本运算的实现 6 00:00:13,305 --> 00:00:16,930 对于串长变化不大的情况呢 7 00:00:17,297 --> 00:00:20,336 有几种比较经典的实现方案 8 00:00:21,546 --> 00:00:23,015 首先我们要考虑的是 9 00:00:23,095 --> 00:00:24,817 字符串的长度 10 00:00:24,898 --> 00:00:28,245 因为字符串它还是一个相对比较 11 00:00:28,333 --> 00:00:30,259 动态变化的数据结构 12 00:00:30,340 --> 00:00:33,476 它的长度一会儿大一会儿小 13 00:00:33,562 --> 00:00:37,419 我们怎么样把这个值给保存下来 14 00:00:37,506 --> 00:00:39,445 在Pascal里面呢 15 00:00:39,508 --> 00:00:42,797 它就是用s0来记录它的长度 16 00:00:42,853 --> 00:00:44,870 但是它的缺陷呢就是 17 00:00:44,934 --> 00:00:47,216 我们只有一个字节的 18 00:00:47,280 --> 00:00:51,480 这样的长度来存串长值 19 00:00:51,824 --> 00:00:55,052 因此这个字符串整体的长度 20 00:00:55,132 --> 00:00:57,272 就不能够超过256个 21 00:00:57,807 --> 00:01:00,963 我是不是可以把这个串长 22 00:01:01,026 --> 00:01:02,069 这样的一个值 23 00:01:02,157 --> 00:01:03,936 存在另外一个变量里面 24 00:01:04,000 --> 00:01:05,713 但是这种方法呢 25 00:01:05,770 --> 00:01:08,344 我们字符串它的动态的实现 26 00:01:08,432 --> 00:01:11,269 就不是能够很好的支持 27 00:01:12,190 --> 00:01:13,989 在C和C++里面 28 00:01:14,068 --> 00:01:17,669 它用的是一个特殊的结束符 29 00:01:17,754 --> 00:01:21,059 \0来表示字符串的结束符 30 00:01:21,611 --> 00:01:23,882 这个\0呢 31 00:01:24,607 --> 00:01:27,676 实际上它的值就是0 32 00:01:28,316 --> 00:01:31,994 也等价于我们常用的NULL 33 00:01:32,066 --> 00:01:33,737 这样的一个常量 34 00:01:35,403 --> 00:01:39,071 字符串类可以更方便的 35 00:01:39,136 --> 00:01:41,462 来支持字符串的表达 36 00:01:41,526 --> 00:01:43,885 这里是一个字符串的数组 37 00:01:43,949 --> 00:01:46,778 我们字符串数组str 38 00:01:46,839 --> 00:01:48,793 以及串的长度size 39 00:01:48,867 --> 00:01:51,922 都是私有的成员 40 00:01:51,986 --> 00:01:54,549 不能够直接的访问 41 00:01:54,621 --> 00:01:57,511 我们可以通过一些函数的定义 42 00:01:57,591 --> 00:02:00,818 来间接的访问这两个值 43 00:02:00,897 --> 00:02:03,363 在C和C++里面 44 00:02:03,450 --> 00:02:05,931 常用的有这些基础的运算 45 00:02:06,011 --> 00:02:07,379 像求串长 46 00:02:07,755 --> 00:02:09,977 以及对串进行复制 47 00:02:10,057 --> 00:02:12,352 还有串的拼接和串的比较 48 00:02:13,068 --> 00:02:15,928 我们来看求串长的这个函数 49 00:02:16,696 --> 00:02:19,306 因为我们有这么一个\0 50 00:02:19,947 --> 00:02:22,313 它作为串结束的一个表示 51 00:02:22,384 --> 00:02:23,744 其实它就相当于 52 00:02:23,819 --> 00:02:25,369 我们在链表运算里面 53 00:02:25,449 --> 00:02:27,731 那个结束的next域 54 00:02:27,803 --> 00:02:29,006 它指向NULL 55 00:02:29,647 --> 00:02:32,404 我们在对这个串 56 00:02:32,499 --> 00:02:33,816 来判断它的长度的时候 57 00:02:33,904 --> 00:02:37,307 我们首先设置下标值i 58 00:02:38,156 --> 00:02:39,918 它的起始值是0 59 00:02:39,982 --> 00:02:44,124 然后我们从0一直往下面挨个找 60 00:02:44,808 --> 00:02:49,101 如果是第i它等于\0了 61 00:02:49,165 --> 00:02:50,190 那就结束 62 00:02:50,270 --> 00:02:53,251 那我们这个计数的结果呢 63 00:02:53,315 --> 00:02:55,064 就是我们的串长值 64 00:02:55,607 --> 00:02:57,122 字符串复制呢 65 00:02:57,178 --> 00:02:59,216 我们其实不能够直接的等于 66 00:02:59,289 --> 00:03:00,838 这样的给它拷贝过去 67 00:03:00,918 --> 00:03:06,952 因为对于这两个字符串来说呢 68 00:03:07,024 --> 00:03:09,681 它都有独立的存储的空间 69 00:03:09,753 --> 00:03:14,517 我们必须是把这个原串s 70 00:03:14,581 --> 00:03:15,608 里面的字符呢 71 00:03:15,688 --> 00:03:19,540 挨个的去拷到目标串d里面去 72 00:03:19,612 --> 00:03:22,273 如果你直接让d=s 73 00:03:22,338 --> 00:03:25,960 那结果就是把s的这个存储地址 74 00:03:26,684 --> 00:03:28,795 给它拷贝到了d里面 75 00:03:28,875 --> 00:03:32,199 d它所保留的那个字符串的空间 76 00:03:32,281 --> 00:03:33,175 就消失了 77 00:03:34,724 --> 00:03:35,735 拷贝的过程呢 78 00:03:36,335 --> 00:03:39,233 其实我们也是从下标0位开始 79 00:03:39,313 --> 00:03:42,164 然后在原串s 80 00:03:42,217 --> 00:03:44,778 它还没有达到\0之前 81 00:03:44,870 --> 00:03:46,809 然后我们就挨个的 82 00:03:46,872 --> 00:03:49,481 让d[i]=s[i] 83 00:03:49,538 --> 00:03:53,036 然后我们下标位i再++ 84 00:03:54,371 --> 00:03:55,505 最后呢 85 00:03:55,560 --> 00:04:00,664 我们要手工的去把d[i]那个位置 86 00:04:00,736 --> 00:04:02,871 把它赋为\0 87 00:04:02,951 --> 00:04:04,955 这就使得我们这个串呢 88 00:04:05,032 --> 00:04:06,598 是一个正常的串 89 00:04:08,842 --> 00:04:11,046 字符串的比较操作 90 00:04:11,134 --> 00:04:13,690 在很多运算里面也是非常重要的 91 00:04:13,753 --> 00:04:15,608 这个比较的运算呢 92 00:04:16,397 --> 00:04:21,610 我们是采用的字典序来进行的 93 00:04:21,682 --> 00:04:25,860 我们把数组的下标变量i 94 00:04:25,932 --> 00:04:27,055 先设为0 95 00:04:27,144 --> 00:04:31,343 然后我从这个s1和s2 96 00:04:31,415 --> 00:04:33,381 它相应的i位置 97 00:04:33,452 --> 00:04:35,298 逐个的往后面比 98 00:04:36,113 --> 00:04:38,951 如果它们两都还没有结束的时候 99 00:04:39,335 --> 00:04:41,816 我看当前位置s1[i] 100 00:04:41,888 --> 00:04:43,980 是不是大于s2[i] 101 00:04:44,052 --> 00:04:45,528 如果大我返回1 102 00:04:45,845 --> 00:04:48,885 如果是s1[i]小于s2[i] 103 00:04:48,965 --> 00:04:50,295 我返回-1 104 00:04:50,374 --> 00:04:51,721 如果它们相等 105 00:04:51,801 --> 00:04:54,797 那我就继续比较下一个位置 106 00:04:54,885 --> 00:04:55,970 那个字符值 107 00:04:56,051 --> 00:04:57,176 除了这个循环 108 00:04:57,253 --> 00:04:58,850 我们有几种情况 109 00:04:59,290 --> 00:05:01,767 首先就是s1[i]呢 110 00:05:01,831 --> 00:05:03,609 它是等于\0 111 00:05:03,688 --> 00:05:05,056 就是它结束了 112 00:05:05,676 --> 00:05:08,148 那s2[i]它还没有结束 113 00:05:08,235 --> 00:05:12,859 这种情况呢就是s1小于s2 114 00:05:13,403 --> 00:05:14,323 返回-1 115 00:05:15,028 --> 00:05:16,633 如果s2结束 116 00:05:16,713 --> 00:05:18,623 而s1没有结束 117 00:05:18,695 --> 00:05:19,726 这种情况呢 118 00:05:19,798 --> 00:05:21,866 s1大于s2 119 00:05:21,938 --> 00:05:23,389 返回+1 120 00:05:24,035 --> 00:05:29,189 否则呢就是s1[i]和s2[i]呢 121 00:05:29,265 --> 00:05:32,457 它们都是\0 122 00:05:32,537 --> 00:05:36,017 它前面的这些字符值都是相等的 123 00:05:36,089 --> 00:05:37,967 而最后它们的结束符 124 00:05:38,044 --> 00:05:39,692 也是在相同的位置截止 125 00:05:39,771 --> 00:05:42,394 所以它们是完全相等的 126 00:05:42,474 --> 00:05:44,427 然后我们就返回0 127 00:05:47,575 --> 00:05:49,921 比较字符串大小的运算呢 128 00:05:49,985 --> 00:05:52,513 我们还有一个更简洁的版本 129 00:05:53,490 --> 00:05:56,726 这个版本它就是把\0本身 130 00:05:56,807 --> 00:05:59,018 当做一个特殊的符号 131 00:05:59,098 --> 00:06:01,646 来直接参与了运算 132 00:06:03,638 --> 00:06:06,296 首先我们设置下标i变量 133 00:06:06,384 --> 00:06:09,183 从0开始往后面比 134 00:06:09,838 --> 00:06:13,344 如果是s1[i]跟s2[i] 135 00:06:13,408 --> 00:06:14,705 在这个位置相等 136 00:06:14,784 --> 00:06:17,320 那我们就在这个循环里面继续操作 137 00:06:18,928 --> 00:06:19,838 我们判断一下 138 00:06:20,468 --> 00:06:23,899 当前这个s1[i]和s2[i] 139 00:06:23,962 --> 00:06:26,942 它是不是都同时等于\0 140 00:06:27,382 --> 00:06:29,196 如果是那就表示它们 141 00:06:29,284 --> 00:06:31,743 恰好在相同的位置 142 00:06:31,829 --> 00:06:32,886 都相应相等 143 00:06:32,975 --> 00:06:36,742 而且在相同的位置是\0结束 144 00:06:36,825 --> 00:06:38,421 所以它们是相等的 145 00:06:39,005 --> 00:06:40,886 如果是不等的情况下 146 00:06:40,954 --> 00:06:44,841 把s1[i]-s2[i] 147 00:06:44,905 --> 00:06:49,238 而且再除以它们差值的绝对值 148 00:06:49,533 --> 00:06:53,012 如果s1[i]大于s2[i]呢 149 00:06:53,076 --> 00:06:54,905 就返回了+1 150 00:06:55,212 --> 00:06:58,872 如果s1[i]小于s2[i]呢 151 00:06:58,936 --> 00:07:00,296 就返回了-1 152 00:07:00,673 --> 00:07:03,077 它们会不会在这里相等 153 00:07:03,164 --> 00:07:04,442 那是不会的 154 00:07:04,530 --> 00:07:07,061 因为我们前面这个循环语句 155 00:07:07,149 --> 00:07:10,658 就保证了它相等的时候 156 00:07:10,738 --> 00:07:13,335 就不会跳出这个循环 157 00:07:13,406 --> 00:07:15,208 或者是说有一种特殊情况 158 00:07:15,826 --> 00:07:17,033 它们相等了 159 00:07:17,116 --> 00:07:19,672 但是都等于\0 160 00:07:19,752 --> 00:07:22,704 那就整个跳出这个函数返回0 161 00:07:22,798 --> 00:07:23,743 所以呢 162 00:07:23,823 --> 00:07:26,034 我们也保证了它的分母 163 00:07:26,112 --> 00:07:27,563 是不会等于0的 164 00:07:29,670 --> 00:07:32,209 字符串类的一些基本运算 165 00:07:32,289 --> 00:07:33,764 首先是构造函数 166 00:07:33,843 --> 00:07:36,894 如果我传入了字符串s 167 00:07:36,966 --> 00:07:38,341 作为它的初始值 168 00:07:38,896 --> 00:07:41,938 那我们首先获得s 169 00:07:42,026 --> 00:07:43,942 它的长度size 170 00:07:44,468 --> 00:07:47,675 然后我申请一个size+1的 171 00:07:47,747 --> 00:07:49,922 这么一个存储空间 172 00:07:50,306 --> 00:07:53,416 这个+1就是预留给\0的 173 00:07:53,924 --> 00:07:55,634 我们必须要保证 174 00:07:56,289 --> 00:07:59,419 我申请的存储空间是有效的 175 00:07:59,510 --> 00:08:02,453 如果在特殊的情况下 176 00:08:02,547 --> 00:08:05,256 我们分配不到存储空间的时候呢 177 00:08:05,320 --> 00:08:07,486 要进行异常处理 178 00:08:07,573 --> 00:08:09,003 而且退出这个程序 179 00:08:10,933 --> 00:08:13,556 获得了有效的存储空间以后 180 00:08:13,620 --> 00:08:18,105 我就调用前面的字符串拷贝的函数 181 00:08:18,178 --> 00:08:20,899 把这个初始值s呢 182 00:08:20,980 --> 00:08:23,483 给它拷贝到我们新申请的 183 00:08:23,547 --> 00:08:24,955 空间str里面 184 00:08:27,156 --> 00:08:29,160 字符串的析构函数 185 00:08:29,231 --> 00:08:32,662 主要是把我们申请的这个空间呢 186 00:08:32,729 --> 00:08:33,700 释放掉 187 00:08:34,966 --> 00:08:38,201 我们再介绍字符串的赋值运算 188 00:08:38,769 --> 00:08:40,215 这个等于号呢 189 00:08:40,295 --> 00:08:44,425 是对运算符等于的一个重载 190 00:08:44,497 --> 00:08:46,381 使得我们在写 191 00:08:46,455 --> 00:08:48,350 字符串的相关运算的时候呢 192 00:08:48,422 --> 00:08:49,399 非常的方便 193 00:08:51,382 --> 00:08:54,733 首先我们是要判断一下 194 00:08:55,355 --> 00:08:59,217 我当前这个字符串它的长度 195 00:08:59,280 --> 00:09:02,217 是不是跟我们要复制的 196 00:09:02,286 --> 00:09:03,875 这个目标串的长度呢 197 00:09:03,939 --> 00:09:05,232 是相等的 198 00:09:06,000 --> 00:09:07,024 如果不是呢 199 00:09:07,096 --> 00:09:08,736 我把当前串的 200 00:09:08,824 --> 00:09:11,437 原来的存储空间给它释放掉 201 00:09:11,525 --> 00:09:18,117 再申请一个目标串s的长度+1 202 00:09:18,471 --> 00:09:20,155 这样的一个大小的空间 203 00:09:20,534 --> 00:09:23,018 也要保证当前的这个空间 204 00:09:23,090 --> 00:09:23,959 是有效的 205 00:09:24,311 --> 00:09:27,809 然后我把当前这个字符串的长度 206 00:09:27,881 --> 00:09:30,309 设为跟s的长度一样 207 00:09:30,853 --> 00:09:34,995 然后我再调用strcpy这个函数 208 00:09:35,052 --> 00:09:37,490 把我的目标串 209 00:09:38,259 --> 00:09:41,773 它设置为我的原串的 210 00:09:42,093 --> 00:09:43,555 那个字符串的值 211 00:09:44,987 --> 00:09:47,331 然后我们要返回这个实例 212 00:09:47,395 --> 00:09:51,326 就是return *this 这个指针 213 00:09:51,996 --> 00:09:53,893 然后呢返回以后 214 00:09:53,943 --> 00:09:56,101 这样我们就可以支持 215 00:09:56,157 --> 00:09:57,761 连续的赋值操作 216 00:09:57,833 --> 00:10:00,627 也就是连续的几个等号 217 00:10:00,691 --> 00:10:01,834 都是可以支持的 218 00:10:01,914 --> 00:10:03,641 给大家留一下思考 219 00:10:04,441 --> 00:10:06,863 首先就是对String 220 00:10:06,927 --> 00:10:09,667 抽取子串的这么一个运算 221 00:10:09,735 --> 00:10:13,035 就是Substr这个方法的实现 222 00:10:13,107 --> 00:10:14,915 我们传入的参数呢 223 00:10:14,994 --> 00:10:19,891 是这个Substr里面的起始下标 224 00:10:19,975 --> 00:10:20,936 index 225 00:10:21,000 --> 00:10:22,460 这里是等于6 226 00:10:22,532 --> 00:10:26,650 然后呢是我要连续拷贝几个字符 227 00:10:26,722 --> 00:10:28,127 这里是5 228 00:10:28,753 --> 00:10:31,900 我们大家来实现一下 229 00:10:31,980 --> 00:10:32,978 这个函数 230 00:10:33,331 --> 00:10:34,996 要注意一些边界的处理 231 00:10:35,602 --> 00:10:39,410 然后呢 再看一下这两个思考题 232 00:10:40,020 --> 00:10:43,258 就是对于s1和s2 233 00:10:43,338 --> 00:10:45,530 这两个字符串 234 00:10:47,068 --> 00:10:47,913 如果我 235 00:10:47,993 --> 00:10:49,370 这个+呢 236 00:10:49,451 --> 00:10:52,509 是对String cate 237 00:10:52,593 --> 00:10:54,590 这样的拼接运算的一个重载 238 00:10:55,468 --> 00:11:01,387 我们要使得s1+s2==s2+s1 239 00:11:01,466 --> 00:11:03,961 应该是满足哪些条件 240 00:11:06,931 --> 00:11:09,218 对于一个字符串来说 241 00:11:09,795 --> 00:11:11,927 我怎么样来对它 242 00:11:12,007 --> 00:11:14,308 进行一个逆值的操作 243 00:11:14,916 --> 00:11:18,057 而且我不能够采用新的存储空间 244 00:11:18,137 --> 00:11:21,182 只能是在原来的这个存储空间上 245 00:11:21,259 --> 00:11:22,458 来实现 246 00:11:22,538 --> 00:11:26,326 例如我对于一个字符串abcde 247 00:11:26,406 --> 00:11:31,387 然后把它变为edcba 248 00:11:32,821 --> 00:11:33,796 谢谢大家