1	
00:00:00,000 --> 00:00:06,160	
大家好。	Hello everyone.
	
2	
00:00:06,160 --> 00:00:12,320	
数据结构与算法这一讲我们开始第9章文件和外排序的学习。	This lecture of data structure and algorithm starts talking about Chapter 9 file and external sorting.
	
3	
00:00:12,320 --> 00:00:17,420	
那我们介绍一下主存外存还有文件的组织和管理。	Then we introduce the main memory, external memory and file organization and management.
	
4	
00:00:17,420 --> 00:00:22,590	
我们计算机的存储器主要分为主存	The storage of computer is mainly divided into the main memory,
	
5	
00:00:22,590 --> 00:00:27,030	
还有外存。主存里面我们常见的就是RAM，	and external memory. Main memory is commonly RAM.
	
6	
00:00:27,030 --> 00:00:30,800	
那它是一种随机访问的存储器。	It is a random access memory .
	
7	
00:00:30,800 --> 00:00:35,740	
另外还有高速缓存，视频存储这也是属于主存。	Another one is cache. The video storage is also main memory.
	
8	
00:00:35,740 --> 00:00:41,180	
还有外存呢，它最主流的就是磁盘。	The mainstream of external memory is disk.
	
9	
00:00:41,180 --> 00:00:49,940	
那还有我们也有一些磁带等 这样用于存储大量信息以及永久存储的这样的设备。	We also have some other devices such as tapes used to store large amounts of information and permanently.
	
10	
00:00:49,940 --> 00:00:55,780	
那我们看一下存储器的这样的一个结构。	Then we look at the structure of memory.
	
11	
00:00:55,780 --> 00:00:59,925	
主存和高速缓存呢，它们是	Main memory and cache are
	
12	
00:00:59,925 --> 00:01:04,070	
基本的存储器，它是跟CPU可以直接打交道的。	the basic storage. It is possible to deal directly with the CPU .
	
13	
00:01:04,070 --> 00:01:10,200	
在有些数据，如果它的数据量特别大，主存	In some cases, especially if it is a large amount of data , the main memory
	
14	
00:01:10,200 --> 00:01:16,330	
放不下，那我们就需要用到磁盘这样的辅助的存储设备。	does not fit, we need to use disk as auxiliary devices.
	
15	
00:01:16,330 --> 00:01:23,770	
另外呢，我们因为主存它的数据是断电以后就会消失的，	In addition, our data in main memory will disappear if power fails.
	
16	
00:01:23,770 --> 00:01:28,030	
那我们如果这个数据处理了以后，我需要比较长久的	If after this data is dealt with, I need to a long-term
	
17	
00:01:28,030 --> 00:01:32,290	
保存以及传输，我们要放到磁盘里面。	preservation and transmission, we need to put it into the disk.
	
18	
00:01:32,290 --> 00:01:39,705	
那有些特别大量的数据， 我们需要永久的保留，那还要用到3G的	For some particularly large amount of data, we need to retain permanent, and that also need 3G
	
19	
00:01:39,705 --> 00:01:43,920	
存储设备，像磁带啊，光盘这样的结构。	storage devices, such as tape and light disc.
	
20	
00:01:43,920 --> 00:01:49,390	
那我们看一下磁盘的物理结构。	Then we look at the physical structure of the disk.
	
21	
00:01:49,390 --> 00:01:54,860	
磁盘呢，它是由这么一个桌面以及若干个碟片	Disk is made up by a desktop, several discs
	
22	
00:01:54,860 --> 00:01:58,930	
还有机械手臂来组成的。那我们碟片呢	and a mechanical arm. The disc
	
23	
00:01:58,930 --> 00:02:03,000	
围着这个主轴这样不断的转。	rotates around the spindle continually.
	
24	
00:02:03,000 --> 00:02:09,385	
而这个主轴上头呢，有这么若干个碟片，每个碟片上呢它都是	And on this spindle, there are several discs, each disc
	
25	
00:02:09,385 --> 00:02:15,770	
有一圈一圈的这个磁道，数据就在这个磁道上面。	has some tracks. The data is on this track.
	
26	
00:02:15,770 --> 00:02:21,020	
那为了访问不同磁道上面的这些数据，那我们	In order to access the data in different tracks, we
	
27	
00:02:21,020 --> 00:02:26,270	
会有这样的一个活动臂，它带着这个磁盘的读写头	have such a movable arm. It moves left and right with the disk read-write head.
	
28	
00:02:26,270 --> 00:02:31,320	
这样的左右的移动，然后我们就可以	Then we can
	
29	
00:02:31,320 --> 00:02:36,370	
准确的定位到哪一个碟片的哪一个磁道以及磁道上的	accurately go to a certain block in a certain track on a certain disc,
	
30	
00:02:36,370 --> 00:02:39,900	
哪一块，我们一般称为扇区。	generally called sector.
	
31	
00:02:39,900 --> 00:02:46,680	
那我们看一下这个磁盘的这样的信息的组织。	Then we look at the organization of information on the disk.
	
32	
00:02:46,680 --> 00:02:53,560	
在早期呢，这个磁盘的这些磁道上头这一圈圈上面的这些数据呢，	In early, the data on these tracks 
	
33	
00:02:53,560 --> 00:02:58,550	
我们每一个磁道上头存的数据的量	on the disk has the same amount.
	
34	
00:02:58,550 --> 00:03:03,540	
是一样的。这样的话，我们在内圈它就更密集些，外圈呢更稀疏。	In this case, it is more dense in the inner circle, and more sparse in the outer circle.
	
35	
00:03:03,540 --> 00:03:09,960	
那现在的发展呢，我们可以在这个外圈呢存储	Now, we can see there are more data in the outer circle.
	
36	
00:03:09,960 --> 00:03:16,400	
更多的数据，那就是不是这样的一个原来这种形式了。	It is not such an original form.
	
37	
00:03:16,400 --> 00:03:23,670	
那我们来看一下，在每一个磁道上头这些数据块的组织。	Then we look at the organization of these data blocks on each track.
	
38	
00:03:23,670 --> 00:03:28,100	
那我们是以这样的一个一个的页块来组织的。	We use such a one page block to organize.
	
39	
00:03:28,100 --> 00:03:32,530	
每一个页块呢，我们根据这个磁盘的	Each page block differs from the disks,
	
40	
00:03:32,530 --> 00:03:39,010	
这样的不同，以及操作系统或者是数据库它的这个文件管理的这个组织不同，	the operating system or database file management,
	
41	
00:03:39,010 --> 00:03:45,590	
我们可能分的大小不一样，那主流的呢就是有512字节和	We may divide into different sizes, and the mainstream is to have 512 bytes or
	
42	
00:03:45,590 --> 00:03:53,590	
1024字节这样的一些大小。那我们	1024 bytes. Then the advantage
	
43	
00:03:53,590 --> 00:03:57,615	
分成这样的页块的好处呢就是我们没必要去	for us to divide it into blocks is that we do not need to 
	
44	
00:03:57,615 --> 00:04:01,640	
在磁盘上按照一个一个字节的这种访问单位，	access unit of a byte but
	
45	
00:04:01,640 --> 00:04:06,120	
我们按这一个一个页块，每次读若干个字节，	the unit of a page, which is several bytes.
	
46	
00:04:06,120 --> 00:04:10,600	
那我们可以节省这样的定位到页块的时间。	Then we can save the time of this page positioning.
	
47	
00:04:10,600 --> 00:04:16,595	
那在早期呢，它的旋转速度是比较低的，	In early, the rotational speed is relatively low,
	
48	
00:04:16,595 --> 00:04:22,590	
那我们读完这一块呢，我可以紧接着在下一块读，好像没什么问题。	If we read this one yet, I can immediately read the next one, it seems no problem.
	
49	
00:04:22,590 --> 00:04:27,690	
那我们随着磁盘技术的发展，它的旋转速度越来越快，	With the development of disk technology, it's rotating faster and faster.
	
50	
00:04:27,690 --> 00:04:33,750	
那也就是说它的读取的速度也是越来越快了。那这样的话呢，	That means its speed to read is also faster and faster.In this case,
	
51	
00:04:33,750 --> 00:04:41,240	
如果我们还是按照它的相邻页块这样的相邻的物理存储的话呢，	if we follow the adjacent physical storage as the adjacent page , 
	
52	
00:04:41,240 --> 00:04:45,830	
就不划算，因为我们读是一个电子的活动，	it is not worthwhile , because the read is an electronic activity.
	
53	
00:04:45,830 --> 00:04:50,420	
而这样的一个旋转呢，它是一个机械的这样的运动。	And such rotating is a mechanical movement .
	
54	
00:04:50,420 --> 00:04:58,730	
那我们读完以后，要等着它 转到这个下一块，那如果你速度太快了，它已经转过去了，	After reading that we should wait for it to go to the next one , if you are too fast, it has turn passed,
	
55	
00:04:58,730 --> 00:05:06,290	
因此呢，我们就有这样的一个交错因子。我让它的下一个相邻的这个块呢，	So then, we have such an interleave factor. I let the next adjacent block 
	
56	
00:05:06,290 --> 00:05:10,690	
只隔了若干个这样的区域，然后	separated by only several such area, and then
	
57	
00:05:10,690 --> 00:05:15,090	
它读完以后，我转过来正好就轮到下一个页块。	after reading it , I just turn on and turn to the next page block .
	
58	
00:05:15,090 --> 00:05:19,885	
那内存它是可以直接跟	It is possible that memory directly exchange data with
	
59	
00:05:19,885 --> 00:05:24,680	
CPU来进行数据的交换，它的访问速度非常的快。	CPU, and its access speed is very fast.
	
60	
00:05:24,680 --> 00:05:31,345	
但是呢，它的存储量非常小，而且呢断电以后呢这个数据会丢失，	But its storage capacity is very small, and it loses data after power fails.
	
61	
00:05:31,345 --> 00:05:38,010	
它造价也比较大。当然内存我们一般来说。我们认为它是随机访问的，	Its cost is relatively large. Of course, generally, we think the memory is randomly accessed,
	
62	
00:05:38,010 --> 00:05:42,305	
也就是说，我对一个地址的数据进行一个	In other words, the time to read data 
	
63	
00:05:42,305 --> 00:05:46,600	
读取的时候呢，它是一个常数的一个时间。	is a time of a constant.
	
64	
00:05:46,600 --> 00:05:52,685	
那内存呢有它的相应的缺点，那因此呢我们需要用到外存。	It has its respective memory disadvantage so that we need to use an external memory.
	
65	
00:05:52,685 --> 00:05:58,770	
那外存呢，它可以长久的保留这些信息，而且它的	The external memory can retain this information for a long time and its
	
66	
00:05:58,770 --> 00:06:04,100	
数据量更大。那用它来跟这个内存呢进行	amount of data is much larger. Then we use it to exchange data with the memory.
	
67	
00:06:04,100 --> 00:06:09,430	
数据的这样的交换呢，我们可以处理更大量的这样的数据。	We can handle larger amounts of data.
	
68	
00:06:09,430 --> 00:06:13,940	
但是外存呢它的访问速度非常的慢。	But the access speed of the external memory is very slow.
	
69	
00:06:13,940 --> 00:06:18,450	
我们读一个字节的这样的一个访问，	The visit of reading a byte,
	
70	
00:06:18,450 --> 00:06:25,310	
在内存里面呢，我们需要的是纳秒级的时间，也就是10亿分之一的时间，	in memory, we need a nanosecond of time, that is one billion of one second,
	
71	
00:06:25,310 --> 00:06:33,640	
而在外存里面呢，相应的它需要 毫秒的时间，也就是千分之一秒的时间。	In the external memory, the corresponding time it takes is milliseconds, which is a thousand of a second.
	
72	
00:06:33,640 --> 00:06:37,640	
这两个量级之间的差呢，大概有100万倍。	There are about 100 million times between these two orders of magnitude.
	
73	
00:06:37,640 --> 00:06:45,070	
因此我们涉及到外存的这些计算机程序的处理的时候呢，我们都需要	So the computer programs related to the external memory, we need
	
74	
00:06:45,070 --> 00:06:52,390	
想办法减少访外的次数，通过内存对数据进行一些整理，然后呢使得	to find ways to reduce the number of times visiting external memory through clearing up data
	
75	
00:06:52,390 --> 00:06:59,710	
它尽量的不要去频繁访外，然后使得我们整体的这个处理效率呢	and make our overall efficiency of this process higher. 
	
76	
00:06:59,710 --> 00:07:07,000	
更高。那我们看一下这个跟存储设备相关的一些单位。	Let us look at the units associated with the storage device .
	
77	
00:07:07,000 --> 00:07:12,430	
那像这个KB，兆，然后G,T等等。	It was like KB, MB, then G, T and so on.
	
78	
00:07:12,430 --> 00:07:17,590	
那特别是有这么一个数据，叫Googol，这实际上	There is such a particular data, called Googol, which actually
	
79	
00:07:17,590 --> 00:07:22,750	
是一个天文的数字。那它是10的100次方。	is an astronomical figure. It is 10 to the 100.
	
80	
00:07:22,750 --> 00:07:28,545	
文件呢它是记录的一个汇集，	The file is a collection of records,
	
81	
00:07:28,545 --> 00:07:34,340	
但记录在底层它其实也是二进制的这样的数据来构成的。	The record is actually saved in the binary format.
	
82	
00:07:34,340 --> 00:07:40,710	
那我们实际上是通过程序员编写这样的程序	In fact, the programmer writes such programs
	
83	
00:07:40,710 --> 00:07:44,895	
来使得我们这些二进制的这些信息呢。	to make the binary information
	
84	
00:07:44,895 --> 00:07:49,080	
具有一些逻辑的结构。那是这一种记录的形式。	in some logical structure. It was in the form of records.
	
85	
00:07:49,080 --> 00:07:54,620	
那文件它的本质呢是一种线性的结构，那也就是说它的这个记录	The file is a linear structure essentially, and that means the layout of the record
	
86	
00:07:54,620 --> 00:08:00,160	
之间的这个排放呢，在这个存储设备的这个外存里面呢，	in the external memory of the storage device,
	
87	
00:08:00,160 --> 00:08:07,585	
它都是挨个挨个进行的。那我们也可以看作是像我们老式的这种	is carried out one by one. It can be seen as old-fashioned tape, 
	
88	
00:08:07,585 --> 00:08:15,010	
录音带那它这样的一个顺序的这个结构，那我们要读到相应的这个音乐的话呢，	such a sequence structure. When we have to read the corresponding words,
	
89	
00:08:15,010 --> 00:08:21,530	
我就需要定位到那个带子的一个位置，然后我再开始挨个的读数据。	I need to navigate to a location of the tape, and then I began to read the data one by one .
	
90	
00:08:21,530 --> 00:08:27,240	
那文件的这样的组织呢，我们分为	Such organizations of the file is divided
	
91	
00:08:27,240 --> 00:08:32,950	
逻辑和物理两个层面。我们从程序员的角度来说，	into logical and physical levels. From programmer's point of view,
	
92	
00:08:32,950 --> 00:08:38,050	
我是逻辑的来看待它的。也就是说我可以	It is seen logically. That means
	
93	
00:08:38,050 --> 00:08:43,150	
是看作是由记录组成的这样的汇集。	this is seen as a collection of records.
	
94	
00:08:43,150 --> 00:08:48,065	
那这个记录呢，当然它就类似于我们在这个	This record , of course, is similar as
	
95	
00:08:48,065 --> 00:08:52,980	
C语言里面的这种结构体，那它是由若干个数据域组成的。	the "structure" in the C language, and that it is composed of a plurality of data fields.
	
96	
00:08:52,980 --> 00:08:59,690	
但是呢，在物理上来说呢，它还是本质上是这样的0，1的信息。	But in the physical level, it is essentially the information of 0,1.
	
97	
00:08:59,690 --> 00:09:04,690	
但是我们并不是一个字节一个字节的来管理它，我们是	But we do not manage it byte by byte. We 
	
98	
00:09:04,690 --> 00:09:09,690	
成块的把它分布在整个磁盘中。	put it into blocks distributed across the disk.
	
99	
00:09:09,690 --> 00:09:16,090	
那我们怎么样来管理这些磁盘里面的这些文件的	Then how do we manage these files in these discs.
	
100	
00:09:16,090 --> 00:09:22,490	
页块呢？那就是通过操作系统或者是数据库的文件管理器来组织的。	Page block? That is, the operating system or the file manager of database to organize it.
	
101	
00:09:22,490 --> 00:09:26,990	
那它们的作用就是把这个逻辑的这样的	Their role is to map the logic organization
	
102	
00:09:26,990 --> 00:09:31,490	
一个组织映射为它的物理的这个地址。	to the its physical address.
	
103	
00:09:31,490 --> 00:09:37,345	
文件从逻辑上来说主要有3种形式，	Files logically have mainly three kinds of forms,
	
104	
00:09:37,345 --> 00:09:43,200	
就是顺利定长的，顺序变长的，还有按照关键码存取	Is a fixed-length sequence, variable-length sequence, as well as access according to the key.
	
105	
00:09:43,200 --> 00:09:49,610	
这样的形式。那我们常见的物理组织形式呢，	Such forms.Our common form of physical organization,
	
106	
00:09:49,610 --> 00:09:54,800	
本质上它都是顺序的结构，也就是顺序文件。	essentially is the order of the structure, which is a sequential file.
	
107	
00:09:54,800 --> 00:10:00,930	
但是我们为了要加快对文件的一个操作处理，	However, we want to speed up the process of file.
	
108	
00:10:00,930 --> 00:10:07,465	
那我们还有一类呢是可以计算寻址，也就是散列文件。	There is another kind that is possible to calculate the address, which is the hash file.
	
109	
00:10:07,465 --> 00:10:14,000	
我能够通过你所需要的经常进行查找的这样的一个数据与，	I was able to hash the data that you frequently access,
	
110	
00:10:14,000 --> 00:10:18,710	
我把它散列到相应的这个文件的一个四方块里面。	into a square of the corresponding file.
	
111	
00:10:18,710 --> 00:10:23,420	
我们把那个文件的这个四方块呢，我们也可以说是散列桶。	The square can also be called as the hash bucket.
	
112	
00:10:23,420 --> 00:10:30,860	
那还有我们如果经常需要对某一些这样的关键码进行访问存取，	Maybe we often need to carry out some of these key.
	
113	
00:10:30,860 --> 00:10:34,790	
我们就可以把这个文件之上呢	We can establish an index structure 
	
114	
00:10:34,790 --> 00:10:41,480	
建立一种索引的结构，例如我们数据库里面常用的b树，b+树，	upon the file, for example, the B and B+ tree often used in database,
	
115	
00:10:41,480 --> 00:10:46,240	
那还有呢，在对于这些文本文件	Besides, in the process of these text files,
	
116	
00:10:46,240 --> 00:10:51,000	
处理的时候，我们以关键码的形式形成倒排文件，	we have formed inverted file in the form of key codes,
	
117	
00:10:51,000 --> 00:11:00,210	
然后使得我们在检索相应的关键码的时候，不需要一个一个文件去读，只需要从倒排文件里面拿出相应的记录。	That allows us to retrieve the corresponding key without the need of reading files one by one.just take the corresponding record from the inverted file.
	
118	
00:11:00,210 --> 00:11:04,960	
文件上头的操作跟我们	File operations is similar to
	
119	
00:11:04,960 --> 00:11:10,680	
一般的数据结构的操作呢是类似的，就我们常见的是增删查改。	the operations of general data structure, it is our common CRUD.
	
120	
00:11:10,680 --> 00:11:14,860	
那还有一个特别重要的就是排序，	There is a particularly important sort,
	
121	
00:11:14,860 --> 00:11:21,870	
因为我们在前面介绍了对于这样的文件我们如果是需要加快它的一个	Because we have previously introduced in order to speed up the read of such files
	
122	
00:11:21,870 --> 00:11:28,880	
读取的速度呢，我们可能要建立索引。那这个索引就是对一些相应的数据域进行的。	we might want to create the index. This index is upon some of the corresponding data domain.
	
123	
00:11:28,880 --> 00:11:35,820	
那这些数据域，我们最好是事先对它进行了排序整理，这样我们的索引呢就更加高效。	These data fields need to be sort in advance, so that our index is even more efficient.
	
124	
00:11:35,820 --> 00:11:41,710	
那我们看一下文件的标准输入输出流。	Then we look at the standard input and output streams of file.
	
125	
00:11:41,710 --> 00:11:49,380	
那所谓标准输入，就是我们键盘的这个输入；所谓标准输出呢，就是屏幕上头的输出。	That so-called standard input is our keyboard input ; so-called standard output is the screen output.
	
126	
00:11:49,380 --> 00:11:54,405	
那相应于这个标准的输入输出，	That corresponding to the standard input and output,
	
127	
00:11:54,405 --> 00:11:59,430	
我们派生出对文件的相应的输入输出。	We derive the corresponding file input and output .
	
128	
00:11:59,430 --> 00:12:06,370	
那我们再来看一下相关的文件类。	Then we take another look at the relevant file classes.
	
129	
00:12:06,370 --> 00:12:12,420	
那对文件的操作，我们最主要的首先是一个定位。	For the file operation, the most important is to position.
	
130	
00:12:12,420 --> 00:12:18,685	
我们把文件指针给它定到相应的这个位置，那这个位置呢，	We set the file pointer to the appropriate location, and at this position,
	
131	
00:12:18,685 --> 00:12:24,950	
是可以进行这样的相应的读和写的操作。	we can perform corresponding read and write operations.
	
132	
00:12:24,950 --> 00:12:29,125	
那我们可以看到，这个函数里面呢	Then we can see , in this function
	
133	
00:12:29,125 --> 00:12:33,300	
就有好多个函数都是关于定位的操作。	there are a good number of functions about the positioning operating.
	
134	
00:12:33,300 --> 00:12:37,310	
另外还有读和写的操作。	There are also read and write operations.
	
135	
00:12:37,310 --> 00:12:44,840	
为了加快这些外存数据的访问速度，	In order to speed up the access of these external memory data
	
136	
00:12:44,840 --> 00:12:49,605	
那我们就考虑，有可能我正在访问的或者是说	It is possible that the data I am visiting or
	
137	
00:12:49,605 --> 00:12:54,370	
前面曾经访问过的这些数据呢可能将来还要访问。	I had visited before may be visited again in the future.
	
138	
00:12:54,370 --> 00:13:00,430	
那我们就可以把它在内存里面呢开辟一个缓冲区。	Then we can open a buffer in memory and put it in.
	
139	
00:13:00,430 --> 00:13:09,240	
那我把这些数据放在那。那我将来再要访问的时候呢，我就可以直接从缓冲区里面去读。那若干个这样的	When I visit again in the future,I can read directly from the buffer. Several such 
	
140	
00:13:09,240 --> 00:13:13,650	
缓冲区呢，它就组成了这样的缓冲池。	buffers form a buffer pool.
	
141	
00:13:13,650 --> 00:13:18,735	
我们如果是还有一些信息，还需要放到这个	When there is some information that we also need to put in
	
142	
00:13:18,735 --> 00:13:23,820	
缓冲区里头来的时候，也就是我这个缓冲区	the buffer and the buffer is overflow,
	
143	
00:13:23,820 --> 00:13:31,230	
有可能会出现一个溢出的情况，那我们就要有这样的淘汰的策略，	we have to have this elimination strategy,
	
144	
00:13:31,230 --> 00:13:36,300	
把这个缓冲区里面的一些不常用的这种块呢	and relieve such blocks those are visited less frequently.
	
145	
00:13:36,300 --> 00:13:41,370	
给它释放掉，然后来存到我们新的页的信息。	and then keep the information in our new page.
	
146	
00:13:41,370 --> 00:13:48,870	
那有几种策略。先进先出呢就是我最先进来的我最先给它淘汰。	There are several strategies. FIFO is that the first in will be eliminated first.
	
147	
00:13:48,870 --> 00:13:55,710	
然后但这种情况呢我可能前面最先进来的，但是我马上我又轮到要访问了，	But there is a condition that for the first to come in, I'll have to visit it again,
	
148	
00:13:55,710 --> 00:14:02,870	
那怎么办呢？我就有一个新的办法就是说按照最不频繁使用来淘汰。	How to do it? I have a new approach that is according to the least frequently used.
	
149	
00:14:02,870 --> 00:14:10,120	
但是如果我有一个块，可能被恶狠狠的使用了10000次，那后面呢我又不怎么用了，	But if I have a block that may be used 10,000 times ferociously, but I will seldom visit it later.
	
150	
00:14:10,120 --> 00:14:15,435	
那那个用的次数最多的块呢，它永远不淘汰，这也是有问题的。	This used most block will never be eliminated, and this is a problem.
	
151	
00:14:15,435 --> 00:14:20,750	
那我们最常用的就是，“最近最少使用”这样的策略，也就是说，	Our most popular is the "least recently used" strategy, that is,
	
152	
00:14:20,750 --> 00:14:25,010	
我不断的轮循，那些没有使用的呢我就给它	I do the polling constantly, and for those not used, I 'll not give it a mark
	
153	
00:14:25,010 --> 00:14:29,270	
不标记。那使用的呢我给它标记上。那在	For those used I will mark it on.
	
154	
00:14:29,270 --> 00:14:34,760	
这个进行这样的一个寻找可淘汰的缓冲区的时候呢，	In this way to find those can be eliminated,
	
155	
00:14:34,760 --> 00:14:40,250	
我们那些“最近正在使用”的就可以避免被淘汰。	those "being recently used" can be avoided eliminated.
	
156	
00:14:40,250 --> 00:14:47,060	
那我们最后留一下思考。大家查询一下这些	Then leave you some thinking. You see about
	
157	
00:14:47,060 --> 00:14:50,770	
存储设备它们的每字节的价格。	how much is such storage device per byte.
	
158	
00:14:50,770 --> 00:14:56,825	
另外呢，特别差询一下主流的硬盘的一些性能指标。	In addition, particularly see about the performance indicator of the mainstream hard drive.
	
159	
00:14:56,825 --> 00:15:02,880	
我们这里面比如说，像这个磁盘旋转速度啊，还有	For example, the disk rotation speed , 
	
160	
00:15:02,880 --> 00:15:06,770	
这样的寻道时间啊，交错因子啊等等这些指标。	and track seeking time, interleave factor, etc.
	
161	
00:15:06,770 --> 00:15:12,640	
谢谢大家。	Thank you.