联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> C/C++编程C/C++编程

日期:2018-04-23 04:06

Part I. Paper-and-Pencil Questions

1. Assume a hard drive with the following specifications: 1,024 bytes/sector, 8

sectors/track, 10 tracks/cylinder, 18 msec for average seek time, 8.3 msec for average

latency time, and 1229 bytes/msec for transfer rate. Consider a file of 8,000 records, each

of size 200 bytes. Suppose that the file is contiguously stored on the drive, without

records spanning between sectors. Further, assume that there are roughly one average

seek and one average latency for each sequential access of the file. Indicate briefly how

your answers are derived for each of the following questions:

(a) Using the external merge-sort method, approximately how many disk accesses are

required to sort the file by the primary key. Assume that the RAM has 4 buffers

available, each of which contains 1024 bytes. When merging sorted runs, you can use

three buffers for the input, each buffer for a run, and one buffer for the output. How

much is the total I/O access time for the internal sort phase? Note that the access time is

only required for the internal sort phase, but the number of disk accesses is required for

the entire merge-sort process.

(b) Assume that each primary key takes 40 bytes and each pointer, 5 bytes. Consider the

creation of a sparse index for the above sorted file.

(i) How many levels does the index contain, and how many blocks are required at

each level? Note that you can build a higher-level index based on an existing

index until the top-level index becomes small enough to be stored in a single

block.

(ii) Answer question (i) again if the primary keys can be compressed from 40 bytes to

15 bytes. How much space is saved for the entire index in terms of blocks?

2. A PARTS file with Part# as the key field includes records with the following Part#

values:

(2, 3, 5, 7, 11, 17, 19, 23, 29, 31)

(a) Construct B-trees of orders 4 and 8, respectively, for the above sequence of key

values. Indicate explicitly the relative page numbers for each of the B-tree nodes

created.

2

(b) For the B-tree of order 4 constructed in (a), show how the tree is changed after

inserting the key values 9, 10, and 8.

(c) For the B-tree modified in (b), show how the tree is further changed after deleting the

key values: 23, 11, and 19. Note that when deleting a key value, we always try

redistribution before concatenation, and in both cases, we always try the left sibling

before the right sibling.


Part II. Programming Exercise

3. Implement a C/C++ program that will create and maintain a B-tree index for a large

data file. Each B-tree page contains 512 bytes, and you can assume fixed-length fields for

the key and the byte-offset to a data record when designing an appropriate B-tree page

structure for the data set specified at the end of this exercise. The data file consists of

variable-length records, each of which is started with a length indicator and followed with

a sequence of fields separated by the delimiter (“|”). Both the data file and the B-tree

index should have their own header records. For the data file, the header record should

contain a length-indicator, B-tree index name, flag for index modification, the number of

records, and the file size. For B-tree index, the header record should have a lengthindicator,

RRN for the root page, and RRN for the last page.

You should support an interactive command interpreter that takes a command

from a user and performs the required task. A good user interface is important: it does

not have to be fancy, but should be friendly with easy-to-follow instructions. In addition,

defensive programming is desirable: if the user makes a mistake, he or she should be

allowed to try again. The set of commands to be supported are described in the

following:

(a) insert: add the information of a new data record into the B-tree index and the data

file. Note that for B-tree index, redistribution is not normally required for insertion.

(b) load: sequentially load the initial set of data records from a text file and display the

total time in msec used for the loading process. You can implement this command in

terms of insertions by adding one record at a time.

(c) search: look for a particular record by the primary key and if found, display its content

to the screen.

(d) traverse: start with the root page and display all the B-tree pages in the depth-first

ordering.

(e) report: sequentially display all the data records in the increasing order of the primary

keys.

3

(f) quit: exit the command interpreter.

Note that before entering and after leaving the command interpreter, you need to open and

close the B-tree index and the data file, and the “load” command is only allowed when

the B-tree index and the data file are truly empty.

Sample Data Set: To make it easier for you to test your program, a sample data set is

provided on CourseLink. It is a collection of English words, along with their

pronunciations, stress markings, and indications of whether they are

regular/foreign/irregular. You can use the word field as the primary key, since no two

words spell the same in the sample data set. There are two data files available:

nettalk.data.huge and nettalk.data.small. You can use the small data file to test and debug

your program, and then use the huge data file for testing when your program is reasonably

robust. There is also another file: nettalk.data.format, which explains the data format in

details.

Submission Requirements: Same as described in Assignment One.


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp