Project 3: Virtual Quicksort
Due Apr 14 by 11:59pm
Points 100
Submitting an external tool
You are allowed to work with a partner on this project!
Assignment:
Please download the starter code from here (https://canvas.vt.edu/courses/185343/files/32859831? wrap=1) (https://canvas.vt.edu/courses/185343/files/32859831/download?download_frd=1) .
You will implement a sorting algorithm on a file. The input data file will consist of many 4-byte
records, with each record consisting of two 2-byte (short) integer values in the range 1 to 30,000. The first 2-byte field is the key value (used for sorting) and the second 2-byte field contains a data value. The input file is guaranteed to be a multiple of 4096 bytes. All I/O operations will be done on blocks of size 4096 bytes (i.e., 1024 logical records).
Warning: The data file is a binary file, not a text file.
Your job is to sort the file (in ascending order), using a modified version of Quicksort. The
modification comes in the interaction between the Quicksort algorithm and the file storing the data. The array being sorted will be the file itself, rather than an array stored in memory. All accesses to the file will be mediated by a buffer pool. The buffer pool will store 4096-byte blocks (1024 records). The buffer pool will be organized using the Least Recently Used (LRU) replacement scheme. See OpenDSA Module 9.4 (https://canvas.vt.edu/courses/190133/modules/items/2703388)for more
information about buffer pools.
Note that you are not implementing an external sorting algorithm, as described in Module 9.6 of OpenDSA (https://canvas.vt.edu/courses/190133/assignments/1965087?module_item_id=2703391) . What you are implementing is a standard in-memory Quicksort that has been modified to use a “virtual memory” in the form. of a large array stored on disk. The primary modification to your
Quicksort algorithm from the in-memory version is to change the in-memory array accesses to accesses to the disk file (these accesses actually come through the buffer pool).
Design Considerations:
The primary design concern for this project will be the interaction between the logical array as
viewed by the Quicksort algorithm, and the physical representation of the array as implemented by the disk file mediated by the buffer pool. In essence, the disk file will be the array, and all accesses to the array from the Quicksort algorithm will be in the form of requests to the buffer pool for
specific bytes from the file. You should implement the buffer pool interface using the “message
passing” approach shown in Module 9.4.1.3
(https://canvas.vt.edu/courses/190133/modules/items/2703388) . The buffer pool interface should pass byte arrays back and forth, not “records” as the Quicksort algorithm understands them.
Be aware that performance does play an issue in the grading for this program—some of the tests will be timed. If your program takes significantly longer than it should, then it will be penalized. So another design consideration will be the efficiency of your program, because some of the test
cases are timed. If you are careless about how you implement the buffer pool, or in how much
unnecessary disk visits (reads or writes) you do within your Quicksort implementation, then you will not be able to meet the time requirements. We are not going to check exactly what blocks are
swapped in and out of the buffer pool. Therefore, there is room for different approaches to tuning your Quicksort and buffer pool implementations as needed to meet the efficiency requirements.
Invocation and I/O Files:
The program will be invoked from the command-line as:
%> java Quicksort <data-file-name> <numb-buffers> <stat-file-name> |
where:
The file where you have your main() method must be called Quicksort.java
<data-file-name> is the file to be sorted. The sorting takes place in that file, so this program does modify the input data file. Be careful to keep a copy of the original when you do your testing.
<numb-buffers> determines the number of buffers allocated for the buffer pool. This value will be in the range 1–20.
<stat-file-name> is the name of a file that your program will generate to store runtime statistics; see below for more information.
At the end of your program, the data file (on disk) should be in a sorted state. Do not forget to flush
buffers from your buffer pool as necessary at the end, or they will not update the file correctly. In addition to sorting the data file, you will generate and output some statistics about the execution of your program. Write these statistics to <stat-file-name> . Make sure your program DOES NOT overwrite <stat-file-name> each time it is run; instead, have it append new statistics to the end of the file. The information to write is as follows.
1. The name of the data file being sorted.
2. The number of cache hits, or times your program found the data it needed in a buffer and did not have to go to the disk.
3. The number of disk reads, or times your program had to read a block of data from disk into a buffer.
4. The number of disk writes, or times your program had to write a block of data to disk from a buffer.
5. The time that your program took to execute the Quicksort. Put two calls to the standard Java timing method System.currentTimeMillis() in your program, one when you call the sort function, and one when you return from the sort function. This method returns a long value. The difference between the two values will be the total runtime in milliseconds. You should ONLY time the sort, and not the time to write the program output as described above.
Programming Standards:
You must conform. to good programming/documentation standards. Web-CAT will provide
feedback on its evaluation of your coding style, and be used for style grading. Beyond meeting Web-CAT’s checkstyle. requirements, here are some additional requirements regarding programming standards.
You should include a comment explaining the purpose of every variable or named constant you use in your program.
You should use meaningful identifier names that suggest the meaning or purpose of the
constant, variable, function, etc. Use a consistent convention for how identifier names appear, such as “camel casing” .
Always use named constants or enumerated types instead of literal constants in the code. Source files should be under 600 lines.
There should be a single class in each source file. You can make an exception for small inner classes (less than 100 lines including comments) if the total file length is less than 600 lines. We can’thelp you with your code unless we can understand it. Therefore, you should no bring your code to the GTAs or the instructors for debugging help unless it is properly documented and exhibits good programming style. Be sure to begin your internal documentation right from the start.
You may only use code you have written, either specifically for this project or for earlier pro grams, or code provided by the instructor. Note that the OpenDSA code is not designed for the specific purpose of this assignment, and is therefore likely to require modification. It may, however, provide a useful starting point.
Deliverables:
You will implement your project using Eclipse, and you will submit your project using the Eclipse plugin to Web-CAT. Links to the Web-CAT client are posted at the class website. If you make
multiple submissions, only your last submission will be evaluated unless you arrange otherwise with the GTA. There is no limit to the number of submissions that you may make.
You are required to submit your own test cases with your program, and part of your grade will be
determined by how well your test cases test your program, as defined by Web-CAT’s evaluation of code coverage. Of course, your program must pass your own test cases. Part of your grade will
also be determined by test cases that are provided by the graders. Web-CAT will report to you
which test files have passed correctly, and which have not. Note that you will not be given a copy of these test files, only a brief description of what each accomplished in order to guide your own
testing process in case you did not pass one of our tests.
When structuring the source files of your project, use a flat directory structure; that is, your source files will all be contained in the project “src” directory. Any subdirectories in the project will be ignored.
You are permitted to work with a partner on this project. While the partner need not be the same as who you worked with on any other projects this semester, you may only work with a single partner during the course of one project unless you get special permission from the course instructor.
When you work with a partner, then only one member of the pair will make a submission. Be sure both names are included in the documentation. Whatever is the final submission from either of the pair members is what we will grade unless you arrange otherwise with the GTA.
Milestones
· Milestone 1: Sunday, March 24
Pass 1 Web-CAT instructor reference test
· Milestone 2: Sunday, Mar 31
Pass 4 Web-CAT instructor reference tests
Have at least 1 point for Style/Coding
· Milestone 3: Sunday, April 07
Pass 8 Web-CAT instructor reference tests.
。 Have at least 5 points for Style/Coding
Pledge:
Your project submission must include a statement, pledging your conformance to the Honor Code requirements for this course. Specifically, you must include the following pledge statement near the beginning of the file containing the function main() in your program. The text of the pledge will also be posted online.
// On my honor:
//
// - I have not used source code obtained from another current or // former student, or any other unauthorized source, either // modified or unmodified.
//
// - All source code and documentation used in my program is // either my original work, or was derived by me from the // source code published in the textbook for this course. //
// - I have not discussed coding details about this project with // anyone other than my partner (in
the case of a joint // submission), instructor, ACM/UPE tutors or the TAs assigned // to this course. I understand that I may discuss the concepts // of this program with other students, and that another student // may help me debug my program so long as neither of us writes // anything during the discussion or modifies any computer file // during the discussion. I have violated neither the spirit nor // letter of this restriction.
Programs that do not contain this pledge will not be graded
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。