Project 4: Implementing a Simple File System on Top of a Virtual Disk
Project Goals
The goals of this project are:
• to implement a simple file system on top of a virtual disk
• to understand implementation details of file systems
Implementing a simple file system
The goal of this project is to implement a simple file system on top of a virtual disk. To this end,
you will implement a library of functions that offers a set of basic file system calls (such as
open, read, write, ...) to applications. The file data and file system meta-information will be
stored on a virtual disk. This virtual disk is actually a single file that is stored on the "real" file
system provided by the Linux operating system. That is, you are basically implementing your file
system on top of the Linux file system.
To create and access the virtual disk, we have provided a few definitions and helper functions
that you can find in this header file and this source file. Note that, in your library, you are not
allowed to create any "real" files on the Linux file system itself. Instead, you have to use the
provided helper functions and store all the data that you need on the virtual disk. As you can
see by looking at the provided header and source files, the virtual disk has 16,384 blocks, and
each block holds 4KB. You can create an empty disk, open and close a disk, and read and write
entire blocks (by providing a block number in the range between 0 and 16,383 inclusive).
Your file system has to support a directory hierarchy, to at least one level below the root
directory. Files are stored in a directory and on the virtual disk. In addition, your file system
does not have to store more than 256 files (of course, you can create and delete files, and
deleted files do not count against this 256 file limit). Finally, out of the 16384 blocks available
on disk, only 8,192 must be reserved as data blocks. That is, you have ample of space to store
your meta-information. However, you have to free data blocks (make them available again)
when the corresponding file is deleted. The maximum file size is 32 megabytes (all 8,192 data
blocks, each with 4KB).
To manage your file system, you have to provide the following three functions:
int make_fs(char *disk_name);
This function creates a fresh (and empty) file system on the virtual disk with name disk_name.
As part of this function, you should first invoke make_disk(disk_name) to create a new disk.
Then, open this disk and write/initialize the necessary meta-information for your file system so
that it can be later used (mounted). The function returns 0 on success, and -1 when the disk
disk_name could not be created, opened, or properly initialized.
int mount_fs(char *disk_name);
This function mounts a file system that is stored on a virtual disk with name disk_name. With
the mount operation, a file system becomes "ready for use." You need to open the disk and
then load the meta-information that is necessary to handle the file system operations that are
discussed below. The function returns 0 on success, and -1 when the disk disk_name could not
be opened or when the disk does not contain a valid file system (that you previously created
with make_fs).
int umount_fs(char *disk_name);
This function unmounts your file system from a virtual disk with name disk_name. As part of
this operation, you need to write back all meta-information so that the disk persistently reflects
all changes that were made to the file system (such as new files that are created, data that is
written, ...). You should also close the disk. The function returns 0 on success, and -1 when the
disk disk_name could not be closed or when data could not be written to the disk (this should
not happen). If there are any open file descriptors (that point to files in the file system that is
about to be unmounted), umount_fs will close them.
It is important to observe that your file system must provide persistent storage. That is, assume
that you have created a file system on a virtual disk and mounted it. Then, you create a few files
and write some data to them. Finally, you unmount the file system. At this point, all data must
be written onto the virtual disk. Another program that mounts the file system at a later point in
time must see the previously created files and the data that was written. This means that
whenever umount_fs is called, all meta-information and file data (that you could temporarily
have only in memory; depending on your implementation) must be written out to disk.
In addition to the management routines listed above, you are to implement the following file
system functions (which are very similar to the corresponding Linux file system operations).
These file system functions require that a file system was previously mounted.
int fs_open(char *name);
The file specified by name (in this case, name is a path to the file that is to be opened, including
the actual filename part of the path) is opened for reading and writing, and the file descriptor
corresponding to this file is returned to the calling function. If successful, fs_open returns a nonnegative
integer, which is a file descriptor that can be used to subsequently access this file.
Note that the same file (file with the same name) can be opened multiple times. When this
happens, your file system is supposed to provide multiple, independent file descriptors. Your
library must support a maximum of 64 file descriptors that can be open simultaneously.
fs_open returns -1 on failure. It is a failure when the file with name cannot be found (i.e., it has
not been created previously or is already deleted). It is also a failure when there are already 64
file descriptors active. When a file is opened, the file offset (seek pointer) is set to 0 (the
beginning of the file).
int fs_close(int fildes);
The file descriptor fildes is closed. A closed file descriptor can no longer be used to access the
corresponding file. Upon successful completion, a value of 0 is returned. In case the file
descriptor fildes does not exist or is not open, the function returns -1.
int fs_create(char *name);
This function creates a new file with name name in your file system (name is the path to the file
including the name of the file itself). The file is initially empty. The maximum length for a file
name is 15 characters (this is also the maximum length of a directory name). Also, there can be
at most 256 files in the directory. Upon successful completion, a value of 0 is returned.
fs_create returns -1 on failure. It is a failure when the file with name already exists (using the
full path specified in name), when the file name is too long (it exceeds 15 characters for the
directory name and 15 characters for the name of the file), or when there are already 256 files
present in the specified directory. Note that to access a file that is created, it has to be
subsequently opened.
int fs_delete(char *name);
This function deletes the file with the path and name name from the directory of your file
system and frees all data blocks and meta-information that correspond to that file. The file that
is being deleted must not be open. That is, there cannot be any open file descriptor that refers
to the file name. When the file is open at the time that fs_delete is called, the call fails and the
file is not deleted. Upon successful completion, a value of 0 is returned. fs_delete returns -1 on
failure. It is a failure when the file with name does not exist. It is also a failure when the file is
currently open (i.e., there exists at least one open file descriptor that is associated with this
file).
int fs_mkdir(char *name);
This function attempts to create a directory with the name name. fs_mkdir() returns zero on
success, or -1 if an error occurred. It is a failure when the directory with name already exists),
when the file name is too long (it exceeds 15 characters for the directory name), or when there
are already 256 files present in the directory.
int fs_read(int fildes, void *buf, size_t nbyte);
This function attempts to read nbyte bytes of data from the file referenced by the descriptor
fildes into the buffer pointed to by buf. The function assumes that the buffer buf is large
enough to hold at least nbyte bytes. When the function attempts to read past the end of the
file, it reads all bytes until the end of the file. Upon successful completion, the number of bytes
that were actually read is returned. This number could be smaller than nbyte when attempting
to read past the end of the file (when trying to read while the file pointer is at the end of the
file, the function returns zero). In case of failure, the function returns -1. It is a failure when the
file descriptor fildes is not valid. The read function implicitly increments the file pointer by the
number of bytes that were actually read.
int fs_write(int fildes, void *buf, size_t nbyte);
This function attempts to write nbyte bytes of data to the file referenced by the descriptor
fildes from the buffer pointed to by buf. The function assumes that the buffer buf holds at least
nbyte bytes. When the function attempts to write past the end of the file, the file is
automatically extended to hold the additional bytes. It is possible that the disk runs out of
space while performing a write operation. In this case, the function attempts to write as many
bytes as possible (i.e., to fill up the entire space that is left). The maximum file size is 32 Mbytes
(which is, 8192 blocks, each 4K). Upon successful completion, the number of bytes that were
actually written is returned. This number could be smaller than nbyte when the disk runs out of
space (when writing to a full disk, the function returns zero). In case of failure, the function
returns -1. It is a failure when the file descriptor fildes is not valid. The write function implicitly
increments the file pointer by the number of bytes that were actually written.
int fs_get_filesize(int fildes);
This function returns the current size of the file pointed to by the file descriptor fildes. In case
fildes is invalid, the function returns -1.
int fs_lseek(int fildes, off_t offset);
This function sets the file pointer (the offset used for read and write operations) associated
with the file descriptor fildes to the argument offset. It is an error to set the file pointer beyond
the end of the file. To append to a file, one can set the file pointer to the end of a file, for
example, by calling fs_lseek(fd, fs_get_filesize(fd));. Upon successful completion, a value of 0 is
returned. fs_lseek returns -1 on failure. It is a failure when the file descriptor fildes is invalid,
when the requested offset is larger than the file size, or when offset is less than zero.
int fs_truncate(int fildes, off_t length);
This function causes the file referenced by fildes to be truncated to length bytes in size. If the
file was previously larger than this new size, the extra data are lost and the corresponding data
blocks on disk (if any) must be freed. It is not possible to extend a file using fs_truncate. When
the file pointer is larger than the new length, then it is also set to length (the end of the file).
Upon successful completion, a value of 0 is returned. fs_lseek returns -1 on failure. It is a failure
when the file descriptor fildes is invalid or the requested length is larger than the file size.
Implementation
Filesystem
In principle, you can implement the file system in any way that you want (as long as 8,192
blocks of the disk remain available to store file data). However, it might be easier when you
borrow ideas from existing file system designs. We recommend you model your file system
after the FAT (file allocation table) design, although it is also possible (though likely more
complex) to use a Unix (inode)-based design.
In general, you will likely need a number of data structures on disk, including a super block, a
root directory, information about free and empty blocks on disk, file meta-information (such as
file size), and a mapping from files to data blocks.
The super block is typically the first block of the disk, and it stores information about the
location of the other data structures. For example, you can store in the super block the
whereabouts of the file allocation table, the directory, and the start of the data blocks.
The directory holds the names of the files. When using a FAT-based design, the directory also
stores, for each file, its file size and the head of the list of corresponding data blocks. When you
use inodes, the directory only stores the mapping from file names to inodes.
The file allocation table (FAT) is convenient because it can be used to keep track of empty
blocks and the mapping between files and their data blocks. When you use an inode-based
design, you will need a bitmap to mark disk blocks as used and an inode array to hold file
information (including the file size and pointers to data blocks).
In addition to the file-system-related data structures on disk, you also need support for file
descriptors. A file descriptor is an integer in the range between 0 and 63 (inclusive) that is
returned when a file is opened, and it is used for subsequent file operations (such as reading
and writing). A file descriptor is associated with a file, and it also contains a file offset (seek
pointer). This offset indicates the point in the file where read and write operations start. It is
implicitly updated (incremented) whenever you perform a fs_read or fs_write operation, and it
can be explicitly moved within the file by calling fs_lseek. Note that file descriptors are not
stored on disk. They are only meaningful while an application is running and the file system is
mounted. Once the file system is unmounted, file descriptors are no longer meaningful (and,
hence, should be all closed before a call to umount_fs).
Test Program
You should write a small program that demonstrates the creation of the virtual disk and the file
system. To demonstrate your file system implementation, you are required to develop a
program that uses all of the functions in the filesystem. This program should mount the virtual
disk and then demonstrate copying a file from the real OS file system to your file system and
copying a file from your file system to the OS file system.
You should be able to test the persistence of your file system by re-running the test application
and then showing the directory and virtual disk contents.
Deliverables
1. You will submit all files that are part of your project via the assignment in Canvas.
2. You should submit all source files required to build your file system implementation and
the test programs.
3. Your code must be implemented in C/C++. You are to include disk.h as well as disk.c to
build a complete executable, but do not make any changes to these two files. The
source code should be well commented and readable.
4. As in previous projects, you are to use Github as part of your development process,
making regular commits as you design and develop your solution.
5. Include a README with this project. A separate component of the project is to describe
the detailed design and development of this project (Project 4A). The README should
be an overview of the features of the library components and the test application.
Include a discussion of your component testing during the development process. If you
had problems, tell us why and what.
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。