CSC 230 Project 4
Movie Watch List Manager
For this project, you get to write a program that will help you manage a list of movies that you would like to watch. It will
maintain a database of available movies, read in at program start-up. By entering commands for the program, the user can view
the entire movie database, or just the movies from a particular range of years or those with a title containing a given string. The
user can choose movies to add to their watch list and later remove them if they change their mind.
The sample execution below shows how you can run the program. The bold text is input typed by the user. Here, we're telling it
to read a movie database from the input file, list-d.txt. We ask it to output the entire database (11 movies in this case), and
then we ask it to list just the movies with a year between 1990 and 1999. Then we ask it to display the watch list (which is
empty), so we add a few movies to our watch list and display the list again. Finally, we remove one movie from the list, take
another look at the list, and enter the quit command to terminate the program.
$ ./movies list-d.txt
cmd> database
database
    ID                                  Title Year Len
  4511                                Aladdin 1992  90
  4772                    Alice in Wonderland 1951  75
 18145                             Cinderella 1950  74
 61360                                  Mulan 1998  88
 70281                              Pinocchio 1940  88
 70767                             Pocahontas 1995  81
 82766        Snow White and the Seven Dwarfs 1937  83
 99053                          The Lion King 1994  88
111278                              Toy Story 1995  81
111279                            Toy Story 2 1999  92
111280                            Toy Story 3 2010 103

cmd> year 1990 1999
year 1990 1999
    ID                                  Title Year Len
  4511                                Aladdin 1992  90
 99053                          The Lion King 1994  88
 70767                             Pocahontas 1995  81
111278                              Toy Story 1995  81
 61360                                  Mulan 1998  88
111279                            Toy Story 2 1999  92

cmd> list
list
List is empty

cmd> add 99053
add 99053

cmd> add 61360
add 61360

cmd> add 111278
add 111278

cmd> list
list
    ID                                  Title Year Len
 99053                          The Lion King 1994  88
 61360                                  Mulan 1998  88
111278                              Toy Story 1995  81

cmd> remove 61360
remove 61360

cmd> list
list
    ID                                  Title Year Len
 99053                          The Lion King 1994  88
111278                              Toy Story 1995  81

cmd> quit
quit
As with recent projects, you'll be developing this one using git for revision control. You should be able to just unpack the starter
into the p4 directory of your cloned repo to get started. See the Getting Started section for instructions.
This project supports a number of our course objectives. See the Learning Outcomes section for a list.
The project uses a list of movies based on the title.basics.tsv.gz file previously retrieved from the IMDb (Internet Movie
Database) website.
Rules for Project 4
You get to complete this project individually. If you're unsure what's permitted, you can have a look at the academic integrity
guidelines in the course syllabus.
In the design section, you'll see some instructions for how your implementation is expected to work. Be sure you follow these
rules. It's not enough to just turn in a working program; your program has to follow the design constraints we've asked you to
follow. For this assignment, we're putting some constraints on the functions you'll need to define, the data structures you'll use
2021/3/23 CSC230 Project 4
https://www.csc2.ncsu.edu/courses/csc230/proj/p4/p4.html 2/12
and how you're going to organize your code into components. Still, you will have lots of opportunities to design parts of your
solution and to create additional functions to simplify your implementation.
Requirements
This section says what your program is supposed to be able to do, and what it should do when something goes wrong.
Program Execution
The movies program expects one or more filenames on the command line. Each of these files should contain a list of movies that
the program can read into its database at startup. If the program is run with invalid command-line arguments (e.g., no filenames
given on the command line), it should print the following usage message to standard error and exit with a status of 1.
usage: movies <movie-list>*
If the program can't open one of the given files for reading, it should print the following message to standard error and exit with
a status of 1. Here, filename is the name of the file given on the command line. The program should report the first filename on
the command line that it can't successfully open (i.e., if there are multiple filenames on the command line that can't be opened,
it just needs to report this error for the first one that can't be opened).
Can't open file: filename
Movie List Format
At program start-up, the movies program reads in a database of movies. On the command line, it is given one or more filenames
for files containing movie lists, stored in a particular format. Each line of a movie list file describes one movie. A movie
description consists of five fields, with tab characters (ASCII 0x09) separating the fields. The first field is an integer ID unique to
the given movie. The next field is a title for the movie (a string). The next field is an integer giving the year the movie was released.
The next field is an integer length of the movie in minutes and the last field is a string listing various genres for the movie. The
list of genres is a comma-separated list of strings. None of the fields will contain a tab character, and none of them will be empty.
Figure: Format of a line of a Movie List
Your program will only use the genre field if you're doing the extra credit part of the assignment. Otherwise, your program can
just skip over this field as it reads in a movie list. The possible genres are Action, Adventure, Animation, Biography, Comedy,
Crime, Documentary, Drama, Family, Fantasy, History, Horror, Musical, Mystery, Romance, Sci-Fi, Sport, Thriller, War, and
Western.
Some of the title fields are fairly long, but you will only need to store the first 38 characters of the title. This is described in the
Movie Listing section below.
The program should process the movie list files in the same order they are given on the command line. Within each movie list, it
should process movies in order from the first line to the last line of the file. The order for processing these files matters for error
reporting. If there is something wrong with a movie list, the program should report the first error it encounters.
A movie list file can contain any number of movie descriptions, one per line. If the format of the movie list is invalid (e.g., if a line
is missing one of the expected fields or if one of the numeric fields can't be parsed as a number), then it should print the
following message to standard error and exit with a status of 1. Here, filename is the name of the file containing the bad movie
description.
Invalid movie list: filename
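One way to pull the fields out of a line in this format is sketched below. It is only an illustration, not a required design: parseMovieLine() and its cap parameter are names made up for this example, and it assumes the line has already been read into a string.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one movie list line into an ID, title, year and length, skipping
 * the genre field.  At most cap - 1 characters of the title are kept.
 * Returns false if a field is missing or a number can't be parsed.
 * (parseMovieLine is a hypothetical helper, not a required function.)
 */
bool parseMovieLine( char const *line, int *id, char *title, size_t cap,
                     int *year, int *len )
{
  /* Read the ID; %n records how many characters sscanf() consumed. */
  int pos = 0;
  if ( sscanf( line, "%d%n", id, &pos ) != 1 || line[ pos ] != '\t' )
    return false;

  /* The title runs from just after that tab up to the next tab. */
  char const *start = line + pos + 1;
  char const *end = strchr( start, '\t' );
  if ( end == NULL || end == start )
    return false;

  /* Copy the title, truncating it if it won't fit. */
  size_t n = (size_t) ( end - start );
  if ( n > cap - 1 )
    n = cap - 1;
  memcpy( title, start, n );
  title[ n ] = '\0';

  /* Year, length and the first character of a non-empty genre field. */
  char genre;
  return sscanf( end + 1, "%d\t%d\t%c", year, len, &genre ) == 3;
}
```

A caller that gets false back from a helper like this would print the "Invalid movie list" message and exit with a status of 1.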
Every movie should have a unique numeric ID. If the program encounters more than one movie with the same ID (even if other
fields like the title or year are different), it should print the following message to standard error and exit with a status of 1. Here,
ID is the movie ID that occurred more than once. The program should detect duplicate IDs, whether they occur within the same
movie list file or across two different movie lists.
Duplicate movie id: ID
Watch List
As the user interacts with the movies program, they can select movies from the database to add to their watch list, a subset of
movies the user plans to watch. The database is the set of all movies available, and the watch list is the subset of the database
that the user has selected.
It's possible for the watch list to be empty (it's empty when the program starts up). It's managed like a set, so it can't contain
more than one of the same movie.
Movie List Output
A few user commands are used to list movies, either from the database or from the watch list. The output format for these
reports is mostly the same. It consists of a header describing each of the four fields (like the example shown below). After the
header, the report lists one movie per line. Each movie is reported as a movie ID in a 6-character field, a movie title in a 38
character field, a movie year in a 4 character field, and finally a movie length given in a 3 character field. For the year and the
movie length, the widths of 4 and 3 are minimum field widths, so it's possible to have a year with more than four digits or a
length with more than three digits. For cases like these, the columns may not line up properly. Each of these fields is right-aligned,
with a single space separating adjacent fields.
    ID                                  Title Year Len
  8466                                 Avatar 2009 162
 84694 Star Trek VI: The Undiscovered Country 1991 110
 84702     Star Wars: Episode IV - A New Hope 1977 121
 94055 The Englishman Who Went Up a Hill Bu.. 1995  99
108082                       The Wizard of Oz 1982  78
For movie titles that are too long to fit in their field width, you will print as much of the title as you can, and then print two
periods instead of the last two characters of the field, to indicate that the whole title was too long to fit. You can see this in the
"The Englishman Who Went Up a Hill But Came Down a Mountain" title above. Here, we print just the first 36 characters of the
title, then print two periods at the end, making the overall length exactly 38 characters.
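The truncation rule and the field widths above can be sketched as follows. The helper names fitTitle() and formatMovie() are hypothetical. The sketch formats into a buffer with snprintf() so the result is easy to check, but printing directly with printf() would work the same way.

```c
#include <stdio.h>
#include <string.h>

/* Width of the title column in a movie report. */
#define TITLE_WIDTH 38

/*
 * Copy src into dest, shortening it to TITLE_WIDTH characters if needed.
 * Over-long titles keep their first 36 characters, followed by "..".
 * (fitTitle is a hypothetical helper name.)
 */
void fitTitle( char const *src, char dest[ TITLE_WIDTH + 1 ] )
{
  if ( strlen( src ) <= TITLE_WIDTH ) {
    strcpy( dest, src );
  } else {
    memcpy( dest, src, TITLE_WIDTH - 2 );
    strcpy( dest + TITLE_WIDTH - 2, ".." );
  }
}

/*
 * Format one report line into buf (hypothetical helper).  Every field is
 * right-aligned; the widths for year and length are only minimums.
 */
int formatMovie( char *buf, size_t cap, int id, char const *title,
                 int year, int len )
{
  char shown[ TITLE_WIDTH + 1 ];
  fitTitle( title, shown );
  return snprintf( buf, cap, "%6d %*s %4d %3d\n", id, TITLE_WIDTH, shown,
                   year, len );
}
```

Keep in mind that a title of exactly 38 characters is printed whole, so the decision to append ".." has to be made while you still know the original title's full length.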
User Commands
After start-up, the movies program reads commands typed in by the user. Each command is given as a single line of user input.
For each command, the program will prompt the user with the following prompt. There's a space after the greater-than sign, but
you probably can't see it in this web page.
cmd>
After the user enters a command, the program will echo that command back to the user on the next output line. This is mostly to
help with debugging your programs. If we're capturing program output to a file, then things typed by the user don't go to the
output file (user inputs show up on the terminal, but they're not part of the program's output). By echoing each command, our
output files will include a copy of each command the user typed, making it easier to see what the program was asked to do. So,
for example, if the user typed in a command like the following, the program would echo a copy of the command on the next line:
cmd> year 1990 1999
year 1990 1999
The user can type any of 7 (or 8) available commands: database, year, title, add, remove, list and quit. There is also a
genre command that can be implemented for extra credit. These commands are described below. Each valid command starts
with one of the keywords listed above. For some commands, the keyword must be followed by one or more parameters. There
may be one or more whitespace characters at the start of the command, between the keyword and the parameters, between
parameters or at the end of the command. Any non-whitespace characters on a line following a valid command (and its
parameters, if any) may be ignored by the program.
If the user enters an invalid command, the program should print the following message to standard output (not standard error),
ignore the command and prompt the user for another command. Invalid commands would be those that start with something
other than the 7 (or 8 for extra credit) keywords listed above, or if the command's parameters weren't correct.
Invalid command
After the first prompt, the program should print a blank line before prompting the user for another command. This is shown in
the sample execution at the start of this project description. It's just to provide a little separation between the output for
consecutive commands.
The program should terminate when it is given the quit command or when it reaches the end-of-file on standard input. In the
case of the quit command, the program should echo the command back to the user (like all the other commands). In the case of
end-of-file, there's no command to echo, so the program should just terminate.
Database command
If the user enters the database command, the program should print out all the movies in the entire database in the format
described in the "Movie List Output" section above. Movies should be sorted by their ID field, least to greatest.
If there are no movies in the database (this could happen if the program was given an empty movie list file to read), the
program should print the following message to standard output and then prompt for another command.
No matching movies
Year command
The year command requires two integer parameters, a low value and a high value. It lists movies from the database with a year
at least as high as the low value and no higher than the high value. Output should be given in the format described in the "Movie
List Output" section above, ordered by year, from low to high. Movies with the same year should be sorted by ID.
For example, the user could enter the following year command, with the following response from the program. Notice that the
two movies with a year of 1995 are ordered by ID number.
cmd> year 1990 1999
year 1990 1999
    ID                                  Title Year Len
  4511                                Aladdin 1992  90
 99053                          The Lion King 1994  88
 70767                             Pocahontas 1995  81
111278                              Toy Story 1995  81
 61360                                  Mulan 1998  88
111279                            Toy Story 2 1999  92
The year command would be invalid if it was missing a parameter, or one of its parameters couldn't be parsed as an integer
value, or if its first parameter was greater than its second parameter. If the range of years doesn't contain any movies, the
program should print a line to standard output saying "No matching movies", like in the following example:
cmd> year 1978 1979
year 1978 1979
No matching movies
Title command
For this command, the user can enter a single-token string. The program will find all movies in the database that contain that
string as a substring in their title field. It will print out just these movies from the database in the format described in the "Movie
List Output" section above, with movies whose title contains or is equal to the given string listed in order of ID. A movie's title
field matches the given string even if the string is just a substring of a longer word in the movie's title field. For example, if
the user entered "title and", then it could match a movie that had the word "and" in its title field, or one that had the word
"Wonderland" in its title field, as in the following exchange:
cmd> title and
title and
    ID                                  Title Year Len
  4772                    Alice in Wonderland 1951  75
 82766        Snow White and the Seven Dwarfs 1937  83
If there are no matching movies in the database, the title command should print the "No matching movies" message to standard
output, just like the database and year commands.
A title command would be invalid if it didn't have a string after the title keyword.
Genre command
The genre command is for extra credit. For this command, the program will need to store the list of genres for each movie (the
last field on each line in a movie list). The user can enter the genre keyword, followed by a single word (a sequence of non-whitespace
characters). The program will find all movies in the database that contain that word as a substring in their genre
field. It will print out just these movies from the database in the format described in the "Movie List Output" section above, with
movies that match the given word listed in order of ID. A movie's genre field matches the given word even if the word is just a
substring of a longer word in the movie's genre field. For example, if the user entered "genre mat", then it could match a movie
that had the word "Animation" in its genre field.
If there are no matching movies in the database, the genre command should print the "No matching movies" message to
standard output, just like the database and year commands.
A genre command would be invalid if it didn't have a word after the genre keyword.
Add command
The add command is for adding movies from the database to the watch list. Movies added to the watch list are added at the end,
and adding a movie to the watch list doesn't remove it from the database; it just puts that movie on the watch list. The add
command expects a movie ID as a parameter. So, for example, the following command would add the movie with an ID of 42 to
the watch list.
add 42
An add command would be invalid if there wasn't an integer after the add keyword. If the integer doesn't match a movie ID from
the database, the program should print the following to standard output (where ID is the ID the user asked to add).
Movie ID is not in the database
If the user gives the ID of a movie that's already on the watch list, the program should print the following message to standard
output (where ID is the ID the user asked to add):
Movie ID is already on the watch list
Remove command
The remove command is for removing movies from the watch list. As a parameter, it expects the ID of the movie to be removed.
It removes that movie from the watch list, and the remaining movies stay in the same order. So, for example, the following
command would remove the movie with an ID of 42 from the watch list.
remove 42
If the remove command isn't given a valid integer as a parameter, then it is an invalid command. If the parameter is a valid
integer but doesn't match the ID of a movie on the watch list, then it should print the following message to standard output,
where ID is the ID of the movie the user asked to remove.
Movie ID is not on the watch list
List command
The list command shows the movies on the watch list in the same format described in the "Movie List Output" section above.
Notice that the watch list is ordered based on the order movies were added (not sorted by ID).
For example, running the list command might look like the following.
cmd> list
list
    ID                                  Title Year Len
 70356 Pirates of the Caribbean: The Curse .. 2003 143
 59720                    Mission: Impossible 1996 110
 91003                    The Bourne Identity 2002 119
 30675               Ferris Bueller's Day Off 1986 103
If the watch list is empty, the program should print a line to standard output saying "List is empty".
So, for example, you might get the following response from a list command:
cmd> list
list
List is empty
Quit command and termination
The quit command doesn't take any parameters. It should terminate the program. It's entered like the following:
cmd> quit
The program should also terminate successfully if it reaches the end-of-file on standard input while it's trying to read the next
command.
Design
Program Organization
Your implementation will be organized into three components. The input component will help with reading input from the movie
list files and from the user. The database component will contain code for implementing movies and the database. The movies
component will contain main, code to read in user commands and the implementation for the watch list.
Figure: Components and Dependency Structure
The input and database components will each have a header file, so other components can use types and functions defined by
these components. The figure above shows the dependency structure of the project. The database component can use code
provided by input and the main movies component can use code provided by both input and database.
Movie and Database Representation
This project is a good chance to get some experience using structs, dynamic memory allocation and resizable arrays. Each movie
will be represented by a struct with a field for each of the four values associated with a movie. The title field will be a string, and
the ID, year, and length can be stored as ints. The title field just needs to be able to store a string of up to 38 characters.
Although lots of titles are longer than this, the output of the program never reports more than 38 characters for a title, so you
won't need to store more than the first 38 characters.
Figure: Movie Representation
The database will be represented by its own struct, which holds a resizable array of pointers to movies. Each movie
will be stored in its own block of dynamically allocated memory, and the array of pointers keeps up with all the movies in the
database. The count and capacity fields maintain the resizable array: count records how many movies are in the database, and
capacity lets you detect when you have run out of room and need to grow the array. Your resizable array should start with an
initial capacity of 5, and it should double in capacity whenever the array needs to be enlarged.
Figure: Database Representation
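A minimal sketch of these two structs and the doubling rule might look like the following. The count and capacity fields come from the description above. The other field names (list, id, title, year, len) and the addMovie() helper are assumptions made for the example, and checks for failed allocations are left out to keep it short.

```c
#include <stdlib.h>

/* Longest title we need to store, and the starting array capacity. */
#define TITLE_MAX 38
#define INITIAL_CAPACITY 5

/* One movie; the field names here are assumptions. */
typedef struct {
  int id;
  char title[ TITLE_MAX + 1 ];
  int year;
  int len;
} Movie;

/* The database: a resizable array of pointers to movies. */
typedef struct {
  Movie **list;   /* hypothetical name for the array of pointers */
  int count;      /* number of movies currently stored */
  int capacity;   /* number of pointers the array has room for */
} Database;

Database *makeDatabase()
{
  Database *dat = malloc( sizeof( Database ) );
  dat->count = 0;
  dat->capacity = INITIAL_CAPACITY;
  dat->list = malloc( dat->capacity * sizeof( Movie * ) );
  return dat;
}

/* Hypothetical helper: append a movie, doubling the array when full. */
void addMovie( Database *dat, Movie *m )
{
  if ( dat->count >= dat->capacity ) {
    dat->capacity *= 2;
    dat->list = realloc( dat->list, dat->capacity * sizeof( Movie * ) );
  }
  dat->list[ dat->count++ ] = m;
}

void freeDatabase( Database *dat )
{
  /* Free each movie, then the array of pointers, then the struct. */
  for ( int i = 0; i < dat->count; i++ )
    free( dat->list[ i ] );
  free( dat->list );
  free( dat );
}
```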
Extra Credit Design and Implementation
If you do the extra credit part of this project, each movie will need to store a string of genre keywords read from the movie list
files. The genres for a movie may be a long string, so we're not going to store it inside the movie struct. Instead, the string will
be stored in another block of dynamically allocated memory, and the movie struct will just keep a pointer to this string. That way,
the genre string can be exactly as long as it needs to be to hold whatever genre string is given in the movie list input.
Figure: Movie Representation with Genres
Watch List Representation
You will represent the watch list as a resizable array in the top-level movies component. Like the database, this array should
start with an initial capacity of 5 and it should double in size whenever it needs to grow.
Beyond those requirements, you can organize your watch list however you want to. If you want, you can store it inside a struct, like we're doing with the
Database, or you can just use some global variables inside the movies component to keep up with the watch list. If you do
choose to use some global variables for the watch list, be sure to mark them as static. This will prevent possible name collisions
with symbols defined elsewhere in your program.
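If you go with a struct, a set-like watch list could be sketched as follows. Everything named here (WatchList, its list field, initWatchList(), findOnList(), addToWatchList(), removeFromWatchList()) is hypothetical, the Movie field names are assumptions, and allocation failures aren't checked.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout of a movie; only the id field matters here. */
typedef struct {
  int id;
  char title[ 39 ];
  int year;
  int len;
} Movie;

/* Hypothetical watch list: pointers to movies owned by the database. */
typedef struct {
  Movie **list;
  int count;
  int capacity;
} WatchList;

void initWatchList( WatchList *w )
{
  w->count = 0;
  w->capacity = 5;
  w->list = malloc( w->capacity * sizeof( Movie * ) );
}

/* Return the index of the movie with the given ID, or -1 if absent. */
int findOnList( WatchList const *w, int id )
{
  for ( int i = 0; i < w->count; i++ )
    if ( w->list[ i ]->id == id )
      return i;
  return -1;
}

/* Add m at the end; return false if it's already on the list. */
bool addToWatchList( WatchList *w, Movie *m )
{
  if ( findOnList( w, m->id ) >= 0 )
    return false;
  if ( w->count >= w->capacity ) {
    w->capacity *= 2;
    w->list = realloc( w->list, w->capacity * sizeof( Movie * ) );
  }
  w->list[ w->count++ ] = m;
  return true;
}

/* Remove the movie with the given ID, keeping the others in order. */
bool removeFromWatchList( WatchList *w, int id )
{
  int idx = findOnList( w, id );
  if ( idx < 0 )
    return false;
  memmove( &w->list[ idx ], &w->list[ idx + 1 ],
           ( w->count - idx - 1 ) * sizeof( Movie * ) );
  w->count--;
  return true;
}
```

The boolean return values give the caller what it needs to decide whether to print the "already on the watch list" or "not on the watch list" message.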
Expected Functions
As part of your implementation, you will define and use the following functions. You can define more if you want to. Just try to
put them in a component that's suitable for whatever they do and remember to mark them as static where you can (i.e., if
they're not used outside the component where they're defined).
Your input component only needs to have one function.
char *readLine( FILE *fp )
This function reads a single line of input from the given file and returns it as a string inside a block of dynamically allocated
memory. You can use this function to read commands from the user and to read movie descriptions from a movie list file.
Inside the function, you should implement a resizable array to read in a line of text that could be arbitrarily large. If there's
no more input to read, this function should return NULL. Since this function returns a pointer to dynamically allocated
memory, some other code will be responsible for eventually freeing that memory (to avoid a memory leak).
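Here is one possible shape for readLine(). It assumes the returned string should not include the trailing newline and that a starting capacity of 5 is acceptable (the description above doesn't fix either detail). Allocation failures are not checked in this sketch.

```c
#include <stdio.h>
#include <stdlib.h>

char *readLine( FILE *fp )
{
  /* Resizable buffer for the line; the starting size is arbitrary. */
  int capacity = 5;
  int len = 0;
  char *buffer = malloc( capacity );

  /* Read characters up to the end of the line or the end of the file. */
  int ch;
  while ( ( ch = fgetc( fp ) ) != EOF && ch != '\n' ) {
    /* Grow if there isn't room for this character plus a terminator. */
    if ( len + 2 > capacity ) {
      capacity *= 2;
      buffer = realloc( buffer, capacity );
    }
    buffer[ len++ ] = ch;
  }

  /* Nothing was read and there's no more input: report end-of-file. */
  if ( ch == EOF && len == 0 ) {
    free( buffer );
    return NULL;
  }

  buffer[ len ] = '\0';
  return buffer;
}
```

Whoever calls a function like this owns the returned string and has to free() it once the line has been processed.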
Your database component should have the following 7 (or 8) functions.
Database *makeDatabase()
This function dynamically allocates storage for the database, initializes its fields (to store a resizable array) and returns a
pointer to it.
void freeDatabase( Database *dat )
This function frees the memory used to store the database, including freeing space for all the movies, freeing the resizable
array of movie pointers and freeing space for the database struct itself.
void readDatabase( Database *dat, char const *filename )
This function reads all the movies from a movie list file with the given name. It makes an instance of the Movie struct for
each one and stores a pointer to that movie in the resizable array.
void listAll( Database *dat )
This function lists all the movies in the database, sorted by ID number. The movies component can call this in response to
the user entering the database command.
void listYear( Database *dat, int min, int max )
This function lists all the movies with a year between the given min and max values (inclusive). Your movies component can
call this when the user enters the year command. In the output, movies should be sorted by year, and by ID if they have
the same year.
void listTitle( Database *dat, char const *title )
This function lists all the movies where the given title string occurs in the movie's title field. In the output, the movies
should be listed in order by ID. For this function (and the extra credit listGenre() function), you may find the strstr()
function useful for finding a short string inside a larger one. We will talk about this function briefly in class, but, if you want
to use it, you may need to do some reading on your own. You'll find it on page 620 of your textbook, or, if you're on a Linux
machine, you can just type man strstr at the shell prompt to look at the online documentation.
void listGenre( Database *dat, char const *genre )
You only need this function if you're doing the extra credit. It reports all movies where the given genre string occurs in the
movie's genres field. In the output, the movies should be listed in order by ID.
void listDatabase( Database *dat, bool (*test)( Movie const *movie, void const *data ), void const *data )
This is a static function in the database component. It is used by the listAll(), listYear(), listTitle(), and listGenre() functions
to actually report the list of movies in the right format. In addition to a pointer to the database, this function also takes a
pointer to a function (test) and a pointer to an arbitrary block of data (data) to let the caller tell the function which
particular movies it should print out. This is described in more detail in the "Selecting Movies to Report" section below.
Your movies component will contain main() and any other functions you need to parse command-line arguments and user
commands.
Function Visibility
Any functions that are needed by a different component should be prototyped (and commented) in the header. Functions that
don't need to be used by a different component should not be prototyped in the header, and should be marked static (given
internal linkage), so they won't be visible to any other part of the program. This is like making the function an implementation
detail of its component, something we could change if we wanted without affecting other parts of the program.
Sorting the Database
You'll use the standard library qsort() function to sort movies, either by ID or by year (and ID). Using qsort() will make the
sorting easier (and probably more efficient), but you have to help out qsort() by providing a pointer to a comparison function. We
have some examples of this in the material from lecture 12, in the slides and in the sort.c example program.
To use qsort() you'll need to think about a few things. As usual, you'll need to write your own comparison function, one that
takes two (const) void pointers, but knows that they're really pointers to two elements of the array inside the Database struct.
So, your comparison function will need to cast these void pointers to pointers of the right type before it can start looking at the
fields of the Movie objects they point to. Remember that the comparison function gets pointers to two array elements (not copies
of the values in two array elements, pointers to the elements). So, for example, since the array is full of pointers to Movie
structs, your comparison function will get two pointers to pointers to Movie instances. You have to define your comparison so it
takes two void pointer parameters, but, internally, your comparison function will know that these are really pointers to pointers
to Movies. After casting the void pointers parameters to these more specific types, you can access the fields of the Movies in
order to compare them.
You have to sort the database two different ways (for the year command vs for the database and title commands), so you will
need to implement two different comparison functions for sorting the movies in the database.
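As a sketch of the pointer-to-pointer casting described above, the two comparison functions might look like this. The Movie field names and the function names compareID() and compareYear() are assumptions made for the example. In your database component, functions like these could be marked static.

```c
#include <stdlib.h>

/* Assumed layout of a movie. */
typedef struct {
  int id;
  char title[ 39 ];
  int year;
  int len;
} Movie;

/* Order two array elements (pointers to Movie) by ID. */
int compareID( void const *aptr, void const *bptr )
{
  /* Each parameter really points to an array element, a Movie pointer. */
  Movie const *a = *(Movie * const *) aptr;
  Movie const *b = *(Movie * const *) bptr;

  if ( a->id < b->id )
    return -1;
  if ( a->id > b->id )
    return 1;
  return 0;
}

/* Order two array elements by year, breaking ties by ID. */
int compareYear( void const *aptr, void const *bptr )
{
  Movie const *a = *(Movie * const *) aptr;
  Movie const *b = *(Movie * const *) bptr;

  if ( a->year < b->year )
    return -1;
  if ( a->year > b->year )
    return 1;
  return compareID( aptr, bptr );
}
```

With an array of count movie pointers named list (both hypothetical names), a call would look like qsort( list, count, sizeof( list[ 0 ] ), compareYear ). Note that the element size is the size of a pointer, not the size of a Movie.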
Parsing User Commands
We're reading user input one line at a time. After we get a string containing a command, we will need to look inside this string to
figure out what command the user typed. The sscanf() function will make it easy to do this. It works much like scanf() or
fscanf(), but it parses input from a string instead of from a file. We'll cover this function in lecture 18, but you may want to look
at the material for lecture 18 early (it should already be posted) so you can get started on the project earlier.
Remember, unlike reading from a file, sscanf() doesn't automatically resume parsing from where it left off on the last call. For
example, you couldn't call something like sscanf( str, "%d", &x ); to read an int then call sscanf( str, "%d", &y ); again
to read the next int. If you gave sscanf() the same string in two successive calls, it would just start parsing at the start of the
string each time. If you want to parse multiple values out of the same string, you can do it all at once, like
sscanf( str, "%d%d", &x, &y );, or you could advance the pointer on successive calls to sscanf(), as in
sscanf( str + offset, "%d", &y );. If you need to do this, the %n conversion specifier we covered in lecture 9 can be helpful.
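For example, a command could be parsed in two steps, using %n to find where the second call should start. This is only a sketch: parseYearCommand() is a hypothetical helper, and you're free to structure your parsing differently.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether cmd is a valid year command and,
 * if so, store its two parameters in *low and *high.
 */
bool parseYearCommand( char const *cmd, int *low, int *high )
{
  char keyword[ 9 ];   /* room for the longest keyword, "database" */
  int offset = 0;

  /* Read the keyword; %n stores how many characters were consumed. */
  if ( sscanf( cmd, "%8s%n", keyword, &offset ) != 1 ||
       strcmp( keyword, "year" ) != 0 )
    return false;

  /* Resume parsing just past the keyword to get the two integers. */
  if ( sscanf( cmd + offset, "%d%d", low, high ) != 2 )
    return false;

  /* The range is only valid if low doesn't exceed high. */
  return *low <= *high;
}
```

Because %s and %d both skip leading whitespace, this handles extra spaces before the keyword and between parameters, and it ignores anything left on the line after the second integer.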
Selecting Movies to Report
The listDatabase() function can be used to print any selected movies from the database, so it can be used by the four
functions listAll(), listYear(), listTitle(), and listGenre() to print out the needed subset of the database. How
does it know which movies to report? Internally, it will call the provided test function for each Movie in the database. If the test
function returns true, listDatabase() should print that Movie; otherwise, it shouldn't. This lets client code use a single interface to
print any subset of the Movies. The client code just needs to provide a pointer to a function listDatabase() can use to decide what
to print and what not to print. For example, to perform the database command, you can pass in a pointer to a test function that
always returns true. To print Movies in a range of years, you can pass in a pointer to a function that checks the year and returns
true if the Movie's year is in the range. To do this, we'll need to use the data parameter to listDatabase().
Notice that listDatabase() takes a void pointer data parameter, and the test function also takes a void pointer data parameter.
This parameter is a mechanism for providing extra information the test function needs in order to do its job, like the range of
years needed for the year command. When you call listDatabase(), you can pass in a pointer to anything you want as the data
parameter (even NULL, if you don't need this parameter). The listDatabase() function should remember this parameter and will
give this same pointer to the test function every time it calls it. This gives you a way to supply a pointer to anything your test
function needs to answer the question, "Should we print this Movie?". Inside each of your test functions, you will just need to
convert the data value from a void pointer back to whatever it really points to before it can use it. This is like how the
comparison function used by qsort has to convert its void pointer parameters to a more specific type before it can use them (but
the type it's converting to will be different here).
For example, to implement the year 1990 1999 command, our test function needs to know two values, 1990 and 1999, so it can
return true for movies with a year in this range. That's what the data parameter is for. You can put 1990 and 1999 in a little
struct, then pass the address of that struct as the last parameter to listDatabase(). You'll also give listDatabase() the address of
a test function that knows how to check the year of a movie against the range of values you provided via the data parameter.
Each time listDatabase() calls your test function, it will pass it a copy of the data pointer you provided (i.e., a pointer to a little
struct in which you stored the range, 1990 and 1999). Inside your test, you can cast the void pointer, data, back to a pointer to
your little struct, then use it to decide if the given Movie should be printed. Or, if you don't want to define a little struct to hold
the minimum and maximum years, you could use a little 2-element array, where you store the minimum and maximum years in
elements of the array.
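The year-range idea above can be sketched as follows. This is only an illustration under stated assumptions: the trimmed Movie struct, the YearRange struct, and the names inYearRange(), always() and listMatching() are hypothetical stand-ins (listMatching() plays the role of listDatabase(), whose real signature and output format are defined elsewhere in this assignment).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for the project's Movie struct; field names are assumptions. */
typedef struct {
  int id;
  char title[ 40 ];
  int year;
  int length;
} Movie;

/* Hypothetical "little struct" holding the range for the year command. */
typedef struct {
  int min;
  int max;
} YearRange;

/* Test function for the year command: data really points to a YearRange. */
bool inYearRange( Movie const *movie, void const *data )
{
  assert( data != NULL );
  YearRange const *range = (YearRange const *) data;
  return movie->year >= range->min && movie->year <= range->max;
}

/* Test function for the database command: ignores data and accepts every movie. */
bool always( Movie const *movie, void const *data )
{
  (void) movie;
  (void) data;
  return true;
}

/* Stand-in for listDatabase(): hands the same data pointer to every call of
   test, prints the movies it accepts and returns how many were printed.
   The output format here is NOT the one the assignment requires. */
int listMatching( Movie const *list, int count,
                  bool (*test)( Movie const *movie, void const *data ),
                  void const *data )
{
  int printed = 0;
  for ( int i = 0; i < count; i++ )
    if ( test( &list[ i ], data ) ) {
      printf( "%d %s %d %d\n", list[ i ].id, list[ i ].title,
              list[ i ].year, list[ i ].length );
      printed++;
    }
  return printed;
}
```

With a variable such as YearRange range = { 1990, 1999 }, a call like listMatching( movies, count, inYearRange, &range ) would print only the movies from that range, while listMatching( movies, count, always, NULL ) would print all of them.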
For the title command as well as the extra credit genre command, you can use a similar technique to implement the
commands, except your test function will need a data value that tells it what the search string is (rather than a minimum and
maximum value for the year range).
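For a string search, the data pointer can simply point to the search string itself. The sketch below assumes a trimmed, hypothetical Movie struct and a hypothetical test function named titleContains(). It uses a plain case-sensitive strstr() match, which may or may not be the matching rule the title command requires, so check the command's description before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for the project's Movie struct; field names are assumptions. */
typedef struct {
  int id;
  char title[ 40 ];
  int year;
  int length;
} Movie;

/* Test function for a title search: data really points to a
   null-terminated search string. */
bool titleContains( Movie const *movie, void const *data )
{
  assert( data != NULL );
  char const *needle = (char const *) data;
  /* strstr() returns NULL when needle does not occur in the title. */
  return strstr( movie->title, needle ) != NULL;
}
```

Here the caller would pass the user's search string as the data argument, the same way the address of the year-range struct is passed for the year command.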
This is a common trick in C, using a void pointer to pass any information you need through general-purpose code and eventually
back into code written for a specific purpose. When you take the operating systems class, you'll see a similar technique in the
POSIX threads API, to pass arbitrary data to a new thread you're creating or to get results of an arbitrary type from the thread
when it's done.
Build Automation
You get to implement your own Makefile for this project (called Makefile with a capital 'M', no filename extension). Its default
target should build your program, compiling each source file to an object file and then linking the objects together into an
executable.
As usual, your Makefile should correctly describe the project's dependencies, so targets can be rebuilt selectively, based on what
source files have changed. It should also have a clean rule, to let the user easily delete any temporary files or target files that
can be rebuilt the next time make is run (e.g., the object files, the executable and any temporary output files).
In addition to the "-Wall" and "-std=c99" options that we normally include when we're compiling, be sure to include the "-g" flag.
This will be useful when you try to use gdb or valgrind to help debug the program.
Testing
The starter includes a test script, along with test input files and expected outputs. When we grade your program, we'll test it with
this script, along with a few other test inputs we're not giving you. To run the automated test script, you should be able to enter
the following:
$ chmod +x test.sh # probably just need to do this once
$ ./test.sh
This will automatically build your program (using your Makefile) and see how it does against all the tests.
2021/3/23 CSC230 Project 4
https://www.csc2.ncsu.edu/courses/csc230/proj/p4/p4.html 9/12
As you develop your program, you'll want to try it out with user input and see what it's doing on individual test cases. Until your
Makefile is working, you should be able to execute the compiler directly with the following command (although this isn't as
efficient as how your Makefile builds).
$ gcc -g -Wall -std=c99 movies.c database.c input.c -o movies
To try out your program on one of the provided test cases, you can run it as follows. Here, we're saving the program's standard
output and standard error to two different files. All of the tests have an expected output file for standard output, and some of
them (the ones that exit unsuccessfully) also have an expected output on standard error.
Here, we're running test 12 by hand. After running the program, we check its exit status to make sure it ran successfully (it
should for this test). Then we check the output file to make sure we got what was expected. This particular test shouldn't print
any output to the standard error stream.
$ ./movies list-d.txt < input-12.txt > output.txt 2> stderr.txt
$ echo $?
0
$ diff output.txt expected-12.txt
$ cat stderr.txt
Memory Errors and Leaks
On any test that runs successfully, your program is expected to free all of the dynamically allocated memory it allocates and
close any files it opens. Although it's not part of an automated test, we encourage you to try out your executable with valgrind.
We certainly will when we're grading your work. Any leaked memory, use of uninitialized memory, access to memory outside the
range of an allocated block or leaked files will cost you some points. Valgrind can help you find these errors before we do.
Your program is not expected to free memory or close all its files on error tests, those tests where your program is supposed to
exit unsuccessfully. On these test cases, your program may need to exit from inside some function, where you don't have access
to all the memory you've allocated or all the files you've opened. In these cases, your program can just exit right away, without
having to free all resources.
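As a rough sketch of the two cases described above, the code below frees everything on the successful path and simply exits when an allocation fails. The trimmed Movie and Database structs and the function names makeDatabase(), addMovie() and freeDatabase() are assumptions made for this illustration. Array resizing is left out to keep it short, and any file you open would need a matching fclose() in the same spirit.

```c
#include <assert.h>
#include <stdlib.h>

/* Trimmed stand-in for the project's Movie struct (an assumption). */
typedef struct {
  int id;
  int year;
} Movie;

/* Hypothetical database: an array of pointers to heap-allocated movies. */
typedef struct {
  Movie **list;
  int count;
  int capacity;
} Database;

Database *makeDatabase( int capacity )
{
  assert( capacity > 0 );
  Database *dat = malloc( sizeof( Database ) );
  if ( dat == NULL )
    exit( EXIT_FAILURE );   /* error path: exit right away, no cleanup */
  dat->list = malloc( capacity * sizeof( Movie * ) );
  if ( dat->list == NULL )
    exit( EXIT_FAILURE );
  dat->count = 0;
  dat->capacity = capacity;
  return dat;
}

void addMovie( Database *dat, int id, int year )
{
  assert( dat->count < dat->capacity );   /* resizing omitted in this sketch */
  Movie *m = malloc( sizeof( Movie ) );
  if ( m == NULL )
    exit( EXIT_FAILURE );
  m->id = id;
  m->year = year;
  dat->list[ dat->count++ ] = m;
}

/* Successful path: free each movie, then the array, then the struct itself. */
void freeDatabase( Database *dat )
{
  for ( int i = 0; i < dat->count; i++ )
    free( dat->list[ i ] );
  free( dat->list );
  free( dat );
}
```

Every malloc() on the successful path has exactly one matching free() here, which is the property valgrind's leak check looks for.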
The compile instructions above include the -g flag when building your program. This will help valgrind give more useful reports of
where it sees errors. To get valgrind to check for memory errors, including leaks, you can run your program like the following.
This example runs the program on input 12, a test input that uses several of the supported commands. You can use similar
commands to try your program on any input using valgrind.
$ valgrind --tool=memcheck --leak-check=full ./movies list-d.txt < input-12.txt
-valgrind output omitted-
If you want to run valgrind, you have to run it on each test case individually. You can't run the test script inside valgrind (well,
you can, but then you will be getting valgrind output for the shell, rather than for the program you wrote). I've seen students try
to run something like "valgrind test.sh". Just be aware that this won't work. You have to run valgrind like the example above.
Test Cases
The tests for your program use the following movie list files. These contain various-sized lists of movies, along with some invalid
movie lists for testing error handling.
1. list-a.txt : This list contains just one movie.
2. list-b.txt : This list contains 5 movies. This shouldn't require resizing the array of movies maintained by your Database, and
they're already sorted by ID number.
3. list-c.txt : This list contains 28 movies, sorted by ID. With 28 movies, this list should force the resizable array in the
Database to resize three times while reading the list.
4. list-d.txt : This list contains 11 movies that aren't sorted by ID, so you will need to be able to sort them to report the
database in the right order.
5. list-e.txt : This is a larger list, with 10,000 movies.
6. list-f.txt : This list is like list-d.txt, except it has a movie with an ID of 84690, a duplicate of the ID of a movie in list-c.txt.
7. list-g.txt : This list is just like list-d.txt, except one of the lines is missing some fields.
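The resize counts mentioned for list-b.txt and list-c.txt are consistent with an array that starts with room for 5 elements and doubles its capacity whenever it fills up (5, 10, 20, 40). That growth policy is an inference for this sketch, not something stated in this section, and the IntList type and function names below are hypothetical. The real Database would store Movie pointers rather than ints.

```c
#include <assert.h>
#include <stdlib.h>

#define INITIAL_CAPACITY 5   /* assumed starting capacity */

/* Hypothetical resizable array that also counts how often it has grown. */
typedef struct {
  int *items;
  int count;
  int capacity;
  int resizes;
} IntList;

void initList( IntList *list )
{
  list->items = malloc( INITIAL_CAPACITY * sizeof( int ) );
  if ( list->items == NULL )
    exit( EXIT_FAILURE );
  list->count = 0;
  list->capacity = INITIAL_CAPACITY;
  list->resizes = 0;
}

void append( IntList *list, int value )
{
  if ( list->count >= list->capacity ) {
    /* Full: double the capacity before storing the new value. */
    int *bigger = realloc( list->items, 2 * list->capacity * sizeof( int ) );
    if ( bigger == NULL )
      exit( EXIT_FAILURE );
    list->items = bigger;
    list->capacity *= 2;
    list->resizes++;
  }
  list->items[ list->count++ ] = value;
}

/* Report how many times the array grows while n values are appended. */
int resizesNeeded( int n )
{
  assert( n >= 0 );
  IntList list;
  initList( &list );
  for ( int i = 0; i < n; i++ )
    append( &list, i );
  int result = list.resizes;
  free( list.items );
  return result;
}
```

Under these assumptions, resizesNeeded( 5 ) is 0 and resizesNeeded( 28 ) is 3, matching the descriptions of list-b.txt and list-c.txt.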
We've prepared 19 tests for your program. Using the movie lists above, they exercise the various commands your program is
supposed to support, working from the easier ones to the more difficult tests and error cases. In developing your program, you
may want to follow the order of these tests, adding the code to support the first test, then working toward the later (more
complex) tests as you get the earlier ones working.
1. This test reads the list-a.txt file, but it runs the quit command immediately.
2. This test reads the list-b.txt file. It runs the database command, then input ends at end-of-file rather than with
a quit command (which should be OK).
3. This test reads the list-c.txt file, which should cause your database to have to resize its list of movie pointers.
4. This test reads the list-d.txt file and runs the database command, which will require your program to sort movies by ID.
5. This test reads 3 movie lists given on the command line, list-a.txt, list-b.txt and list-c.txt.
6. This test reads movie lists list-c.txt and list-d.txt, then uses the year command to list movies with a year between 1995 and
2010.
7. This test reads movie lists list-c.txt and list-d.txt, then uses the year command to list movies with a year between 1960 and
1970. None of the movies in these lists have a year in that range.
8. This test reads movie lists list-c.txt and list-d.txt, then uses the title command to list movies whose title includes the string
"in".
9. This test reads movie list list-d.txt. It adds a couple of movies to the watch list, showing the list before and after.
10. This test reads movie lists list-a.txt, list-b.txt, list-c.txt and list-d.txt. It adds all the movies in all these lists to the watch
list. That should require the array in the watch list to resize more than once.
11. This test is like the previous one, but, after adding all the movies to the watch list, it removes 10 of them.
12. This test reads movie list list-d.txt. It runs the same commands as the example shown at the start of the project
description.
13. This is a large test. It loads 10,000 movies from list-e.txt. It runs the database and list commands (with various parameters).
Then it randomly adds and removes some movies from the watch list, showing the list periodically.
14. This is a test for error handling. The command line arguments ask the program to open a movie list file that doesn't exist.
15. This is a test for error handling. The program is run with invalid command-line arguments (no arguments).
16. This is a test for error handling. The standard input includes two invalid commands.
17. This is a test for error handling. It tries to add the same movie to the watch list more than once, and it tries to remove a
movie from the watch list that isn't there.
18. This is a test for error handling. It reads list-c.txt and list-f.txt at startup, giving it two movies with the same ID.
19. This is a test for error handling. It reads list-g.txt, which includes a bad movie description.
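Several of these tests depend on reporting movies in ID order even when the input files are unsorted. One possible way to do that is qsort() over the array of Movie pointers, as sketched below. The trimmed Movie struct and the names compareById() and sortById() are assumptions for this illustration. The detail worth noticing is that qsort() passes the comparison function pointers to the array elements, so with an array of Movie pointers each parameter is really a pointer to a Movie pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Trimmed stand-in for the project's Movie struct (an assumption). */
typedef struct {
  int id;
  int year;
} Movie;

/* Comparison function for qsort(): a and b point to elements of the array,
   and each element is itself a pointer to a Movie. */
int compareById( void const *a, void const *b )
{
  Movie const *ma = *(Movie * const *) a;
  Movie const *mb = *(Movie * const *) b;
  if ( ma->id < mb->id )
    return -1;
  if ( ma->id > mb->id )
    return 1;
  return 0;
}

/* Sort an array of count Movie pointers into increasing ID order. */
void sortById( Movie **list, int count )
{
  assert( list != NULL && count >= 0 );
  qsort( list, count, sizeof( Movie * ), compareById );
}
```

Comparing with < and > rather than returning ma->id - mb->id avoids any risk of integer overflow in the subtraction.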
Extra Credit Tests
If you do the extra credit part of the assignment, we're providing a couple of test cases you can use to check your program's
behavior. The extra credit inputs are input-ec-1.txt and input-ec-2.txt. For each of these, we're also providing an expected
output file. You can try them using the following commands to see if your program is working for the extra credit part of the
assignment.
$ ./movies list-d.txt < input-ec-1.txt > output.txt
$ echo $?
0
$ diff output.txt expected-ec-1.txt
$ ./movies list-d.txt < input-ec-2.txt > output.txt
$ echo $?
0
$ diff output.txt expected-ec-2.txt
Grading
The grade for your program will depend mostly on how well it functions. You also get to provide your own Makefile, and we'll
expect your program to compile cleanly, to follow the style guide and to adhere to the expected design. Completing the extra
credit can earn you some additional points.
Compiling cleanly on the common platform: 10 points
Working Makefile: 5 points
Behaves correctly on all tests: 80 points
Program follows the style guide: 20 points
Support for genres and the genre command: 8 extra credit points
Deductions
Up to -60 percent for not following the required design.
Up to -30 percent for failing to submit required files, submitting files with the wrong name or having extraneous files
in the repo.
Up to -30 percent for exhibiting file leaks, memory leaks or other memory errors.
-20 percent penalty for late submission.
Getting Started
To get started on this project, you'll need to clone your NCSU github repo and unpack the given starter into the p4 directory of
your repo. You'll submit by committing files to your repo and pushing the changes back up to the NCSU github.
Clone your Repository
You should have already cloned your assigned NCSU github repo when you were working on project 2. If you haven't already
done this, go back to the assignment for project 2 and follow the instructions for cloning your repo.
Unpack the starter into your cloned repo
We had to make a correction to the starter after the project was released. There were some non-ASCII characters in some of the
database entries. This has been corrected in the starter files. If you've already started working on the project, you may want to
download the project 4 update, update4.tgz. It just contains the three files that we had to change. You can also download the
update archive using a curl command like the one below; just change starter4.tgz to update4.tgz at the end.
You will need to copy and unpack the project 4 starter. We're providing this file as a compressed tar archive, starter4.tgz. You
can get a copy of the starter by using the link in this document, or you can use the following curl command to download it from
your shell prompt.
$ curl -O https://www.csc2.ncsu.edu/courses/csc230/proj/p4/starter4.tgz
Temporarily, put your copy of the starter in the p4 directory of your cloned repo. Then, you should be able to unpack it with the
following command:
$ tar xzvpf starter4.tgz
Once you start working on the project, be sure you don't accidentally commit the starter to your repo. After you've successfully
unpacked it, you may want to delete the starter from your p4 directory, or move it out of your repo.
$ rm starter4.tgz
Instructions for Submission
If you've set up your repository properly, pushing your changes to your assigned CSC230 repository should be all that's required
for submission. When you're done, we're expecting your repo to contain the following files. You can use the web interface on
github.ncsu.edu to confirm that the right versions of all your files made it.
movies.c : source file for the top-level component, written by you.
database.c : Implementation file for the database component, written by you.
database.h : Header for the database component, written by you.
input.c : Implementation file for the input component, written by you.
input.h : Header for the input component, written by you.
Makefile : the project's Makefile, written by you.
list-*.txt : movie lists for the tests, provided with the starter.
input-*.txt : test inputs given to the program on standard input, provided with the starter.
expected-*.txt : expected output from the program, provided with the starter.
estderr-*.txt : expected standard error output for some test cases, provided with the starter.
test.sh : test script, provided with the starter.
.gitignore : a file provided with the starter, to tell git not to track temporary files specific to this project.
Pushing your Changes
To submit your project, you'll need to commit your changes to your cloned repo, then push them to the NCSU github. Project 2
has more detailed instructions for doing this, but I've also summarized them here.
Whenever you create a new file that needs to go into your repo, you need to stage it for the next commit using the add
command:
$ git add some-new-file
Then, before you commit, it's a good idea to check to make sure your index has the right files staged:
$ git status
Once you've added any new files, you can use a command like the following to commit them, along with any changes to files that
were already being tracked:
$ git commit -am "<meaningful message for future self>"
Of course, you haven't really submitted anything until you push your changes up to the NCSU github:
$ git push
Checking Jenkins Feedback
Checking Jenkins feedback is similar to previous projects. Visit our Jenkins system at http://go.ncsu.edu/jenkins-csc230 and
you'll see a new build job for project 4. This job polls your repo periodically for changes and rebuilds and tests your project
automatically whenever it sees a change.
Learning Outcomes
The syllabus lists a number of learning outcomes for this course. This assignment is intended to support several of these:
Write small to medium C programs having several separately-compiled modules
Explain what happens to a program during preprocessing, lexical analysis, parsing, code generation, code optimization,
linking, and execution, and identify errors that occur during each phase. In particular, they will be able to describe the
differences in this process between C and Java.
Correctly identify error messages and warnings from the preprocessor, compiler, and linker, and avoid them.
Find and eliminate runtime errors using a combination of logic, language understanding, trace printout, and gdb or a similar
command-line debugger.
Interpret and explain data types, conversions between data types, and the possibility of overflow and underflow
Explain, inspect, and implement programs using structures such as enumerated types, unions, and constants and
arithmetic, logical, relational, assignment, and bitwise operators.
Trace and reason about variables and their scope in a single function, across multiple functions, and across multiple
modules.
Allocate and deallocate memory in C programs while avoiding memory leaks and dangling pointers. In particular, they will
be able to implement dynamic arrays and singly-linked lists using allocated memory.
Use the C preprocessor to control tracing of programs, compilation for different systems, and write simple macros.
Write, debug, and modify programs using library utilities, including, but not limited to assert, the math library, the string
library, random number generation, variable number of parameters, standard I/O, and file I/O.
Use simple command-line tools to design, document, debug, and maintain their programs.
Use an automatic packaging tool, such as make or ant, to distribute and maintain software that has multiple compilation
units.
Use version control tools, such as subversion (svn) or git, to track changes and do parallel development of software.
Distinguish key elements of the syntax (what's legal), semantics (what does it do), and pragmatics (how is it used) of a
programming language.
Describe and demonstrate how to avoid the implications of common programming errors that lead to security vulnerabilities,
such as buffer overflows and injection attacks.