Thread Lab: Manipulating a Shared Buffer
1 Overview
In this lab, you will implement a circular buffer, or ring buffer, that enables sharing data between a writer
(producer) and a reader (consumer). There will be a bounded buffer with fixed-length slots shared by the
producer and consumer threads, as shown in the figure below.
Typically, a producer does some amount of work to produce an item to place in the buffer. Similarly,
a consumer would work on an item taken from the buffer. In your program, you will use “sleeping” to
simulate the time delay incurred when producing and consuming items.
This shared data structure is also quite useful to store log data, debugging data, or intermediate results in
a concurrent computation. For example, Linux’s dmesg communicates logging information in the Linux
kernel using a ring/circular buffer.
2 Logistics
This is an individual project. All submissions are electronic. We strongly advise you to test your code on
the CS Linux machine before submitting it.
Before you begin, please take the time to review the course policy on academic integrity at the course
web site.
3 Downloading the assignment
As usual, the lab is available by downloading a zip file to your Linux machine as follows:
$ wget https://bit.ly/3kqkxed -O threadlab-handout.zip
1
Unpacking the file with unzip threadlab-handout.zip will create a subdirectory named “threadlabhandout”
containing the Makefile, skeleton code, and test files.
4 Programming Task
You are to write your solution in sharedbuf.c. There will be two threads: a producer and a consumer.
The producer will read information from standard input (see Section 6) and place it into the shared buffer.
The consumer will extract the data from the buffer and perform certain operations.
The textbook contains a reference implementation of a producer/consumer program using semaphores (see
Section 12.5.4 of CSAPP:3e textbook). Your task is to implement a shared buffer without semaphores.
Instead, you will use POSIX threads (pthreads) mutexes and condition variables. In so doing, you will
learn some of the basics of multi-threaded programming and synchronization.
5 Programming Rules
• Your program MUST be implemented using POSIX threads (pthreads).
• You may NOT use semaphores of any type to implement your solution. This includes implementing
a semaphore construct yourself by building on more primitive thread constructs.
• You may NOT implement a solution that uses any type of polling, regardless of whether or not the
polling wastes the CPU. (In other words, your implementation cannot repeatedly check whether the
buffer is full or empty, then “wait a while” before checking again. If your producer is capable of seeing
two buffer-full conditions in a row without inserting anything in the buffer, or if your consumer can
see two buffer-empty conditions without removing anything, you have implemented polling and must
come up with a different solution.)
• The main program we have provided creates a thread that runs the consumer and then calls the producer’s
procedure directly; this effectively makes the main thread the producer thread. After the
consumer terminates, the main thread collects it with pthread_join and exit. (In an alternate
design, the main program could create and collect two new threads.)
• All library and system calls should be error-checked. If an error occurs, print an informative message
and terminate the program.
6 Provided Code
We have provided a Makefile and starter code for you; you should familiarize yourself with the starter
code and especially the sections marked with TODO comments, which are the places where you will need to
make changes.
The producer and consumer will communicate through a shared buffer that has 10 slots (the size is set by a
#define so that it is easy to change). Each data in the buffer has the following structure:
2
struct data {
int value; /* Value to be passed to consumer */
int consumer_sleep; /* Time (in ms) for consumer to sleep */
int line; /* Line number in input file */
int print_code; /* Output code; see below */
int quit; /* Nonzero if consumer should exit ("value" ignored) */
}
These fields have the following purposes:
value The actual data to be passed to the consumer; in our implementation the consumer will sum the values
passed in.
consumer_sleep A time (expressed in milliseconds) that the consumer will expend in consuming the
buffer entry.
line The line number in the input file that this data came from. Line numbers start at 1, not zero.
print_code A code indicating whether the consumer or producer should print a status report after consuming
or producing this line.
quit For all buffer entries except the last, this value should be zero. For the last entry, it should be nonzero.
The consumer should not look at any of the other fields in the data struct if quit is nonzero.
Besides the shared buffer itself, you will need a number of auxiliary variables to keep track of the buffer status. These
might include things such as the index of the next slot to be filled or emptied. You will also need some pthread
conditions and mutexes. The exact configuration is up to you, but you should make sure that the entire shared buffer
can be used.
6.1 Compiling and Testing
We have provided a sample Makefile that will help compile your program using the make command. Pay attention
to compiler warnings—and fix them!
To test your program, compile and run it with the standard input redirected from a test file and, perhaps, the standard
output redirected as well. Here is an example:
$ make
$ ./sharedbuf < tests/test-input-1.txt > my-output-1.txt
6.2 Input Files
The handout includes five test files for your experiments. Because of indeterminacies in the system, a file may produce
different results from run to run. Nevertheless, the total sum should be 42 in all cases. The input files are:
test-input-0.txt A small test case with no sleeping. Note that because of indeterminacy in the system
scheduler, this test file may produce different results from run to run. However, only this and test-input-4.txt
will ensure that you are interpreting print_code correctly.
3
test-input-1.txt The test case from test-input-0.txt, with 1-second sleeps for the producer
and no sleeping in the consumer. We recommend that you begin testing with this file, because it generates
results that are easy to interpret. Also, be sure that it produces exactly one pair of output lines per second.
If the output comes along too quickly, or if your program appears to hang, you may be calling nanosleep
incorrectly. (Hint: if you run this command:
$ time ./sharedbuf < test-input-1.txt
it should report a “real” time of about 25 seconds.)
test-input-2.txt The test case from test-input-0.txt, with 1-second sleeps for the consumer
and no sleeping in the producer. This file tests your ability to deal with situations where the producer runs far
ahead of the consumer, so that the buffer is always full.
test-input-3.txt A test case with randomly generated sleep times. At times, the producer will run
ahead; at other times the consumer will catch up.
test-input-4.txt Another test case with randomly generated sleep times, and also with randomly
generated print_codes.
6.3 The Producer
The basic task of the producer is to read one line at a time from the standard input. For each line, it will first sleep
for a time given in the line, then pass the data to the consumer via the shared buffer. Finally, after the data has been
placed in the shared buffer, the producer will optionally print a status message. Printing is slow, so the producer must
not hold any mutexes while it is printing.
Each input line consists of four numbers, separated by spaces, as follows:
• The value to be passed to the consumer.
• An amount of time the producer should sleep, given in milliseconds. Note that the sleep must be done before
placing data in the shared buffer.
• An amount of time the consumer should sleep, given in milliseconds.
• A “print code” integer that indicates what sorts of status lines should be printed.
The producer reads these four numbers using the C library function scanf (see man scanf for more information.
Feel free to do a Google search for examples of using scanf). When scanf fails, either by returning the value EOF
or by returning fewer the four values, the producer should enter one more data in the shared buffer, without sleeping
first. The data should contain a nonzero quit field; the other fields will be ignored by the consumer.
The “print codes” are interpreted as follows:
0 No messages are printed for this input line.
1 The producer generates a status message.
2 The consumer generates a status message.
3 Both the producer and consumer generate status messages.
4
The producer’s status message should be generated after the data has been passed to the consumer via the shared
buffer. It must print the value and line number from the input line, and be produced by calling printf with exactly
the following format argument:
"Produced %d from input line %d\n"
6.4 The Consumer
The consumer waits for messages to appear in the buffer, extracts them, and then processes them (the “processing”
just involves adding the value to a running sum). Note that the consumer does not act on the data until after it has been
removed from the buffer, so that the producer can continue to work while the consumer is processing the data.
If the extracted data has a nonzero quit field, the consumer does not add the value to the sum, but simply prints the
total it has calculated, using the following printf format:
"Final sum is %d\n"
It then terminates its thread without sleeping.
Otherwise, when quit is zero, the consumer then sleeps for the specified time and then adds the value field to a
running total (initialized to zero), and finally optionally prints a status message if the print_code is 2 or 3. The
status message must be generated by calling printf with exactly the following format argument:
"Consumed %d from input line %d; sum = %d\n"
7 Hints
Below you can find some hints and suggestions for working on this assignment.
7.1 Man pages
You will need to use a number of Unix system and C library calls. You can read the documentation on these calls by
using man. For example, to learn about pthread_mutex_lock, type man pthread mutex lock. (Sometimes
you need to specify the section of the manual you want, for example with man 3 printf, but usually that’s
not needed. The calls you will need to use are all documented in sections 2 and 3 of the manual.) If some manual
pages are not available on your machine, you can usually find them online using a Google search.
You should try to develop familiarity with the style of Unix manual pages. For example, many man pages have a “SEE
ALSO” section at the bottom, which will lead you to useful related information.
NOTE: Some of the pthread manual pages are taken directly from the POSIX standard, which is written in a style
that no rational human can be expected to understand. If you encounter a manual page that makes heavy use of the
word “shall” and is otherwise incomprehensible, we suggest that you search the Internet for a clearer discussion. For
example, you may consider searching for “pthread cond wait example”.
7.2 Standard Output
To make sure you get the best grade even if there are bugs in your solution, we suggest that you include the following
line at the top of your main function:
5
setlinebuf(stdout);
Doing so will ensure that when your program is run with standard output redirected to a file, any partial output will
appear even if your program hangs. Speaking of that, you should test your program with redirected output; that
changes the timing and reveals some bugs that won’t appear if you only test with output to the terminal.
7.3 Pthreads Constructs
You will need to familiarize yourself with the following pthreads functions, at a minimum:
• pthread_create
• pthread_join
• pthread_mutex_lock
• pthread_mutex_unlock
• pthread_cond_wait
• pthread_cond_signal
You may choose to use other functions as well. Remember that you are NOT allowed to use the pthreads semaphore
functions (sem_*).
7.4 Sleeping
For historical reasons, there are many ways to get a thread to go to sleep for a specified time period. The preferred
method is nanosleep; see “man 2 nanosleep” for documentation. Note that if the sleep time exceeds 999 milliseconds,
you cannot simply convert milliseconds to nanoseconds because nanosleep requires that the nanoseconds
field be less than 10. Also note that you will not need the second argument to nanosleep; you can set it to NULL.
We have provided a wrapper function named thread_sleep, which accepts an argument in milliseconds and converts
it into a correct nanosleep call. However, note that as given, the wrapper function DOES NOT WORK
CORRECTLY: it always sleeps for 0.25 seconds. You will have to modify it to calculate a correct value. Note that if
the specified sleep time is zero, the wrapper function does not call nanosleep.
8 Grading
The lab is graded based on your program’s behavior when run against the five provided test inputs. For each input,
there are 10 points available for matching the expected sum. In addition, for test-input-1.txt and
test-input-2.txt, there are an additional 10 points if your output exactly matches our sample output for those
two cases. (The other three test cases have nondeterministic output, so we don’t expect an exact match there.)
WARNING: When we test your code, we will run some background processes in another window to encourage the
OS to schedule your threads in a less deterministic order. We suggest that you do the same so that some of your race
conditions can be uncovered.
6
9 Sample Output
The following is the result of running our sample solution on the test case test-input-4.txt (note that your
interleaving of “Produced” and “Consumed” may differ since this test input has randomness built in):
Produced -8 from input line 2
Consumed 3 from input line 1; sum = 3
Produced 1 from input line 3
Produced 10 from input line 4
Consumed 1 from input line 3; sum = -4
Consumed 4 from input line 5; sum = 10
Produced 0 from input line 6
Consumed 0 from input line 6; sum = 10
Produced -1 from input line 8
Consumed -1 from input line 8; sum = 3
Consumed 8 from input line 9; sum = 11
Consumed 5 from input line 12; sum = 20
Produced 10 from input line 14
Consumed 1 from input line 15; sum = 40
Produced 10 from input line 16
Produced 5 from input line 17
Produced -2 from input line 20
Produced 1 from input line 21
Consumed -2 from input line 20; sum = 48
Consumed 9 from input line 23; sum = 53
Consumed 3 from input line 24; sum = 56
Produced 6 from input line 26
Consumed 6 from input line 26; sum = 55
Produced -3 from input line 27
Produced -8 from input line 32
Consumed -4 from input line 30; sum = 47
Consumed -7 from input line 31; sum = 40
Consumed -8 from input line 32; sum = 32
Consumed -4 from input line 34; sum = 34
Produced -7 from input line 36
Produced -1 from input line 39
Consumed 3 from input line 40; sum = 45
Consumed 1 from input line 41; sum = 46
Produced 10 from input line 42
Produced 0 from input line 43
Consumed -2 from input line 44; sum = 54
Produced -8 from input line 46
Consumed -8 from input line 46; sum = 39
Produced 1 from input line 49
Consumed -8 from input line 47; sum = 31
Consumed -1 from input line 48; sum = 30
Consumed 1 from input line 49; sum = 31
Consumed 11 from input line 50; sum = 42
Final sum is 42
7
10 Submitting Your Work
You need to submit your sharedbuf.c file using Gradescope. You may resubmit your work until the deadline, with
your most recent submission counting for credit.
11 Additional Tips
Below is a more detailed description of the major functionalities that need to be implemented. You should attack these
in the following order:
Shared Buffer The shared buffer is a data structure that holds data that is transferred between the Producer and the
Consumer. The buffer is a “shared” medium that has only “one” copy. This means there is only copy that
comes into existence at the start of execution. This copy is shared by all the routines. The other aspect of the
buffer is the fact that there are two indexes. The input index points to the next cell to be filled; The output index
points to the next cell to be emptied. Using modulo arithmetic, you need to keep track of these pointers. You
also need to know when the buffer is empty or full.
Producer The Producer is a function that reads from the input file and stores data in the buffer. A good way to build
the shared buffer lab is to first write a Producer like function that reads the input file and fills in the shared
buffer. You might have this function loop reading input, loading the next shared buffer position, and printing
the whole shared buffer out. You could stop at full or you could go past full, testing the modulo arithmetic
applied to the index. Once you have this function, you know that the shared buffer works and that you can load
and print the buffer. Exactly how to read the last input message is tricky. You should think about how this needs
to work before worrying about threads.
Consumer The Consumer is a function that reads each entry in the shared buffer and prints out the value. Creating
this function after the Producer lets you test your ability to access the shared buffer. To test the Consumer,
you could run the Producer for X entries, and then call the Consumer. Given that from the above steps you
have built a static shared buffer, a Producer that fills the shared buffer, and a Consumer that empties the shared
buffer, the movement to threads is the next step (really not that difficult). Besides creating threads, you need to
understand functions that provide mutual exclusion to the shared buffer, that wait for shared buffer conditions
(an empty position, a full position, etc.). The key is to read the man pages and/or Google. Thread creation is
straight forward. What is not is condition variables and checking for them. Make sure you understand that a
mutex needs to be set before checking for a condition.
Sleeping The Sleep function is another area of concern. The handout gives some details, but there may be issues
you need to resolve. First, your function should allocate a local copy of the appropriate structure, then pass a
pointer to that structure when calling nanosleep. Second, you do not need a malloc for the timespec
structure. Just allocate a local copy.
8
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。