联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> C/C++编程C/C++编程

日期:2021-03-15 11:24

New York University Abu Dhabi

CS-UH 3010

Programming Assignment 2

Due: March 8th, 2021

Preamble:

In this assignment, you will write a program that creates a hierarchy of processes using fork() and uses the

exec*() system call(s) to have nodes accomplish potentially diverse tasks.

The initial program along with all its created offsprings will collectively provide a mechanism that carries

out the sorting of a (arbitrarily-long) file of (say taxpayer) records based on a designated attribute. This is

to be accomplished in a divide-and-conquer style: a number of nodes (processes) are expected each to sort a

portion of the records in the provided file and another node (process) will perform the assembly of the final

result.

The nodes of the hierarchy that undertake the sorting of the records execute a specific user-provided program

(executable) whose objective is to produce the records in either ascending or descending order for the

designated attribute. Processes in the hierarchy may communicate among themselves using either signals

or pipes to pass information about events and/or results. It is highly desirable that the use of blocking pipes

be avoided as much as possible [Ker10, Del21].

Overall in this project, you will:

❼ create a hierarchy of processes by invoking fork() multiple times (as needed),

❼ allow the execution of different piece(s) of code by (some) nodes in the hierarchy by invoking exec*()

calls,

❼ use a number of useful system calls including fork(), exec*(), read(), write(), wait(), getpid(),

getppid(), pipe(), dup(), dup2(), poll(), select(), mkfifo(), etc.

Procedural Matters:

 Your program is to be written in C/C++ and must run on NYUAD’s Ubuntu server bled.abudhabi.nyu.edu.

 You have to first submit your project via newclasses.nyu.edu and subsequently, demonstrate your work.

 Dena Ahmed (daa4-AT+nyu.edu) will be responsible for answering questions as well as reviewing and

marking the assignment in coordination and collaboration with the instructor.

Project Description:

Figure 1 depicts a sample process hierarchy your program may generate. The overall goal of the hierarchy

is to create a sorted listing of all data-records based on a user-designated attribute.

Each data-record consists of a number of lexemes offering information for an individual taxpayer including

resident-id, first name, last name, number of dependents, income and postal code1

. Data-records are provided

to the your program as a data file in ASCII-text format.

In the process hierarchy, there are 4 types of nodes that carry out distinct jobs. More specifically:

1. root node: this is your program termed myhie. Although it functions as the anchor for the entire

hierarchy, more importantly, it orchestrates the collective sorting task. At first, it creates a single

coordinator node and the root passes to this node –named coord– the file containing the data-records

to be sorted, the (variable) number of k workers to be used for the work, the numeric attribute on

which sorting should be carried out (i.e., 0, 3 4 or 5), the executable program to be used for sorting

data-records and anything else it might be deemed necessary for the successful completion of the job

at hand.

The root node also receives SIGUSR1/SIGUSR2 signals dispatched by other processes in the hierarchy

and prints the numbers of the signals caught just before the root terminates its operation.

1

these 6 are also called attributes

sorter0 sorter1 sorter2 sorter3

sorter

k-1 sorter4

merger

root

coord

.........

USR1 USR1

USR2

USR1

USR1 USR1

USR1

Figure 1: The process hierarchy that accomplishes sorting

2. coordinator node: its purpose is to split up the sorting job among k sorters (or workers). The coord

passes to each sorter the name of the file, the “range” of data-records on which the worker is going to

work on, as well as the sorting domain (i.e., which attribute to use for sorting). More importantly, it

provides the name of the executable to be used to carry out the sorting. The “range” can be either

the same for all sorters or it can be randomly set for each sorter (provided that the sum of all ranges

is equal to the total number of data-records we attempt to sort).

3. sorting nodes: to achieve their goal, sorters deploy 2 programs of your choice. For instance if you selects

Heap-Sort (HS) and Bubble-Sort (BS) then odd-indexed workers use the HS and even-indexed workers

the BS program. You may elect to implement any 2 sorting algorithms that you find interesting that

can function as independent programs and for which you will have to write the code. For simplicity,

you can implement your 2 sorting programs assuming the sorting occurs only on valid numeric fields

only (i.e., not alphanumeric).

As indicated earlier, each sorter is furnished with a set of data-records (possibly designated in the

form of a file name/descriptor along with the range of records in the file) and any additional parameter

deemed required for its work.

Every sorter does pass the result of its labor to the merger node through a (named) pipe [Ker10, Lov07].

Sorters do pass on their timing statistics to the merger node and before they can complete their work

send a SIGUSR1 to the root.

4. merger node: this single node receives (or reads) ordered partial results from all workers through pipes

and creates the outcome of the entire sorting process. This node can be coord–created and as soon as

it generates its outcome to standard output, it

– prints out statistics that consisting of the time taken for each of the workers to complete their

individual work, and

– sends a signal –say a SIGUSR2– to the coord pointing out that the entire workflow is all but

completed. As soon as a signal SIGUSR2 is caught by coord, it signifies the end of the hierarchy

and the graceful release of any acquired resources.

The main benefit of the aforementioned approach is that sorting may proceed asynchronously for all individual

segments of data and thus, there is an opportunity to run matters faster. All the hierarchy processes (Figure 1)

are to run concurrently and progress should occur in an asynchronous manner.

2

Any time there is a need to create a new offspring(s) in the hierarchy a fork() has to be apparently involved.

If there is need to replace the address space of these new processes with other executables an exec*() of your

choice should be invoked along with the proper parameter list (that you have to come up with).

How your application should be invoked:

Your program could be invoked as follows:

./myhie -i InputFile -k NumOfWorkers -r -a AttributeNumber -o Order -s OutputFile

where ./myhie is the name of your (executable) program, InputFile the file name containing the datarecords,

NumOfWorkers is the number of sorters to be spawned, r is the flag that instructs the program to

have workers work on “random ranges” or not-equally-sized batches of data-records, AttributeNumber is a

valid numeric-id that designates the field on which sorting is to be carried out, Order is either ascending

(a) or descending (d), and OutFile is the file in which the outcome of myhie could be saved. There is no

pre-determined order with which the above flags can be entered in the command line.

Assume that data files consist of taxpayer records with each record (or line in a text file) featuring 5

attributes: resident-ID, first-name, last-name, income, number-of-dependents, and postal-code. The input

file consists of data-records (one-per-line) and its format is as follows:

222334444 Kenan Barbieri 3 120000 20742

333445555 Kathy McAllister 1 145000 11201

111223333 Dema Allaster 5 87000 20654

999883333 Lise Williams 2 91560 20742

....... ...

In the command-line and for this specific file the AttributeNumber may take value 0, 3, 4, or 5 designating

corresponding numeric fields in the data-records.

Finally, your program root) has to report:

– the time each sorter took to sort its batch of records, provide the length of time that the merger took

to complete its work and furnish the turnaround time required for the entire sorting task to complete,

– the number of SIGUSR1 and SIGUSR2 signals the root has caught (or “seen”) just before the the end

of its execution.

What you Need to Submit:

1. A directory that contains all your work including source, header, Makefile, a readme file, etc.

2. A short write-up about the design choices you have taken in order to design your program(s); 1-2 pages

in ASCII-text would be more than enough.

3. All the above should be submitted in the form of “flat” tar or zip file bearing your name (for instance

YaserFarhat-Proj2.tar).

4. Submit the above tar/zip-ball using NYUclasses.

Grading Scheme:

Miscellaneous & Noteworthy Points:

1. You have to use separate compilation in the development of your program.

2. If you decide to use C++ instead of plain C, you should not use STL/templates.

3. Although it is understood that you may exchange ideas on how to make things work and seek advice

from fellow students, sharing of code is not allowed.

3

Aspect of Programming Assignment Marked Percentage of Grade (0–100)

Quality in Code Organization & Modularity 35%

Correct Execution for Queries 20%

Addressing All Requirements 30%

Use of Makefile & Separate Compilation 7%

Well Commented Code 8%

4. If you use code that is not your own, you will have to provide appropriate citation (i.e., explicitly state

where you found the code). Otherwise, plagiarism questions may ensue. Regardless, you have to fully

understand what such pieces of code do and how they work.

5. The project is to be done individually as the syllabus indicates and should run on the Linux server:

bled.abudhabi.nyu.edu

6. You can access the above server through ssh using port 4410; for example at prompt, you can issue:

“ssh yourNetID@bled.abudhabi.nyu.edu -p 4410”

7. There is ongoing Unix support through the Unix Lab Saadiyat [Uni21].

Timing in Linux: an example

#include <stdio.h> /* printf() */

#include <sys/times.h> /* times() */

#include <unistd.h> /* sysconf() */

int main( void ) {

double t1, t2, cpu_time;

struct tms tb1, tb2;

double ticspersec;

int i, sum = 0;

ticspersec = (double) sysconf(_SC_CLK_TCK);

t1 = (double) times(&tb1);

for (i = 0; i < 100000000; i++)

sum += i;

t2 = (double) times(&tb2);

cpu_time = (double) ((tb2.tms_utime + tb2.tms_stime) -

(tb1.tms_utime + tb1.tms_stime));

printf("Run time was %lf sec (REAL time) although

we used the CPU for %lf sec (CPU time).\n",

(t2 - t1) / ticspersec, cpu_time / ticspersec);

}

References

[Del21] A. Delis. www.alexdelis.eu/abudhabi. Known Credentials, 2021.

[Ker10] M. Kerrisk. The Linux Programming Interface: A Linux and UNIX System Programming. No Starch

Press, San Farnsisco, CA, 2010.

[Lov07] R. Love. Linux System Programming. O’Reilly, Sebastopol, CA, 2007.

[Uni21] NYUAD UnixLab. https://unixlabnyuad.github.io/. With Active Virtual Zoom Session, 2021.

4


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp