Coursework: High-Performance Computing
Module: Introduction to HPC (PHYS 52015)
Term: Michaelmas Term, 2023
Submission: Please submit a zip archive containing two PDF files (part1.pdf and part2.pdf)
and two code files (part1.c and part2.c).
Deadlines: Consult the MISCADA learning and teaching handbook for submission deadlines.
Plagiarism and collusion: Students suspected of plagiarism, either of published work or
work from unpublished sources, including the work of other students, or of collusion will be
dealt with according to Computer Science and University guidelines.
Coursework Description
Throughout the course we have considered simple programming problems largely distinct from
the typical day-to-day practice of scientific computing. In this assignment you will experience
a sliver of that practice by inheriting a codebase that it is your responsibility to parallelize
while maintaining correctness.
Consider the following partial differential equation (PDE), which is a variant of the
FitzHugh-Nagumo model,
\dot{u} = \nabla^2 u + u(1 - u)(u - b) - v,    \hat{n} \cdot \nabla u = 0,
\dot{v} = d \nabla^2 v + c(a u - v),    \hat{n} \cdot \nabla v = 0.
This is an example of a reaction-diffusion system — the reaction is the non-differential term
on the right-hand-side, and the diffusion is the first term on the right-hand-side. Both fields
are subject to ‘no-flux’ boundary conditions, where their normal derivative at the boundary
is identically zero. These models produce a wide array of patterns, from cardiac arrhythmias
to spots and stripes like those seen on animal coats. See here for an interactive example of
this particular model.
You will be supplied with a two-dimensional reaction-diffusion code serial.c which reads
parameters from a header file params.h, simulates the above equations, and tracks the solution
norms,
w_u(t) = (\delta x \cdot \delta y) \sum_{(i,j)} u(t, x_i, y_j)^2,    w_v(t) = (\delta x \cdot \delta y) \sum_{(i,j)} v(t, x_i, y_j)^2,    (1)
over time, where (i, j) \in [0, N)^2 range over the indices of the array.
Your assignment will be to parallelize the code using OpenMP and MPI, and to explain your
decisions with theoretically sound arguments and measurements of performance.
Implementation Notes
• You should preserve the model parameter values and not modify params.h – only modify
the provided serial.c.
• The boundary conditions are folded into the evaluation of the diffusion term: when
i±1 or j±1 falls outside the range of u or v, the code mirrors these ‘ghost
points’ across the boundary back into the domain, e.g., u[-1] = u[0] (see footnote 1).
Your implementation should retain this behavior.
• For scaling results you should measure the run time of the whole executable, rather than
the time for any subsection of the program, e.g. using the Unix command time or appropriate
timing constructs covered in the course.
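The mirrored ghost-point rule described above can be sketched as follows. This is a minimal illustration, not the contents of serial.c: the function names mirror and laplacian, the flat row-major array layout, and the single grid spacing dx are all assumptions made for the example, and the supplied code may organise the stencil differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mirror an out-of-range index back into [0, n),
   so that -1 maps to 0 and n maps to n - 1 (the no-flux ghost points). */
int mirror(int i, int n)
{
    assert(n > 0);
    if (i < 0)  return 0;
    if (i >= n) return n - 1;
    return i;
}

/* Hypothetical five-point Laplacian of an n-by-n row-major field f at (i, j),
   assuming equal grid spacing dx in both directions. */
double laplacian(const double *f, int n, int i, int j, double dx)
{
    int up   = mirror(i - 1, n), down  = mirror(i + 1, n);
    int left = mirror(j - 1, n), right = mirror(j + 1, n);
    double centre = f[(size_t)i * n + j];
    return (f[(size_t)up * n + j] + f[(size_t)down * n + j]
          + f[(size_t)i * n + left] + f[(size_t)i * n + right]
          - 4.0 * centre) / (dx * dx);
}
```

With this rule a spatially constant field has zero Laplacian everywhere, including at the corners, which is a quick sanity check that a parallel version still applies the boundary condition.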
Part 1: OpenMP
In this assessment, you will compile and run a serial two-dimensional reaction-diffusion code,
and compare its performance against a parallelized version that you will write. The serial
code is made of five functions, init, dxdt, step, norm, and main. The expectations for your
parallel implementation are to use OpenMP #pragmas to:
• Parallelise the function init.
• Parallelise the function dxdt.
• Parallelise the function step.
• Parallelise the function norm.
Your code should be in a single C file named part1.c. Your code must compile and run
with the provided submission script, and produce the same outputs as the serial code in a file
named part1.dat.
Report
Explain and justify your parallelization strategy, using arguments based in theory covered in
the course and your scaling results. Investigate the strong scaling of your implementation.
Report scaling results using transferable metrics in your report. Additional questions you
may wish to consider in your report are listed below. Your report should be no more than
one (1) page (plus images), in a file named part1.pdf.
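One common reading of "transferable metrics" is speedup and parallel efficiency, which can be compared across machines where raw run times cannot. The sketch below assumes that reading; the function names are hypothetical, and whether Amdahl's law is the model used in the course is also an assumption.

```c
#include <assert.h>

/* Strong-scaling speedup: time on one worker divided by time on p workers,
   both measured for the same problem size. */
double speedup(double t1, double tp)
{
    assert(t1 > 0.0 && tp > 0.0);
    return t1 / tp;
}

/* Parallel efficiency: speedup per worker; 1.0 is ideal. */
double efficiency(double t1, double tp, int p)
{
    assert(p > 0);
    return speedup(t1, tp) / p;
}

/* Amdahl's law: predicted speedup on p workers when a fraction s
   of the run time is inherently serial. */
double amdahl(double s, int p)
{
    assert(s >= 0.0 && s <= 1.0 && p > 0);
    return 1.0 / (s + (1.0 - s) / p);
}
```

For example, a run that takes 8 s on one thread and 2 s on eight threads has a speedup of 4 and an efficiency of 0.5.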
Questions to consider: What options for parallelisation are available? Why are some more
suitable than others? What difficulties arise in the parallelisation? Where are the necessary
synchronisation points? The solution norm requires the generation of a single output number
from an N-by-N array; what patterns are available for this function? How did you avoid data
races in your solution? Is your parallelisation approach the best option? What alternative
approaches could be used?
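As an illustration of one pattern for collapsing an N-by-N array to a single number, the sketch below uses an OpenMP reduction clause. It is one option among several (a critical section or atomic updates would be alternatives), and the function name norm_sq, its signature, and the flat row-major layout are assumptions rather than the interface of the provided norm function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical norm: (dx * dy) times the sum of f(i, j)^2 over an
   n-by-n row-major field, following equation (1). */
double norm_sq(const double *f, int n, double dx, double dy)
{
    assert(n > 0);
    double sum = 0.0;
    /* Each thread accumulates a private partial sum; OpenMP combines
       the partial sums when the loop ends, so there is no data race on sum. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double x = f[(size_t)i * n + j];
            sum += x * x;
        }
    }
    return dx * dy * sum;
}
```

Because floating-point addition is not associative, the order in which partial sums are combined can change the last few digits of the result relative to the serial code; this is worth keeping in mind when comparing output files.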
Part 2: MPI
In this assessment, return to the serial implementation of the two-dimensional reaction-diffusion
system, and parallelize the code using MPI calls, breaking down the original problem
domain into distinct regions on each process. Your implementation should:
• Reproduce the initialization of u and v across processes to match the serial code.
Footnote 1: See the exercise on the heat equation for a reference.
• Correctly exchange necessary information of u and v across processes.
• Correctly calculate the norms of u and v across all ranks.
• Correctly evaluate the diffusion term on all ranks.
Your code should be a single C file called part2.c. Your code should compile and run with
the provided submission script (using 4 MPI processes), and produce the same outputs as the
serial code in a file named part2.dat.
Report
Explain and justify your parallelization strategy, using arguments based in theory covered in
the course and your scaling results. Investigate the weak scaling of your implementation.
Report scaling results using transferable metrics in your report. Additional questions you
may wish to consider in your report are listed below. Your report should be no more than
one (1) page (plus images), in a file named part2.pdf.
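For reference, a standard definition of weak-scaling efficiency is given below. It is not taken from this handout: the symbols T, N_1 and N_p are introduced here for illustration, and the choice to keep the number of grid points per process fixed is an assumption about how the problem size is grown.

```latex
E_{\mathrm{weak}}(p) = \frac{T(N_1, 1)}{T(N_p, p)},
\qquad N_p^2 = p \, N_1^2
```

Here T(N, p) is the run time for an N-by-N grid on p processes. Under this assumption, going from 1 to 4 processes doubles N, and an ideal code keeps the efficiency at 1.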
Questions to consider: What topologies for distribution are available with 4 MPI processes?
Why might some be preferred over others? What difficulties arise in the parallelisation?
The solution norm requires the generation of a single output number from a large
distributed array; what patterns are available for this problem? What if we assume that u
and/or v change slowly compared to the time-step — do any further optimizations for data
exchanges become available? What are some constraints on the possible domain sizes and
number of MPI processes for your solution?
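To make the question about domain sizes and process counts concrete, the sketch below computes which rows each rank would own under a one-dimensional strip decomposition. This is only one of the possible topologies and is not prescribed by the assignment; the function owned_rows is hypothetical, and in a real MPI program rank and size would come from MPI_Comm_rank and MPI_Comm_size.

```c
#include <assert.h>

/* Hypothetical helper: half-open range [start, end) of rows owned by `rank`
   when n rows are split across `size` ranks. The first n % size ranks
   receive one extra row, so n need not be divisible by size. */
void owned_rows(int n, int size, int rank, int *start, int *end)
{
    assert(size > 0 && rank >= 0 && rank < size && n >= size);
    int base = n / size, extra = n % size;
    *start = rank * base + (rank < extra ? rank : extra);
    *end   = *start + base + (rank < extra ? 1 : 0);
}
```

With n = 10 and 4 ranks this gives strips of 3, 3, 2 and 2 rows. In a strip layout each rank would also need one ghost row from each neighbouring rank to evaluate the diffusion term, while the outermost ranks keep the mirrored boundary rule.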
Marking
Each part of your submission will be considered holistically, e.g. your code and report for Part
1 will be considered in tandem so that discrepancies between them will affect your marks.
Your code will be run for correctness on Hamilton. If you develop your programs on your
own machine, then you should test that they work on Hamilton with the provided submission
scripts.
Submission   Points   Description
All code     10       Compiles and runs to completion without errors using the
                      provided submission scripts.
part1.c      30       Correct parallelization of the serial reaction-diffusion code
                      using OpenMP, producing correct outputs.
part1.pdf    20       Description and justification of parallelisation scheme, and
                      inclusion of transferable strong scaling results.
part2.c      20       Correct parallelization of the serial reaction-diffusion model
                      using MPI, producing correct outputs.
part2.pdf    20       Description and justification of parallelisation scheme, and
                      inclusion of transferable weak scaling results.
Table 1: Marking rubric for the summative coursework. Please see the report marking criteria
in the Appendix.
Submission format
Your submission should be a single zip file, uploaded to Gradescope, containing part1.c,
part2.c, part1.pdf, and part2.pdf.
Appendix
Generic coursework remarks
Stick exactly to the submission format as specified. If you alter the format (submit an archive
instead of plain files, use Word documents rather than PDFs, ...), the marker may refuse
to mark the whole submission. Markers will not ask for missing files. If you have to submit
code, ensure that this code does compile and, unless specified otherwise, does not require any
manual interaction. Notably, markers will not debug your code, change parameters, or assess
lines that are commented out.
All of MISCADA’s deadlines are hard deadlines: In accordance with University procedures,
submissions that are up to 5 working days late will be subject to a cap of the module pass
mark. Later submissions will receive a mark of zero. If you require an extension, please submit
an official extension request including medical evidence and/or acknowledgement by college.
Do not contact the lecturers directly, as lecturers are not entitled to grant extensions. Details
on extensions and valid reasons to grant extended deadlines can be found in the Learning
and Teaching Handbook.
It is the responsibility of the student to ensure that there are sufficient backups of their
work and that coursework is submitted with sufficient slack. Submit your coursework ahead
of time. If in doubt, submit early versions. Technical difficulties (slow internet connection
around submission deadline, lost computer hardware, accidentally deleted files, ...) will
not be mitigated. Please see https://www.dur.ac.uk/learningandteaching.handbook/6/2/6/
for further information regarding illness and adverse circumstances affecting your
academic performance.
If collusion or plagiarism is detected, both students who copy and students who help to
copy can be penalised. Do not share any coursework with other students, do not assist
other students, and cite all sources used, including figures, code snippets, and equations.
Please see https://www.dur.ac.uk/learningandteaching.handbook/6/2/4 and
https://www.dur.ac.uk/learningandteaching.handbook/6/2/4/1 for further information.
Coursework is to be treated as private and confidential. Do not publish the whole or parts
of the coursework publicly. This includes both solutions and the plain coursework as handed
out.
Generic report quality criteria
Where summative coursework is assessed through written work in the form of a report, the
report will be assessed against some generic criteria.
The relevant grade bands (as percentages) are
0–49 Fail
50–59 Pass
60–69 Merit
70–79 Distinction
80–100 Outstanding
A fail-level report displays an unsatisfactory knowledge and understanding of the topic. The
setup and evaluation of any experimental studies is incomplete. It contains many omissions
or factual inaccuracies. Limited in scope and shows little or no evidence of critical thinking
and application of the course material to the problem. No recognition of limitations of the
approach or evaluation. Experimental data are generally presented incorrectly, or without
clarity.
A pass-level report displays some knowledge and understanding of the topic. The setup
and evaluation of any experimental studies is competent. May contain some omissions or
factual inaccuracies. Evidence of critical thinking and application of the course material to
the problem occurs in some places. Has some recognition of limitations of the approach or
evaluation. Most experimental data are presented correctly and clearly.
A merit-level report displays good knowledge and understanding of the topic as presented
in the course material. The setup and evaluation of any experimental studies is careful and
detailed. Broadly complete in scope, with few or no errors. Evidence of critical thinking
and application of the course material to the problem is mostly clear throughout. Recognises
limitations of the approach or evaluation, and has some discussion on how to overcome them.
Experimental data are presented correctly and clearly.
A distinction-level report displays effectively complete knowledge and understanding of the
topic. The setup and evaluation of any experimental studies is well-motivated and
near-flawless. Effectively no errors. Evidence of critical thinking and application of the course
material to the problem is clear throughout, and some of the discussion goes beyond the
taught material. Recognises limitations of the approach or evaluation, and presents plausible
approaches to overcome them. Experimental data are presented carefully and with attention
to detail throughout.
An outstanding-level report is similar to a distinction-level report but is effectively flawless
throughout, and shows a significant independent intellectual contribution going beyond the
taught material.