联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Java编程Java编程

日期:2024-03-18 09:20

– mkdir build

– cd build

– cmake ..

– make -j

– make install

You must follow these instructions to build and install using CMake, otherwise CMake will

not be able to find FFTW3 when you try to build your project.

Reporting Errors

Contact the lecturer directly if you believe that there is a problem with the starting code, or the build process

in general. To make error reporting useful, you should include in your description of the error: error messages,

operating system version, compiler version, and an exact sequence of steps to reproduce the error. Questions

regarding the clarity of the instructions are allowed. Any corrections or clarifications made after the initial

release of this coursework will be written in the errata section and announced.

Assignment: A Universe in a Box

Important: Read all these instructions before starting programming. In particular, some marks are awarded

for a reasonable git history (as described in Part 3), that should show what you were thinking as you

developed.

0.2 Overview: N-body Simulations and Particle-Mesh Methods

This assignment will ask you to produce a simulation of particles subject to gravitational forces. This is

known as an “n-body simulation”. Please read through this background section carefully, even if you are

familiar with the subject area, as this contains the models that you will have to implement and introduces

notation that will be used throughout the assignment text.

N.B. In order to keep diagrams clear, illustrations will use a 2D grid. You should keep in mind however that

this will be a three dimensional simulation.

Don’t worry about the underlying physics, we’ll describe the mathematical process that needs

to be implemented in detail.

The basic idea of an gravitational n-body simulation is that:

? We have some space in which there are many particles

? We start with some intitial distribution of particles; in our case this they will be uniform randomly

distributed in a box (a unit cube).

? We have np particles.

? Every particle has the same mass, mp.

? Particles have individual positions and velocities.

? The time evolution of the system is broken up into equal time steps.

? We perform the following process in a loop, starting at t = 0 and ending at t = tmax:

– Calculate the acceleration of all particles due to gravity

– Update the position and velocity of all particles

– Increase the time t by the time-step t.

Furthermore, we’ll choose our coordinate system so that:

? Particles exist in a box, with sides of size W.

? Coordinates ?r in the box will be relative to the size of the box, so particle positions will always

been within the unit cube: x, y, z ? [0, 1).

– The actual distance between two particles is therefore W?r

4

– This allows us to model an expanding universe simply by updating the value of W, without

updating the positions of each particle.

– Particles with a fixed position as W increases are moving apart from each other in line with the

overall expansion of the universe; this is why these coordinates are sometimes known as co-moving

coordinates.

– We only keep track of what’s called the peculiar velocity of each particle: the velocity the particle

has when the expansion is ignored.

We will label particles from 0 to (np≠1); every particle has its own position (?ri), velocity (?vi), and acceleration

(?ai). Position, velocity, and acceleration for a given particle are all three dimensional vectors with an x, y,

and z component.

5

0.3 Notation

We will define notation as we introduce it throughout this assignment, but the notation used is summarised

here for reference.

6

0.3.1 Mathematical notation:

? ?r is used for a point/vector in real space coordinates

? x, y, z are the x, y, z components of the ?r vector

? ?

k is used for a point/vector in frequency space coordinates.

? kx, ky, kz are the x, y, z components of the ?

k vector

? k2 is used for ?

k · ?

k

? f(?r) is a function in real space coordinates, while ?f(?

k) is the frequency space representation.

? FT(f) is used for the Fourier transform from real space (?r) to frequency space (?

k).

– ?f(?

k) = FT(f(?r))

? FT≠1( ?f) is used for the inverse Fourier transform from frequency space (?

k) to real space (?r)

– f(?r) = FT≠1( ?f(?

k))

0.3.2 Physical functions:

? fl(?r): the density

? (?r): the gravitational potential

? ?g(?r): the gravitational field; this is the acceleration felt by a particle at ?r

0.3.3 Simulation properties:

? W: width of the box

? nc: the size of each dimension of the mesh in cells

– The mesh is an nc ? nc ? nc grid

? Nc: the total number of cells in the mesh, Nc = n3

c

? wc: the width of a cell

? (i, j, k) is used for indexing the mesh with separate x, y, and z indices (i, j, and k respectively)

? μ is used for indexing the mesh with a single index

– μ = k + nc(j + nci)

? np: the number of particles

? mp: the mass of a particle

? ?ri: the position of particle i

– xi: the x-position of particle i

– yi: the y-position of particle i

– zi: the z-position of particle i

– xi, yi, zi ? [0, 1)

? ?vi: the velocity of particle i

? ?ai: the acceleration of particle i

? t: the time coordinate (begins at 0)

? t: the time-step, the time dierence between one simulation step and the next

? tmax: the finish time of the simulation

? F: the expansion factor per timestep.

0.4 Updating the particles

We can update a particle’s velocity using its current acceleration, and we can update a particle’s position by

using its velocity. We will take a very simple approach to this update known as Euler integration.

?vi(t) = ?vi(t ≠ t) + ?ai(t)t

?xi(t) = ?xi(t ≠ t) + ?vi(t)t

Note that this formulation has implied an ordering: we update the velocity first, and then update the position

using the new velocity values. This order is specified to avoid ambiguity in the implementation rather than

for theoretical reasons.

7

This method is not the most accurate, but it is fast and simple to implement. In order to perform this update

we need to be able to calculate the acceleration at a given time step for every particle, which is dependent on

the relative positions of all of the particles.

We will treat the space as periodic, so that if a particle moves outside of the box it reappears on the other

side.

0.5 Calculating Accelerations: Particle-Particle and Particle-Mesh methods

0.5.1 Particle-Particle

The simplest approach is to calculate the acceleration on each particle due to every other particle directly. If

there are np particles, labeled 0...(np ≠ 1), then the acceleration experienced by particle i due to all other

particles is:

?ai = ≠q

j”=i

mp

|?rj≠?ri|2 r?ji,

where r?ji is the unit vector pointing to particle j from particle i, and mp is the mass of a particle. (We have

also ignored a factor of Newton’s constant, G, which we can set to 1 by a choice of units. This keeps the

mathematical structure as bare as possible.)

This calculation would need to be done for all n particles. This approach is known as the Particle-Particle

(PP) method, since it calculates forces between every pair of particles directly. There are unfortunately a

few problems with this approach:

1. The acceleration function is singular where rij = 0, and thus we can get extreme behaviour (or

even inf/nan issues) when particles are very close. This would need to mitigated by so called “force

softening”.

2. This algorithm rapidly becomes very slow as the number of particles increases (O(n2) due to pair-wise

approach).

3. We would like to simulate an infinite universe, so how do we calculate interactions more distant than

the size of the box?

0.5.2 Particle-Mesh

Another approach, which we shall take in this project, is to use a Particle-Mesh (PM) method. ParticleMesh is so called because it still has particles in a space, but now we also keep track of discretised functions

on the space, which are represented on a mesh. The mesh is just a gridding of our 3D space: our space is

split up into a finite number of cells, and a function can be represented by a value in each cell.

8

The mesh is a 3D uniform grid of cells; the grid is nc cells wide, high, and deep. This means that the width

of a cell, wc, is given by:

wc = W

nc ,

and the volume of a cell is:

Vc = w3

c .

Note that wc is the physical width of the cell, not the coordinate width. The width of a cell in comoving

coordinates ?r is always 1

nc , since the coordinates are scaled so that the box is a unit cube. We need the

physical width and volume of the cell in order to calculate physical quantities like gradients and densities.

9

To represent a function, we need to assign a value to every cell in the mesh. To avoid doing the pair-wise

acceleration function, we want to convert the distribution of particles into a density function fl. This density

function can then be transformed into a gravitational potential, which can then be used to calculate the

gravitational field (the acceleration that a particle experiences due to gravity at a given point in space).

0.5.2.1 The density function

The density function, fl, in a given cell is calculated by counting the particles within that cell and using the

following formula:

fl = M

Vc = pcmp

Vc ,

where pc is the number of particles in the cell, and M = pcmp is the total mass of inside the cell.

0.5.2.2 The potential function

The potential, , is related to the density by the following formula:

ò2 = 4fifl,

where again we have ignored G. Calculating the potential involves solving this dierential equation, which

can be converted to a simpler equation using a Fourier transform (FT). Don’t worry about this process: we

will be using a library to perform the FT itself. All you need to know is:

1. Fourier transforms convert our function in real space f(?r) to a function in a reciprocal frequency space

?f(?

k) where k is related to the reciprocal of r. For example they might convert between time and

frequency, or distance and wavenumber.

2. Fourier transforms can be used to solve some kinds of dierential equations.

3. The frequency representation makes functions in finite spaces inherently periodic. This is ideal for our

circumstances as we are treating our box as periodic as an approximation to an infinite space with

infinite particles. We will therefore be solving the above equation with periodic boundary conditions

i.e. f(x + W) = f(x).

4. Our functions have a discrete representation on an nc ? nc ? nc mesh; the discrete Fourier transform

that we will use will likewise give us a discrete representation on an nc ? nc ? nc mesh. In real space

each cell in the mesh corresponds to a dierent position (?r), whereas in frequency space each cell

corresponds to a dierent wavenumber (?

k).

After applying Fourier transforms to both sides of our equation, we have the following frequency space

expression relating the potential and the density function:

≠k2?(?

k)=4fifl?(?

k),

where k2 = ?

k · ?

k. This can be straightforwardly solved since it is not a dierential equation:

10

?(?

k) = ≠4fi

k2 fl?(?

k),

which just involves scaling our existing fl?(?

k) in each cell in our mesh by a factor of 4fik≠2. (See the section

on FFTW for how to find the value of ?

k for each cell of the mesh in frequency space.) Once we have obtained

?(?

k) we can get the potential in real space (?r) by performing an inverse Fourier transform.

(?r) = FT≠1(?(?

k)) = FT≠1( ≠4fi

k2 FTfl(?r))

The sequence of operations computationally is then:

1. Find the density function fl(?r)

2. Apply a fourier transform to the density function to get fl?(?

k)

3. Recale fl?(?

k) to get ?(?

k)

4. Apply an inverse fourier transform to ?(?

k) to get the real space potential (?r)

0.5.2.3 The gravitational field

In order to calculate the acceleration of a particle we need the gravitational field, which is the negative of the

gradient of the potential.

?g(?r) = ≠ò? ?(r)

The gradient is calculated using a simple finite dierence method, defined by the following formula:

df

dx ¥ f(x+x)≠f(x≠x)

2 .

As with the density and potential, the gravitational field function is defined on the mesh with indices (i, j, k);

unlike the density and potential however the gravitational field at any given point is a 3D vector.

To calculate the gradient in a given cell in the mesh, we take the value of the potential in the neighbouring

cells, and approximate the gradient using the formula:

gx(i, j, k) = ≠ (i+1,j,k)≠(i≠1,j,k)

2wc

gy(i, j, k) = ≠ (i,j+1,k)≠(i,j≠1,k)

2wc

gz(i, j, k) = ≠ (i,j,k+1)≠(i,j,k≠1)

2wc

Remember that the functions in the box are periodic, so indices should wrap around if they go below 0 or

above nc ≠ 1.

11

Figure 1: Examples of finding neighbouring cells in the x direction on a periodic mesh; right shows how

indices wrap around.

To get the acceleration of particle i at position ?ri we simply need to get the value of the gravitational field at

this point:

?ai = ?g(?ri).

Since ?g is defined on our mesh like all our other functions, we simply need to find which cell in the mesh the

position ?ri falls in and then get the value of ?g there.

0.6 Making the Universe Expand

In reality the universe doesn’t just stay one size, but it expands over time. As the universe expands, our

particles move with it: remember that we are using co-moving coordinates so that positions are always

relative to the size of the box. Co-moving coordinates means:

? Positions and velocities are relative to the size of the box.

? The coordinates of the particle positions don’t change when the box changes size W. This is because

particles move apart as the universe expands.

? The velocities need to scale down when the box expands. The velocities that we keep track of are

the motion of the particles on top of the background expansion, and these shouldn’t increase along with

the size of the box.

To model a universe which expands we need to do two things:

1. Scale W by a factor F, so W(t + t) = W(t) ? F

2. Divide the velocities of all particles by F

Remember that other variables can depend on W, like the physical width of a cell wc = W

nc , and calculations

like the density and gradient rely on the physical size of a cell. During each simulation time step

you will need to make sure that you are using an up to date value for W and other variables

derived from it.

12

0.7 Summary: the Simulation Algorithm

The full simulation involves setting an initial states and looping over just five functions. It can be summarised

as follows:

1. Set up np particles with coordinates in the unit box.

? Initial positions are usually uniform random and initial velocities are 0.

2. Define the width of the box W, the expansion rate F, and the width of the mesh in cells nc.

3. Starting at t0, apply the following loop until t>tmax:

? Calculate density from particle positions (counts in cells)

? Calculate the potential:

– Forward FT applied to density

– Solve for potential (frequency space)

– Inverse FT to get (real space) potential

? Calculate gradient to get gravitational field (acceleration)

? Update particles positions and velocities

? Scale the box (scale up W and scale down velocity)

0.8 The FFTW Library

This project will make use of the very common FFTW Library. The documentation for this library can be

confusing, so we will explain how to use the relevant parts of the library. The most important thing that we

need to understand is how data is formatted going into and out of these functions, and how data gets scaled

by these functions.

We will describe the use of the FFTW library only as it applies to the specific functions and data structures

that we use in this assignment. These statements should not be taken as fully general and may not hold for

other methods in the FFTW library which have dierent properties!

13

All of the following types and methods are available in the fftw3.h header file.

0.8.1 Data order and ?

k values

FFTW can perform multi-dimensional Fourier transforms, but the data is provided as a one-dimensional

array, which means we must adopt some ordering for multi-dimensional data.

We can convert the 3D indices (i, j, k) to a flat index μ using the following expression:

μ = k + nc(j + nci),

where i is indexing along the x axis, j along the y axis, and k along z.

? We will usually write the three dimensional indexing (i, j, k) when describing what needs to be done in

this assignment, as it is easier to understand the geometry of the problem in this way.

? You will need to use this conversion to fill out your FFTW buers in practice and to understand the

arrangement of data in memory.

? This data ordering is used for both input and output buers.

In order to calculate the potential, we need to solve an equation with ?

k in it,

?(?

k) = ≠4fi

k2 fl?(?

k).

Or, treating our functions as indexed on the mesh, we could write:

?(i, j, k) = ≠ 4fi

k2(i,j,k)fl?(i, j, k)

This means that we need to associate a value of ?

k with each cell for our frequency space function.

The ?

k components (kx, ky, kz) all range over the values [0, 1

W , 2

W , ..., nc≠1

W ]. The value of ?

k at indices (i, j, k)

is:

?

k(i, j, k)=( i

W , j

W , k

W ).

0.8.2 Scaling

The Fourier transforms computed by FFTW are not normalised. For a normalised FT we would have:

FT≠1(FT(f(x))) = f(x).

However, using the fast Fourier transform as defined in FFTW, we have the following relation instead

FFTW≠1(FFTW(f(x)) = (2nc)3f(x),

for a 3D Fourier transform. (More generally, it is scaled by 2ni where ni is the number of cells in each

dimension.) Since our method involves a forwards and inverse transform, we pick up this extra factor of 8n3

c

along the way, which requires the elements in the buer to be rescaled. We’ll take care of this scaling when

we calculate the potential .

0.8.3 FFTW input / output buers

The FFT method that we will use takes arrays with elements of the type fftw_complex, which is included in

the <fftw3.h> header. This is just an alias for the type double[2], so we can access the real and imaginary

parts using code such as the following:

#include <fftw3.h>

#include <iostream>

int main()

{

fftw_complex z;

14

//set real part

z[0] = 1.5;

//set imaginary part:

z[1] = 0.2;

//print complex number

std::cout << z[0] << "+" << z[1] << "i" << std::endl;

return 0;

}

FFTW has its own methods for allocating a buer in memory, which should be used whenever using the

library.

The following line allocates an array called buffer to contain N_elements values of type fftw_complex. The

type of buffer is fftw_complex *, which is a pointer i.e. it is implemented as a C-style array.

fftw_complex * buffer = (fftw_complex *) fftw_malloc(sizeof(fftw_complex) * N_elements);

Note that this does not initialise the elements in the buer.

? The FT has separate input and output buers, and both must be allocated before the FT is called.

? Whatever is in the output buer will be rewritten with the result of the FT.

? The input and output buers are the same size.

Buers which are allocated with fftw_malloc must also be freed using fftw_free, for example:

fftw_complex * buffer = (fftw_complex *) fftw_malloc(sizeof(fftw_complex) * N_elements);

fftw_free(buffer);

0.8.4 FFTW plans and executing the Fourier transform

FFTW uses “plans” to describe how the Fourier transform that needs to be performed. Plans determine how

many dimensions the data has (ours is three dimensional), what the input and output buers will be, and

whether the transform is a forward or inverse Fourier transform. The plans do not need to be re-made every

time you want to perform the transform: you can re-use a plan as many times as you like as long as

these properties don’t change.

The plans have type fftw_plan, and are declared, allocated, and deallocated as follows:

Declaration and Allocation:

The forward plan is initialised using the fftw_plan_dft_3d function with the following arguments:

fftw_plan forward_plan = fftw_plan_dft_3d(n_x,

n_y,

n_z,

input_buffer,

output_buffer,

FFTW_FORWARD,

FFTW_MEASURE);

? n_x, n_y, and n_z are the size of the three dimensions. These should all be nc since we are using a box

with an equal number of cells in each direction.

? input_buffer is the function to which you want to apply the forward transform. This is a fftw_complex

* type array, and is a real space function f(?r).

? output_buffer is where you stored the result of the transform. This is also a fftw_complex * type

array, and is a frequency space function ?f(?

k).

15

? FFTW_FORWARD and FFTW_MEASURE are constants defined in fftw3.h. You should keep these as they are!

The inverse plan is initialised similarly:

fftw_plan inverse_plan = fftw_plan_dft_3d(n_x,

n_y,

n_z,

input_buffer,

output_buffer,

FFTW_BACKWARD,

FFTW_MEASURE);

? Note that FFTW_FORWARD has been changed to FFTW_BACKWARD

? The input buer should be a frequency space representation g?(?

k)

? The output buer is a real space representation g(?r)

Deallocation:

Both plans need to be deallocated using the fftw_destroy_plan function:

fftw_destroy_plan(forward_plan);

fftw_destroy_plan(inverse_plan);

1 Simulating Particles in a Box (60 marks)

In the first part of this assignment you’ll create a serial (not parallelised) particle-mesh simulation. We will

specify the representation of some functions due to the need to interface with the FFTW library, but for the

most part selecting data structures will down to you, along with appropriate naming and code organisation.

You should think carefully about your use of functions and classes to structure your code.

You should aim to make your code as performant as you can, but still clear, safe, and well structured. You

may assume that optimisations will be turned on to a high level (-O3) for a release build. (If debugging you

should usually keep the optimisations o, unless you are explicitly using the debugger to see how the compiler

has optimised the code.)

We will explicitly ask you to write a Simulation class and a number of functions, but you can, and should,

add further classes and function where you feel that they are appropriate and useful.

1.1 The Initial Repository

Start by familiarising yourself with the starter repository.

There are multiple folders which contain source files:

1. lib for library code (this is where the bulk of your simulation code will be developed).

? include contains library headers.

2. apps for application code.

3. test for test code.

4. benchmarks for benchmarking (timing) code.

You will find in the source files:

1. A skeleton Simulation class that you will need to complete.

2. Some partially coded tests that you will need to adapt to use your classes / data structures.

3. A method for converting density functions to images (.pbm format)

4. A method for calculating a radial correlation function, which will be used to calculate summary statistics

later on and which you may need to adapt to work with your data structures.

You should pay attention to the CMakeLists.txt files as well.

16

1.2 Representing Particles (8 marks)

It is up to you to decide how to represent your particles. You will need to be able to represent a large number

of them (in the millions).

? Each particle must have its own position (?r) and velocity (?v), which are three dimensional.

? Positions are always measured relative to the size of the box. This means that each coordinate of

the position is 0 ? ri < 1, regardless of the width of the box.

? All particles have the same mass, which is constant throughout the simulation.

The initial collection of particles will need to be passed in to the Simulation class. You need to be able to

create custom initial states as well as random ones.

? You must be able to straightforwardly create a custom distribution of particles, for example one particle

in the middle of the box. This will be important for constructing test cases.

– You must be able to set the position of every particle individually, but you may assume that the

initial velocities are all zero.

? You must also write a function or constructor which creates a random distribution of particles. In this

case:

– The position of each particle must be uniform random (between 0 and 1) for each dimension (x, y,

z).

– The velocity of each particle must be zero in all dimensions.

– The seed for the random number generator must be settable. This is vital for reproducability.

TASKS:

1. Decide on your particle implementation and write any class or function definitions that you need.

2. In Responses.md explain the choices that you have made, and point out where the relevant code

can be found. You should consider both ease of use and performance expectations. (You may wish to

come back to this response later if you decide to try more than one approach.)

1.3 The Simulation Class (7 marks)

In your library code you will find a class called Simulation. This class will be used to represent a single

simulation which can be run, including holding relevant data like the state of the particles in the simulation,

the timestep, the length of the simulation (tmax), and the various functions defined on the mesh.

? You may want to build this class representation as you work through the assignment, adding member

variables as you encounter new things to be represented.

? You may make any functions in this class that you want to test public, in order to make

constructing tests and benchmarks practical. For data, or functions which do not require direct

tests themselves, you should make normal use of access specifiers.

TASKS

1. You should add a constructor to your Simulation class which takes the following information:

? The total time tmax.

? The time step t.

? The initial collection of particles.

? The width of the box W.

? The number of cells wide that the box is, nc.

– Remember that this is just the size of the box in one dimension; the total number of cells in

the 3D space will be Nc = n3

c .

? The expansion factor F.

– This is the factor that the universe will expand by each timestep.

You should use your constructor to set and initialise any member variables that you want to keep in the

class. Below is some advice on what data structures are needed for the Fourier transform part of the

code. You should consult also the section of the Background about the FFTW library and its usage.

17

FFTW Buers and Plans

Your simulation class will internally use FFTW as a library for calculating the fast Fourier transforms

necessary for this approach. These methods require storing both the input and output buers, which are the

discrete representations of our density and potential functions on the mesh, and FFTW “plans”.

1.3.0.1 Buers

You will need three data buers for the FT parts:

? Two for real space functions (density fl(x) and potential (x)).

– In principle we could use just one buer, since we don’t need to use fl and  at the same time, but

using two will help clarity as it prevents ambiguity over what is contained in the buer at any

given time. This also allows us to keep both the density and potential functions until the end of

an iteration, which can be useful for outputs and debugging. If you are worried about your total

memory usage however, you can use just one buer.

? One for frequency space values. This will only be used during the calculation of the potential function,

and will be used to represent both fl?(k) and ?(k). After the forward transform, this buer will contain fl?.

This buer then gets modified (in place) to represent ?, before being input into the inverse transform.

– Using one buer here is less confusing because it only needs to be used within one single function.

If we were interested in using the frequency space representations elsewhere, and memory usage

was not a major concern, then we might consider splitting this into two buers as well.

Each of the three buers will be of the size n3

c .

1.3.0.2 Plans

You will need to use two plans: one for the forward transform and one for the inverse transform.

? The forward transform must take the real space density buer as input and the frequency space buer

as output.

? The inverse transform must take the frequency space buer as input and the real space potential buer

as output.

1.4 Calculating the density function (7 marks)

The density function, as with all functions in our 3D space, is defined on our nc ? nc ? nc mesh. The density

function is defined by a single value in each cell of this 3D gridding, and is stored in one of our fftw_complex

* buers. This buer is a contiguous piece of memory, i.e. all n3

c values (of type fftw_complex) are stored in

sequential memory addresses.

Our mesh is divided into cells with indices (i, j, k) where:

? i ranges over 0, 1, . . . , nc ≠ 1 and represents cell indices in the x coordinate,

? j ranges over 0, 1, . . . , nc ≠ 1 and represents cell indices in the y coordinate,

? k ranges over 0, 1, . . . , nc ≠ 1 and represents cell indices in the z coordinate.

Since our coordinate system is scaled so that x, y, z are always between 0 and 1, the cell at indices (i, j, k)

covers coordinates in 3D space where:

? 1

nc i ? x < 1

nc (i + 1),

? 1

nc j ? y < 1

nc (j + 1),

? 1

nc k ? z < 1

nc (k + 1).

18

The corresponding index μ of the buer is:

μ = k + nc(j + nci).

The value of in the density function for a given cell with indices (i, j, k) is equal to the number of particles

whose position falls within that cell, multiplied by mp

Vc where Vc = w3

c is the volume of the cell. (N.B. The

volume of the cell Vc is calculated using wc = W

nc i.e. does not use scaled coordinates. As the

size of the box changes, the volume of a cell also changes.)

TASKS

1. Write a function in your Simulation class to calculate the density function and fill in the buer.

? The imaginary part of every value in the density buer should be set to 0 since the density

is a real function.

? Bear in mind that this function will be called every time step.

2. Test your density function. You should check that:

? All imaginary components are 0.

? Density is everywhere 0 if there are no particles.

? Density is correct when a single particle is present.

– Only a single cell should therefore have non-zero density. You should check that the value and

index of this is correct.

? Density is correct when multiple particles are present.

1.5 Calculating the Potential (7 marks)

In order to calculate the potential we need to make use of our Fourier transform.

Calculating the potential involves 3 steps:

1. Apply the forward Fourier transform (input: fl(x), output: fl?(k))

2. Apply scaling to the frequency space representation (see below).

3. Apply the inverse Fourier transform (input: ?(k), output: (k))

Applying the forward/inverse transform simply requires calling fftw_execute (defined in fftw3.h) with the

correct plan:

fftw_execute(forward_plan);

// ...

// manipulate the frequency space buffer

// ...

fftw_execute(inverse_plan);

After the FFT is complete the output buer will contain the non-normalised transformed density function

(see the FFTW section in the Background). We can use this to calculate the potential by scaling this buer

by k≠2.

The ordering of elements in the frequency space representation is the same as the real space array. In the

frequency representation, a function is represented as a sum of periodic functions, which oscillate with dierent

frequencies in the x, y, and z directions. How many quickly the function oscillates in each direction is given

by the vector ?

k. We need to associate a vector ?

k with every cell in our mesh.

The value of ?

k for indices (i, j, k) is:

?

k = (kx, ky, kz) = 1

W (i, j, k)

k2 = ?

k · ?

k = 1

W2 (i

2 + j2 + k2)

19

(?

k) = ≠4fi

k2 fl(?

k)

? The first value of k2 is 0, which would produce a singular result due to the division. Instead of dividing

the density by 0 here, we should just set this element of the buer to 0.

– This does not aect the result of the simulation since this is the “0 frequency” component of the

potential i.e. it represents a constant function in space with no oscillations at all. The impact that

it has on the real-space representation of the function is therefore just an additive constant; since

we only care about the potential in order to take a gradient to find the acceleration, this constant

does not contribute at all to our dynamics and we can ignore it.

? You must apply this scaling to both the real part and imaginary part of frequency space buer,

since we have used a complex FT.

There is also a multiplicative normalisation factor to apply:

fnorm = (8n3

c )≠1.

This needs to be applied to the real and imaginary part of the entire buer as well.

Finally, calculate the real space potential by transforming back using the inverse Fourier transform.

? Once we have transformed back, we can treat the potential function as being real, and we will ignore

any values in the imaginary part of the buer.

TASKS

1. Add a function to your Simulation class to calculate the potential function.

2. Test your potential function using the partially coded example in test_simulation.cpp.

? You will need to fill out this test to make use of the implementations that you have defined so far.

? This test for the potential is very approximate but should be able to determine whether your calculation

is happening correctly.

? It looks at the potential due to a single point particle in the middle of your box.

? Because the conditions are periodic, this is actually the potential due to an infinite series of particles

spaced W apart from one another!

? The test builds a very approximate expected potential function along the x-axis of the box and compares

it to the potential function that you calculate.

? We only care about the real component of the potential buer; the imaginary component can be ignored.

IMPORTANT: The expected potential function should be negative. If your partially coded example does

not have a - sign in line 47 (double expected_pot = -mass *(1/ dx1 + 1/dx2 + 1/dx3);) then please

add one in.

You may also want to inspect what your potential function looks like by saving the potential function along

this axis and making a graph to check that it has the right shape.

1.6 Calculating acceleration (7 marks)

The gravitational field in space is just the negative of the gradient of the potential function. Since this is a

3D space, the field is a vector ?g = (gx, gy, gz) = ≠( d

dx , d

dy , d

dz ), so there are three values defined for every cell

in the mesh.

Note that the gravitational field is real valued, and when calculating the gradient you should ignore the

imaginary part of the potential buer.

TASKS

1. Write a function to calculate the gradient of a function defined on your mesh using the finite dierence

method in the background.

? This should take the function to be dierentiated as an argument to make it easier to test.

? Remember that the functions in the box are periodic, so indices should wrap around if they go below

0 or above nc ≠ 1.

20

2. Test your gradient function.

? You do not need to test that the gravitational field for a particle is physically accurate. The function

merely calculates a gradient of an arbitrary input.

? You should test that your function correctly calculates the (approximate) gradient for some test case(s).

1.7 Updating particles (5 marks)

You now need to be able to update the state of your particles according to their acceleration and velocity.

This is called after the gravitational field functions have been calculated for a given timestep.

TASKS

1. Write code to update the acceleration, velocity, and position of each of your particles.

? The acceleration of a particle should be value of the gravitational field in the cell that the particle falls

within.

? Calculate the new velocity as ?v? = ?v + ?a ? t.

? Calculate the new position as ?r? = ?r + ?v? ? t. (This should use the updated velocity value.)

2. Test your particle update function.

1.8 Expanding the box (3 marks)

We model the expansion of the universe by scaling up the width of the box. This does not change the

position of the particles, as they move with the expansion of the box.

TASKS

1. Write a function which expands the box:

? Multiply the width of the box W by the expansion factor F.

? Divide the velocities of all particles by the expansion factor F.

? If you store any other variables which depend on scale, for example the width of a cell, then these

also need to be updated.

1.9 The simulation loop (2 marks)

It’s now time to complete the simulation class by writing a loop which links all our functions together and

simulates the evolution of our system over time.

*TASKS

1. Complete the Simulation::run function to implement the simulation loop, starting at t = 0 and

stepping by t until t ? tmax.

In each step you will need to (in this order):

? Calculate the density function

? Calculate the potential

? Calculate the acceleration

? Update the particles

? Expand the box

1.10 Complexity (6 marks)

We mentioned earlier that the particle-particle (PP) method has O(n2

p) complexity due to its pair-wise

calculations. Having implemented all steps of the particle-mesh (PM) method you should now be able to

comment on the complexity of the PM approach. Unlike PP, the PM approach has two variables which

determine the size of the problem: the number of particles and the number of cells in the mesh. You can

consider how the problem scales with each of these separately.

21

TASKS

In Responses.md, use Big-O notation to give the time complexity with respect to Nc, the total number of

cells (Nc = n3

c ), and np (the total number of particles) for the following routines in your program:

1. the density calculation,

2. the potential calculation,

? You may assume that the execution of the FFT is O(n log(n)) for an input/output buer of total size n.

3. the gravitational field calculation (calculating the gradient of the potential),

4. and the particle update function.

Explain your reasoning in Responses.md.

Given that the number of particles must always be larger than the total number of cells in the mesh (in order

to achieve a smooth density function), do you expect the Particle-Mesh method to scale better or worse than

the Particle-Particle method for large simulations?

1.11 Application 1: Visualising A Developing Universe (8 marks)

The first application that you will write will produce a series of images like those below:

Each of these images is a visualisation of the density function on the mesh at a point in time during the

simulation. The simulation starts from a uniform random distribution and particles with 0 velocity, and

evolves for a period of time.

The functionality to produce the images from the denisity buer is provided for you.

TASKS

1. In app, complete the command line application NBody_Visualiser.cpp. If the -h flag is used it should

display a help message; in this case the simulation should not run. Otherwise, it should take the

following command line arguments:

? -nc: the number of cells wide the box is (nc),

? -np: the average number of particles per cell,

? -t: the total time,

? -dt: the time step,

? -F: the expansion factor,

? -o: an output folder,

? -s: and the random seed.

2. It should set up and run a simulation with the following properties:

? The total number of particles np is n3

c multiplied by the average particles per cell (given in the

command line arguments).

? The initial distribution of particles must be uniform random using the given random seed.

? The width of the box W should be 100.

? The mass of the particles should be 105

np . For example, if there are 10 million particles, the mass

should be 0.01. This keeps the total amount of mass the same regardless of the number of particles

that you run with.

22

3. Modify the run method to take an std::optional<string> which defaults to nullopt. If not null,

the run method should save images of your density distribution to the given output folder every 10

timesteps. Your application should call this run function with the output folder string provided by the

command line arguments.

? A function to save your density distribution as an image is provided in Utils.hpp/Utils.cpp.

4. Your images should be clearly named so that their order is obvious.

Run your program (./build/bin/NBody_Visualiser <args>) and record the results with the following

recommended settings:

? Expansion factor = 1.0, and expansion factor = 1.02. This is the only parameter which should change

between recorded runs.

? Number of cells wide = 101. (If you program is too slow then you can scale this down.)

? Particles per cell = 10.

? Total time = 1.5.

? Timestep = 0.01.

? Fixed random seed of your choice. (This means that both runs should have identical initial states. The

final state should dier due to the dierence in expansion.)

The images from each run should be saved into appropriately named subfolders of the Images folder.

Make sure you have optimisations turned on, but if your program runs too slowly then you can adjust

the number of cells down. You should not have the particles per cell any lower than 10 in order to get a

smooth density function; if your code runs quickly, feel free to increase the particles per cell to get a more

detailed simulation!

Document in your Responses.txt exactly what parameters you used to run, and where to find the images.

2 Parallelising Your Simulation (30 marks)

In this part of the assignment you will use OpenMP to parallelise your simulation code, and MPI to run

ensembles of simulations. You will also use timing code to benchmark your simulations and explore how the

performance scales. You will only need to be concerned with parallelising you own code: you can ignore

the FFTW library calls and the helper functions which we have provided for your use (SaveToFile and

CorrelationFunction).

2.1 Shared Memory Parallelism (15 marks)

You should go through the functions you have written for your simulation and decide which you can parallelise

with OpenMP.

The loop in run is inherently serial, since each iteration of the loop depends on the final state of the iteration

before it. The functions inside the loop also cannot be executed out of order. However, you can look into

each of the functions that you call from inside the loop to see if they themselves can be parallelised.

TASKS

1. Write in Responses.md each of the functions which you identify as being possible to parallelise.

2. Use OpenMP to parallelise these loops if you can.

? If you think something can be parallelised in principle but you are not sure how to, make a note of it in

your Responses.md. Describe what you think you would need to be able to in order to implement such

a parallelisation.

3. Benchmark (see below) these pieces of code using 1-4 threads (or more, if available on your CPU).

4. Don’t forget to run your tests after parallelising to make sure you haven’t broken anything! You should

also re-run your simulation and check that you get the same results with multi-threaded and single

threaded runs.

23

Do NOT parallelise the random initialisation of the particle positions. This would introduce

non-determinism in the assignment of the positions and prevent your simulation from being

reproducable. This MUST remain serial.

Benchmarking

Benchmarks should be implemented in benchmarks.cpp. In this file we have provided a class called

BenchmarkData which has some built in methods to time your functions as well as print out information.

You should look carefully at this class and use it to print clear and useful information about your timing

experiments. Alternatively, you can write your own code to do this.

For each function which you parallelise:

? Set the number of threads using omp_set_num_threads defined in omp.h. This is equivalent to setting

the OMP_NUM_THREADS environment variable, but allows us to change this from inside the code.

? When setting up simulations to benchmark, use properties (number of cells and particles) which are the

same as when you ran your app, or larger if you need more time to get an accurate result. Make sure

that any properties that you use which might aect the time of the benchmark are properly reported

(e.g. by setting the info property in the BenchmarkData class.)

? Report the timing of each of the functions for dierent numbers of threads in Responses.md.

? If you experiment with using collapse or schedule, then include the results of these experiments in

your Responses.md as well. Any collapse or schedule statements which are not useful may be removed

from your program, but should be properly reported in the responses.

If an attempt to parallelise a function lead to an increase in time, or result in failed tests, then report this in

Responses.md and then comment out the OMP pragma. Do not delete it entirely, so that we can see

you attempts at parallelisation.

2.2 Application 2: Comparing Universes with Distributed Memory (15 marks)

In this section you will create a new application which uses MPI to run multiple simulations in separate

processes. Each process will run its own simulation with the same initial particle distribution, but a dierent

expansion factor. Once the simulation is complete, the process will calculate a summary statistic (the radial

correlation function), which is represented as a list of values. One process, the master process, will gather the

data from the other processes and output it to a single file so that it can be plotted.

The number of processes is determined by the MPI run command:

mpirun -np <num_processes> <application> <args>

e.g.

mpirun -np 4 ./build/bin/NBody_Comparison <args>

would run the application with 4 processes.

TASKS

1. Your program should take command line arguments for:

? An output folder,

? a minimum expansion factor,

? a maximum expansion factor.

1. You should create one simulation per process. These should dier by their expansion factor F; the

dierent expansion factors should be evenly spaced between the minimum and maximum given in the

command line arguments.

24

? You will need to get the number of processes to work out this spacing.

2. Each simulation should otherwise have the same parameters as your earlier imaging runs.

? These should be hard coded to simplify the command line interface.

? Every simulation must start with the same initial state (initial particle distribution).

3. Each process should run a simulation; you should pass nullopt to the run method now as we don’t

want our MPI programs to output images.

4. Once a process has finished running its simulation, it should use the correlationFunction method

provided in Utils.hpp / Utils.cpp in order to calculate a summary statistic (the log of the radial

correlation function) for the end point of this simulation.

? You can modify this function and/or move it in order to make it work with your choice of data structures

and access specifiers. Be sure to note what you’ve done in your Responses though so it is easy to find.

? The function as provided takes a list of positions as a vector<array<double, 3>>, and a number of

bins as an int. As with the previous coursework, the number of bins is essentially the resolution of the

histogram that this function calculates. It returns the log of the calculated correlation function as a

vector<double> of length n_bins.

? You should call this with n_bins of around 100.

5. Each non-master process should send its correlation function to the master process.

? You can get the pointer to the underlying data buer for a vector<double> using v.data() for a

vector v.

6. The master process should then output all the correlation functions to a csv file.

? The first row (header) should label each column with the expansion factor

? Each correlation function should then be a column

Run your mpi program with at least 2 processes and:

? Report your command line arguments in the Responses.md.

? Save the csv file in the Correlations folder.

? You should then make a plot of this data using python or gnuplot, which should also be saved to the

Correlations folder.

? The plot should feature all of the correlations as lines on a single plot, labelled with their correct

expansion factor.

3 Additional Considerations (10 marks)

As in the first assignment, you are expected to engage in good software engineering practices. Specifically,

you will be marked on:

? Appropriate naming for functions and classes.

? Good code organisation (i.e. appropriate choice of files and folders)

? Informative and useful git commit log.

? Eective in-code commenting.

? Code formatting.

You should make sure that you README.md contains clear and correct instructions for building and using your

code, including examples.

Your Responses.md must clearly label which section of the assignment you are responding to for

every response you give.

Hints:

? Familiarise yourself with writing good git commit messages.

? Familiarise yourself with Best practices for writing code comments.

25


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp