Hints for submission 2
Architecture
The existing architecture can be extended with new classes that handle the parallel case, while the old classes for the serial case remain functional.
For example, new parallel output writers, OutputWriterParaviewParallel and OutputWriterTextParallel together with a starting point for a Partitioning class are provided here.
Similarly, a ComputationParallel class can be defined that inherits from the old Computation class and overrides the public runSimulation method as well as some of the protected methods.
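The inheritance approach could look roughly like the following sketch. The Computation class shown here is only a stand-in with an assumed interface, and the method names computeTimeStepWidth and applyBoundaryValues as well as the recorded step names are hypothetical. The main point is that the methods to be replaced have to be declared virtual in the base class.

```cpp
#include <string>
#include <vector>

// Stand-in for the serial class of exercise 1; the interface is assumed.
class Computation
{
public:
  virtual ~Computation() = default;

  // virtual, so that the parallel class can replace the time loop
  virtual void runSimulation()
  {
    computeTimeStepWidth();
    applyBoundaryValues();
  }

  const std::vector<std::string> &steps() const { return steps_; }

protected:
  virtual void computeTimeStepWidth() { steps_.push_back("local dt"); }
  virtual void applyBoundaryValues() { steps_.push_back("boundary values"); }

  std::vector<std::string> steps_;  // only records what was called, for illustration
};

// Parallel variant that reuses the serial code where possible.
class ComputationParallel : public Computation
{
public:
  void runSimulation() override
  {
    computeTimeStepWidth();
    applyBoundaryValues();
    steps_.push_back("ghost value exchange");  // placeholder for the MPI communication
  }

protected:
  void computeTimeStepWidth() override
  {
    Computation::computeTimeStepWidth();  // reuse the serial computation on the subdomain
    // all ranks have to use the same dt, e.g. the minimum obtained with MPI_Allreduce
    steps_.push_back("global minimum of dt");
  }
};
```

Calling runSimulation through a reference to Computation then executes the parallel version, so code that only knows the base class does not have to change.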
For the submission of exercise 2, you are not required to produce different binaries for serial and parallel execution. If serial execution is desired, the new parallel binary can simply be run with only 1 process. This means you are also allowed to directly change the classes of the serial implementation.
It is beneficial to define a new class that encapsulates the functionality for subdomain handling. This new class, e.g. called Partitioning, should be able to report its own rank number and the rank numbers of the neighbouring processes. It should know whether its own subdomain touches the left, right, top or bottom boundary of the global domain, and it should know the number of cells in the local staggered grid of its own subdomain.
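A minimal sketch of such a class is given below. It is not the provided starting point. All method names are hypothetical, the ranks are assumed to be numbered row by row starting at the bottom left, and remainder cells are assumed to go to the first subdomains in each direction. To keep the example runnable without MPI, the own rank number and the process grid are passed to the constructor; in the real program they would come from MPI.

```cpp
#include <array>
#include <cassert>

// Hypothetical sketch of a Partitioning class (not the provided starting point).
class Partitioning
{
public:
  Partitioning(int ownRankNo, std::array<int,2> nRanks, std::array<int,2> nCellsGlobal)
  : ownRankNo_(ownRankNo), nRanks_(nRanks)
  {
    assert(nRanks[0] > 0 && nRanks[1] > 0);
    assert(ownRankNo >= 0 && ownRankNo < nRanks[0] * nRanks[1]);

    // assumption: ranks are numbered row by row, starting at the bottom left
    index_ = {ownRankNo % nRanks[0], ownRankNo / nRanks[0]};

    for (int d = 0; d < 2; d++)
    {
      // assumption: remainder cells go to the first subdomains in each direction
      int remainder = nCellsGlobal[d] % nRanks[d];
      nCellsLocal_[d] = nCellsGlobal[d] / nRanks[d] + (index_[d] < remainder ? 1 : 0);
    }
  }

  int ownRankNo() const { return ownRankNo_; }

  // number of cells of the own subdomain, without ghost layers
  std::array<int,2> nCellsLocal() const { return nCellsLocal_; }

  // whether the own subdomain touches a boundary of the global domain
  bool containsLeftBoundary() const { return index_[0] == 0; }
  bool containsRightBoundary() const { return index_[0] == nRanks_[0] - 1; }
  bool containsBottomBoundary() const { return index_[1] == 0; }
  bool containsTopBoundary() const { return index_[1] == nRanks_[1] - 1; }

  // rank numbers of the neighbours, -1 if there is none in that direction
  int leftNeighbourRankNo() const { return containsLeftBoundary() ? -1 : ownRankNo_ - 1; }
  int rightNeighbourRankNo() const { return containsRightBoundary() ? -1 : ownRankNo_ + 1; }
  int bottomNeighbourRankNo() const { return containsBottomBoundary() ? -1 : ownRankNo_ - nRanks_[0]; }
  int topNeighbourRankNo() const { return containsTopBoundary() ? -1 : ownRankNo_ + nRanks_[0]; }

private:
  int ownRankNo_;
  std::array<int,2> nRanks_;       // number of subdomains in x and y direction
  std::array<int,2> index_;        // position of the own subdomain in the process grid
  std::array<int,2> nCellsLocal_;
};
```

For example, `Partitioning(1, {1, 2}, {4, 7})` describes the upper of two subdomains stacked in y direction. With the assumptions above it has 4x3 cells, rank 0 as bottom neighbour and no top neighbour.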
The classes used for the discretization should now only store the local data of the subdomain that belongs to the own process, plus some ghost layers. Apart from that, these classes do not need to change much.
Before starting the implementation, think about the bounds for the \(u, v\) and \(p\) grids of a subdomain, for the case when the subdomain is in the interior of the whole domain as well as when it touches one of the outer boundaries, top, bottom, left or right. This is an important step for exercise 2, so invest some time in it to understand the requirements properly before starting to program.
You could also define classes that only handle data transfer (“ghost value exchange”) between neighbouring subdomains, for \(p\), \(u,v\) and \(F,G\).
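The core of such a data transfer class is packing one row or column into a contiguous buffer and writing a received buffer into the ghost layer. The following sketch shows this for columns. The Field struct and the functions packColumn and unpackColumn are hypothetical, and one ghost layer on every side is assumed. The MPI communication itself is left out, so the example runs in a single process.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal 2D field with one ghost layer on every side,
// stored row by row with size (nx+2) x (ny+2).
struct Field
{
  int nx, ny;
  std::vector<double> data;

  Field(int nx_, int ny_) : nx(nx_), ny(ny_), data((nx_ + 2) * (ny_ + 2), 0.0) {}

  double &operator()(int i, int j) { return data[j * (nx + 2) + i]; }
  double operator()(int i, int j) const { return data[j * (nx + 2) + i]; }
};

// Copy column i (rows 1..ny) into a contiguous send buffer.
std::vector<double> packColumn(const Field &f, int i)
{
  std::vector<double> buffer(f.ny);
  for (int j = 1; j <= f.ny; j++)
    buffer[j - 1] = f(i, j);
  return buffer;
}

// Write a received buffer into column i (rows 1..ny), e.g. a ghost column.
void unpackColumn(Field &f, int i, const std::vector<double> &buffer)
{
  assert(buffer.size() == static_cast<std::size_t>(f.ny));
  for (int j = 1; j <= f.ny; j++)
    f(i, j) = buffer[j - 1];
}
```

In this layout, the left subdomain would send its last interior column (i = nx) to the left ghost column (i = 0) of the right subdomain, and receive the first interior column (i = 1) of the right subdomain into its own right ghost column (i = nx+1). In the parallel program the buffers would be passed to MPI send and receive calls, for example the non-blocking MPI_Isend and MPI_Irecv followed by MPI_Waitall.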
Pressure solver
A new implementation of the SOR solver is also necessary. Use the two-color (red-black) scheme for parallelization. Note the difference between the serial implementation of the solver and the red-black scheme: the obtained solutions of the pressure equation should be the same within the solver tolerance, but the operations to compute them differ between exercise 1 and exercise 2. Keep this in mind when you compare your programs and output for the two exercises.
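One iteration of the red-black scheme on a single subdomain could be sketched as follows. The function name, the row-by-row storage with one ghost layer and the use of local indices for the color are assumptions of this sketch. In a partitioned domain, the color would typically be based on global indices so that it is consistent across subdomains. Boundary conditions and the residual computation are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One red-black SOR iteration for the discretized Poisson equation
//   (p[i-1,j] - 2 p[i,j] + p[i+1,j])/dx^2 + (p[i,j-1] - 2 p[i,j] + p[i,j+1])/dy^2 = rhs[i,j]
// on nx x ny interior cells. Assumption: p and rhs are stored row by row
// with one layer of boundary or ghost cells, i.e. with size (nx+2)*(ny+2).
void redBlackSorIteration(std::vector<double> &p, const std::vector<double> &rhs,
                          int nx, int ny, double dx, double dy, double omega)
{
  assert(p.size() == static_cast<std::size_t>((nx + 2) * (ny + 2)));
  assert(rhs.size() == p.size());

  const double dx2 = dx * dx;
  const double dy2 = dy * dy;
  const double factor = dx2 * dy2 / (2.0 * (dx2 + dy2));
  auto index = [nx](int i, int j) { return j * (nx + 2) + i; };

  // first update all cells with (i+j) even, then all cells with (i+j) odd
  for (int color = 0; color < 2; color++)
  {
    for (int j = 1; j <= ny; j++)
    {
      // first i >= 1 in this row with (i + j) % 2 == color
      int iStart = 1 + (1 + j + color) % 2;
      for (int i = iStart; i <= nx; i += 2)
      {
        double sum = (p[index(i - 1, j)] + p[index(i + 1, j)]) / dx2
                   + (p[index(i, j - 1)] + p[index(i, j + 1)]) / dy2;
        p[index(i, j)] = (1.0 - omega) * p[index(i, j)]
                       + omega * factor * (sum - rhs[index(i, j)]);
      }
    }
    // parallel case: exchange the ghost values of p with the neighbours here,
    // so that the next half step uses up-to-date values of the other color
  }
}
```

Within one half step, every updated cell only reads values of the other color. The order of the updates inside a half step therefore does not matter, which is what makes the scheme suitable for parallelization.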
To obtain even faster computations, it is possible to implement a parallel conjugate gradient method, but this is not required.
Sizes of fields
Care has to be taken that the field variables have the correct size. The example in Fig. 8 shows a 4x7 domain partitioned into two subdomains for MPI rank 0 and MPI rank 1.
Fig. 8 Size of fields and ghost layer for an example of 4x7 cells and two subdomains.
The light green points visualize the locations of the velocities \(v\). Each of them has to be computed on only one rank. In particular, on the cut boundary between the subdomains in the middle, it has to be defined which of the two ranks owns these values of \(v\). For data exchange between the subdomains, we need one line of ghost values, as depicted by the dark-green points. As can be seen in Fig. 8, this means that there is an additional layer of ghost cells at the bottom boundary of the subdomain for rank 1, but none at the top boundary of the subdomain for rank 0. The exact same considerations hold for the field \(G\) and, analogously, for \(u\) and \(F\).
In the computation, the loops over the indices i and j can remain the same as in the serial implementation, provided that the proper bounds have been used, e.g. via the methods uIBegin, uIEnd, uJBegin, uJEnd and the analogous ones for v. Note, however, that the methods uIBegin and vJBegin now depend on the presence of the mentioned additional ghost layer.
Scenarios
The scenarios to test the submission will have the parameters of the lid-driven cavity, i.e. \(Re=1000\), \(dx = dy\), etc. The number of cells and physicalSize, however, may be altered. The first test case will be the original lid-driven cavity with 2 processes.
You can test if your program is correct by using the program compare_output. It takes two output folders as arguments which contain vti files and computes the average velocity error.