Submission Exercise 2

Parallelize the flow solver using domain decomposition and MPI. It should be as fast as possible. The program is considered correct if it produces the same results as the serial version.
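As a starting point, the following is a minimal sketch of how a 2D domain decomposition could be computed. It is only one possible approach, not a prescribed solution. All names (`Partition`, `chooseProcessGrid`, `splitRange`, `makePartition`) are hypothetical and not part of the provided code. The sketch assumes a structured grid with `nCellsX` × `nCellsY` cells, picks the process grid whose subdomains have the smallest perimeter (a rough proxy for communication volume), and distributes leftover cells to the first ranks in each direction.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// All names in this sketch are hypothetical, not part of the provided code.
struct Partition {
    std::array<int, 2> nRanks;       // number of ranks in x and y direction
    std::array<int, 2> rankCoords;   // position of this rank in the process grid
    std::array<int, 2> offset;       // global index of the first locally owned cell
    std::array<int, 2> nCellsLocal;  // number of locally owned cells
};

// Pick the factorization px * py = nRanks that minimizes the subdomain
// perimeter, used here as a rough estimate of the communication volume.
inline std::array<int, 2> chooseProcessGrid(int nRanks, int nCellsX, int nCellsY) {
    assert(nRanks > 0);
    std::array<int, 2> best{nRanks, 1};
    double bestCost = -1.0;
    for (int px = 1; px <= nRanks; ++px) {
        if (nRanks % px != 0) continue;  // px must divide the number of ranks
        const int py = nRanks / px;
        const double cost = static_cast<double>(nCellsX) / px
                          + static_cast<double>(nCellsY) / py;
        if (bestCost < 0.0 || cost < bestCost) {
            bestCost = cost;
            best = {px, py};
        }
    }
    return best;
}

// Split nCells into nParts contiguous blocks; the first (nCells % nParts)
// blocks receive one extra cell so that block sizes differ by at most one.
inline void splitRange(int nCells, int nParts, int index, int &offset, int &size) {
    assert(nParts > 0 && index >= 0 && index < nParts);
    const int base = nCells / nParts;
    const int rem = nCells % nParts;
    size = base + (index < rem ? 1 : 0);
    offset = index * base + std::min(index, rem);
}

// Compute the subdomain owned by `rank`. The mapping from rank to grid
// coordinates (x index varies fastest) is an assumption of this sketch.
inline Partition makePartition(int rank, int nRanks, int nCellsX, int nCellsY) {
    Partition p;
    p.nRanks = chooseProcessGrid(nRanks, nCellsX, nCellsY);
    p.rankCoords = {rank % p.nRanks[0], rank / p.nRanks[0]};
    splitRange(nCellsX, p.nRanks[0], p.rankCoords[0], p.offset[0], p.nCellsLocal[0]);
    splitRange(nCellsY, p.nRanks[1], p.rankCoords[1], p.offset[1], p.nCellsLocal[1]);
    return p;
}
```

In an MPI program, `rank` and `nRanks` would typically come from `MPI_Comm_rank` and `MPI_Comm_size`; each rank then allocates only its local cells plus a layer of ghost cells for the exchange with its neighbours.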

The procedure for the submission is similar to exercise 1. Again, upload a zip archive in the submission system. The following commands should compile and run your program:

unzip submission.zip                                   # this unzips the uploaded archive, filename may be different
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make install
mpirun -n 2 ./numsim_parallel lid_driven_cavity.txt    # "2" will be replaced by the number of processes for each test case; several different test cases will be simulated.

Note that the executable should now be called `numsim_parallel`. The number of MPI ranks is guaranteed to be a multiple of 2. If you implemented a CG solver (not required), hardcode its selection in the program, because the parameter file will again contain the line “pressureSolver = SOR”.

The result has to be written to “.vti” output files using the provided parallel output writer. There should be one output file per second of simulated time. For example, if \(t_{endTime}=10\), the program should write 10 or 11 output files (the output for t=0 is optional). The “.txt” file output is not needed and should be disabled, as it only slows down the program.
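One way to get exactly one output per simulated second is to shorten the time step whenever it would overshoot the next full second. The sketch below illustrates this idea under stated assumptions: the helper names `clampTimeStep` and `countOutputs` are hypothetical, the output interval is fixed at 1.0, and the actual call to the output writer is replaced by a counter.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: shorten dt so that the step ends exactly on the next
// output time (or on endTime, whichever comes first). The small tolerance
// avoids a tiny leftover step caused by floating-point rounding.
inline double clampTimeStep(double t, double dt, double nextOutputTime, double endTime) {
    assert(dt > 0.0);
    const double target = std::min(nextOutputTime, endTime);
    if (t + dt > target - 1e-10) {
        return target - t;
    }
    return dt;
}

// Hypothetical driver loop with a constant dt: counts how many output files
// would be written (the optional output at t = 0 is not counted).
inline int countOutputs(double endTime, double dt) {
    const double eps = 1e-10;
    double t = 0.0;
    double nextOutputTime = 1.0;  // one output per simulated second
    int nOutputs = 0;
    while (t < endTime - eps) {
        t += clampTimeStep(t, dt, nextOutputTime, endTime);
        if (t >= nextOutputTime - eps) {
            ++nOutputs;            // here the output writer would be called
            nextOutputTime += 1.0;
        }
    }
    return nOutputs;
}
```

In the parallel program the time step width has to be identical on all ranks, so the clamping would be applied after the global minimum of the time step has been determined; otherwise the ranks would not reach the output times together.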

The same rules as in exercise 1 apply: no external libraries except VTK and MPI are allowed, no hacking, and the code must have a reasonable object-oriented design and comments.

In this exercise, if the result is correct, a shorter runtime will give a higher score. (We plan to publish a “highscore” list of the 3 fastest submissions to all groups.)

Questions for the interview (“Abnahme”)

How do the durations for computation and communication relate to the number of processes?
How do the durations for computing the residual norm and determining the time step width relate to the number of processes?
How well does your program scale in the strong-scaling sense? Bring a plot of the parallel efficiency over the number of processes.
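For the plot, the usual definitions for strong scaling (fixed problem size) are the speedup \(S(p) = T(1)/T(p)\) and the parallel efficiency \(E(p) = T(1)/(p \cdot T(p))\), where \(T(p)\) is the measured runtime with \(p\) processes. A small sketch for computing these values from your own measurements could look as follows; the function names are hypothetical.

```cpp
#include <cassert>

// Speedup S(p) = T(1) / T(p) for a fixed problem size.
inline double speedup(double runtimeSerial, double runtimeParallel) {
    assert(runtimeSerial > 0.0 && runtimeParallel > 0.0);
    return runtimeSerial / runtimeParallel;
}

// Parallel efficiency E(p) = T(1) / (p * T(p)); a value of 1 means ideal scaling.
inline double parallelEfficiency(double runtimeSerial, int nProcesses, double runtimeParallel) {
    assert(nProcesses > 0);
    return speedup(runtimeSerial, runtimeParallel) / nProcesses;
}
```

As a purely hypothetical example, runtimes of 100 s on 1 process and 30 s on 4 processes would give a speedup of about 3.33 and an efficiency of 100 / (4 · 30) ≈ 0.83.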