|
MFC
High-fidelity multiphase flow simulation
|
MFC can be run using mfc.sh's run command. It supports interactive and batch execution. Batch mode is designed for multi-node distributed systems (supercomputers) equipped with a scheduler such as PBS, SLURM, or LSF. A full (and up-to-date) list of available arguments can be acquired with ./mfc.sh run -h.
MFC supports running simulations locally (Linux, MacOS, and Windows) as well as several supercomputer clusters, both interactively and through batch submission.
If you installed MFC via Homebrew, run cases with the mfc wrapper:
-n X to control the number of MPI processes (ranks).mfc <case.py> to run cases.build, test, clean, etc.), clone the repository and use ./mfc.sh.-t pre_process simulation, -n, and others; it always runs with preinstalled binaries.$(brew --prefix mfc)/examples/.-c <computer name> on the command line to instruct the MFC toolchain to make use of the template file toolchain/templates/<computer name>.mako. You can browse that directory and contribute your own files. Since systems and their schedulers do not have a standardized syntax to request certain resources, MFC can only provide support for a restricted subset of common or user-contributed configuration options. srun. You might need to invoke mpirun instead.-c <computer name> is left unspecified, it defaults to -c default.Please refer to ./mfc.sh run -h for a complete list of arguments and options, along with their defaults.
To run all stages of MFC, that is pre_process, simulation, and post_process on the sample case 2D_shockbubble,
If you want to run a subset of the available stages, you can use the -t argument. To use multiple threads, use the -n option along with the number of threads you wish to use. If a (re)build is required, it will be done automatically, with the number of threads specified with the -j option.
For example,
The MFC detects which scheduler your system is using and handles the creation and execution of batch scripts. The batch engine is requested via the -e batch option. The number of nodes can be specified with the -N (i.e., --nodes) option.
We provide a list of (baked-in) submission batch scripts in the toolchain/templates folder.
Other useful arguments include:
-# <job name> to name your job. (i.e., --name)-@ sample@example.com to receive emails from the scheduler. (i.e., --email)-w hh:mm:ss to specify the job's maximum allowed walltime. (i.e., --walltime)-a <account name> to identify the account to be charged for the job. (i.e., --account)-p <partition name> to select the job's partition. (i.e., --partition)As an example, one might request GPUs on a SLURM system using the following:
Disclaimer: IBM's JSRUN on LSF-managed computers does not use the traditional node-based approach to allocate resources. Therefore, the MFC constructs equivalent resource sets in the task and GPU count.
MFC provides two different arguments to facilitate profiling with NVIDIA Nsight. Please ensure the used argument is placed at the end so their respective flags can be appended.
./mfc.sh run ... -t simulation --nsys [nsys flags] allows one to visualize MFC's system-wide performance with NVIDIA Nsight Systems. NSys is best for understanding the order and execution times of major subroutines (WENO, Riemann, etc.) in MFC. When used, --nsys will run the simulation and generate .nsys-rep files in the case directory for all targets. These files can then be imported into Nsight System's GUI, which can be downloaded here. To keep the report files small, it is best to run case files with a few timesteps. Learn more about NVIDIA Nsight Systems here../mfc.sh run ... -t simulation --ncu [ncu flags] allows one to conduct kernel-level profiling with NVIDIA Nsight Compute. NCU provides profiling information for every subroutine called and is more detailed than NSys. When used, --ncu will output profiling information for all subroutines, including elapsed clock cycles, memory used, and more after the simulation is run. Adding this argument will significantly slow the simulation and should only be used on case files with a few timesteps. Learn more about NVIDIA Nsight Compute here../mfc.sh run ... -t simulation --rsys --hip-trace [rocprof flags] allows one to visualize MFC's system-wide performance with Perfetto UI. When used, --roc will run the simulation and generate files in the case directory for all targets. results.json can then be imported in Perfetto's UI. Learn more about AMD Rocprof here It is best to run case files with few timesteps to keep the report file sizes manageable../mfc.sh run ... -t simulation --rcu -n <name> [rocprof-compute flags] allows one to conduct kernel-level profiling with ROCm Compute Profiler. When used, --rcu will output profiling information for all subroutines, including rooflines, cache usage, register usage, and more, after the simulation is run. Adding this argument will moderately slow down the simulation and run the MFC executable several times. For this reason, it should only be used with case files with few timesteps.When running a simulation, MFC generates a ./restart_data folder in the case directory that contains lustre_*.dat files that can be used to restart a simulation from saved timesteps. This allows a user to simulate some timestep $X$, then continue it to run to another timestep $Y$, where $Y > X$. The user can also choose to add new patches at the intermediate timestep.
If you want to restart a simulation,
t_step_start : $t_i$t_step_stop : $t_f$t_step_save : $SF$ in which $t_i$ is the starting time, $t_f$ is the final time, and $SF$ is the saving frequency time. For a simulation that uses adaptive time-stepping, set up the initial case file with:n_start : $t_i$t_stop : $t_f$t_save : $SF$ in which $t_i$ is the starting time, $t_f$ is the final time, and $SF$ is the saving frequency time.pre_process and simulation on the case../mfc.sh run case.py -t pre_process simulation./restart_data.restart_case.py), which should have:m, n, and p:(xyz)_domainbeg(xyz)_domainendstretch_(xyz)a_(xyz)(xyz)_a(xyz)_bWhen using a constant time-step, alter the following:
t_step_start : $t_s$ (the point at which the simulation will restart)t_step_stop : $t_{f2}$ (new final simulation time, which can be the same as $t_f$)t_step_save : ${SF}_2$ (if interested in changing the saving frequency)If using a CFL-based time-step, alter the following:
n_start : $t_s$ (the save file at which the simulation will restart)t_stop : $t_{f2}$ (new final simulation time, which can be the same as $t_f$)t_save : ${SF}_2$ (if interested in changing the saving frequency)old_ic : 'T' (to specify that we have initial conditions from previous simulations)old_grid : 'T' (to specify that we have a grid from previous simulations)t_step_old : $t_i$ (the time step used as the t_step_start of the original case.py file)num_patches to reflect the number of ADDED patches in the restart_case.py file. If no patches are added, set num_patches: 0case.py file) removed.patch_icpp(1)all variablespatch_icpp(2)all variablespatch_icpp(num_patches)all variablespatch_icpp(1)some variables of interest restart_case.py./mfc.sh run restart_case.py -t pre_process simulationt_step_start, t_step_stop] range, with t_step_save as the spacing between files.t_step_stop to the restarting point $t_s$ in case.py. Then, run the commands below. The first command will run on timesteps $[t_i, t_s]$. The second command will run on $[t_s, t_{f2}]$. Therefore, the whole range $[t_i, t_{f2}]$ will be post processed.We have provided an example, case.py and restart_case.py in /examples/1D_vacuum_restart/. This simulation is a duplicate of the 1D_vacuum case. It demonstrates stopping at timestep 7000, adding a new patch, and restarting the simulation. To test this code, run: