MFC can be run using mfc.sh's run command. It supports interactive and batch execution. Batch mode is designed for multi-node distributed systems (supercomputers) equipped with a scheduler such as PBS, SLURM, or LSF. A full (and up-to-date) list of available arguments can be acquired with ./mfc.sh run -h.
MFC supports running simulations locally (Linux, macOS, and Windows) as well as on several supercomputer clusters, both interactively and through batch submission.
Pass -c <computer name> on the command line to instruct the MFC toolchain to make use of the template file toolchain/templates/<computer name>.mako. You can browse that directory and contribute your own files. Since systems and their schedulers do not have a standardized syntax to request certain resources, MFC can only provide support for a restricted subset of common or user-contributed configuration options. For instance, if interactive job launches through srun fail on your system, you might need to invoke mpirun instead. If -c <computer name> is left unspecified, it defaults to -c default.

Please refer to ./mfc.sh run -h for a complete list of arguments and options, along with their defaults.
To run all stages of MFC (pre_process, simulation, and post_process) on the sample case 2D_shockbubble, pass that case's input file to ./mfc.sh run.
If you want to run a subset of the available stages, you can use the -t argument. To use multiple threads, use the -n option along with the number of threads you wish to use. If a (re)build is required, it will be done automatically, with the number of threads specified with the -j option. For example, -t pre_process simulation -n 4 runs only pre_process and simulation, using 4 threads.
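The options above can be combined in a single invocation. The following is a minimal Python sketch that assembles such command lines. The helper mfc_run_command is hypothetical (it is not part of MFC), and the case path examples/2D_shockbubble/case.py is an assumption about where the sample case lives. Only the -t, -n, and -j options described above are used.

```python
import shlex


def mfc_run_command(case, targets=None, threads=None, build_threads=None):
    """Assemble an ./mfc.sh run command line (illustrative helper, not part of MFC)."""
    argv = ["./mfc.sh", "run", case]
    if targets:
        argv += ["-t", *targets]            # subset of stages to run
    if threads is not None:
        argv += ["-n", str(threads)]        # threads used to run the stages
    if build_threads is not None:
        argv += ["-j", str(build_threads)]  # threads used for an automatic (re)build
    return argv


# Assumed location of the sample case; adjust to your checkout.
case = "examples/2D_shockbubble/case.py"

all_stages = mfc_run_command(case)
subset = mfc_run_command(case, targets=["pre_process", "simulation"],
                         threads=4, build_threads=8)

print(shlex.join(all_stages))
print(shlex.join(subset))
```

The printed strings can be pasted into a shell. To launch a run from Python instead, the list can be handed to subprocess.run(argv, check=True) from the MFC root directory.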
MFC detects which scheduler your system is using and handles the creation and execution of batch scripts. The batch engine is requested via the -e batch option. The number of nodes can be specified with the -N (i.e., --nodes) option.

We provide a list of (baked-in) submission batch scripts in the toolchain/templates folder.
Other useful arguments include:

- -# <job name> to name your job (i.e., --name).
- -@ sample@example.com to receive emails from the scheduler (i.e., --email).
- -w hh:mm:ss to specify the job's maximum allowed walltime (i.e., --walltime).
- -a <account name> to identify the account to be charged for the job (i.e., --account).
- -p <partition name> to select the job's partition (i.e., --partition).

These options can be combined. As an example, one might use them to request GPUs on a SLURM system.
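As a rough illustration of a batch submission, the sketch below builds a command from the long option names listed above. The helper mfc_batch_command is hypothetical, and so are the job name, the account, and the partition name gpu (many SLURM systems expose GPUs through a dedicated partition, but the name is site-specific). Any further GPU-related scheduler settings depend on the template selected with -c and are not shown.

```python
import re
import shlex

# Long option names taken from the list above; the helper itself is illustrative.
BATCH_OPTIONS = ("name", "email", "walltime", "account", "partition")


def mfc_batch_command(case, nodes, **options):
    """Assemble an ./mfc.sh run command that requests the batch engine."""
    unknown = sorted(set(options) - set(BATCH_OPTIONS))
    if unknown:
        raise ValueError(f"unsupported options: {unknown}")
    walltime = options.get("walltime")
    if walltime is not None and not re.fullmatch(r"\d+:\d{2}:\d{2}", walltime):
        raise ValueError("walltime must be formatted as hh:mm:ss")
    argv = ["./mfc.sh", "run", case, "-e", "batch", "--nodes", str(nodes)]
    for name in BATCH_OPTIONS:
        if name in options:
            argv += [f"--{name}", str(options[name])]
    return argv


# All values below are placeholders for illustration.
argv = mfc_batch_command(
    "examples/2D_shockbubble/case.py",
    nodes=2,
    name="shockbubble",
    walltime="01:00:00",
    account="my_account",
    partition="gpu",
)
print(shlex.join(argv))
```

Validating the walltime format before submission is a design choice of this sketch: it catches a malformed value locally rather than after the scheduler rejects the job.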
Disclaimer: IBM's JSRUN on LSF-managed computers does not use the traditional node-based approach to allocate resources. Therefore, MFC constructs resource sets that are equivalent in task and GPU count.
MFC provides two different arguments to facilitate profiling with NVIDIA Nsight. Please ensure the chosen argument is placed at the end of the command so that its respective flags can be appended.

- ./mfc.sh run ... -t simulation --nsys [nsys flags] allows one to visualize MFC's system-wide performance with NVIDIA Nsight Systems. NSys is best for understanding the order and execution times of major subroutines (WENO, Riemann, etc.) in MFC. When used, --nsys will run the simulation and generate .nsys-rep files in the case directory for all targets. These files can then be imported into the Nsight Systems GUI. To keep the report files small, it is best to run case files with only a few timesteps. Learn more in the NVIDIA Nsight Systems documentation.
- ./mfc.sh run ... -t simulation --ncu [ncu flags] allows one to conduct kernel-level profiling with NVIDIA Nsight Compute. NCU provides profiling information for every subroutine called and is more detailed than NSys. When used, --ncu will output profiling information for all subroutines, including elapsed clock cycles, memory used, and more, after the simulation is run. Adding this argument will significantly slow the simulation, so it should only be used on case files with a few timesteps. Learn more in the NVIDIA Nsight Compute documentation.

Two further arguments support profiling with AMD's tools (Rocprof and Omniperf):

- ./mfc.sh run ... -t simulation --roc --hip-trace [rocprof flags] allows one to visualize MFC's system-wide performance with Perfetto UI. When used, --roc will run the simulation and generate files in the case directory for all targets. results.json can then be imported into Perfetto's UI. It is best to run case files with few timesteps to keep the report file sizes manageable. Learn more in the AMD Rocprof documentation.
- ./mfc.sh run ... -t simulation --omni [omniperf flags] allows one to conduct kernel-level profiling with AMD's Omniperf. When used, --omni will output profiling information for all subroutines, including rooflines, cache usage, register usage, and more, after the simulation is run. Adding this argument will moderately slow down the simulation and run the MFC executable several times. For this reason, it should only be used with case files with few timesteps.

When running a simulation, MFC generates a ./restart_data folder in the case directory that contains lustre_*.dat files that can be used to restart a simulation from saved timesteps. This allows a user to simulate up to some timestep X, then continue the run to another timestep Y, where Y > X. The user can also choose to add new patches at the intermediate timestep.
If you want to restart a simulation:

1. Set up the initial simulation. With a constant time-step, set:
   - t_step_start: t_i
   - t_step_stop: t_f
   - t_step_save: SF

   For a simulation that uses adaptive time-stepping, set up the initial case file with:
   - n_start: t_i
   - t_stop: t_f
   - t_save: SF

   In both cases, t_i is the starting time, t_f is the final time, and SF is the saving frequency.
2. Run pre_process and simulation on the case:
   ./mfc.sh run case.py -t pre_process simulation
3. The restart files for the saved timesteps are written to ./restart_data.
4. Create a new case file (restart_case.py), which should have:
   - All of the following domain and grid parameters removed, keeping only m, n, and p:
     - (xyz)_domainbeg
     - (xyz)_domainend
     - stretch_(xyz)
     - a_(xyz)
     - (xyz)_a
     - (xyz)_b
   - When using a constant time-step, the following altered:
     - t_step_start: t_s (the point at which the simulation will restart)
     - t_step_stop: t_{f2} (new final simulation time, which can be the same as t_f)
     - t_step_save: SF_2 (if interested in changing the saving frequency)
   - If using a CFL-based time-step, the following altered:
     - n_start: t_s (the save file at which the simulation will restart)
     - t_stop: t_{f2} (new final simulation time, which can be the same as t_f)
     - t_save: SF_2 (if interested in changing the saving frequency)
   - The following added:
     - old_ic: 'T' (to specify that we have initial conditions from previous simulations)
     - old_grid: 'T' (to specify that we have a grid from previous simulations)
     - t_step_old: t_i (the time step used as the t_step_start of the original case.py file)
   - num_patches set to reflect the number of ADDED patches in the restart_case.py file. If no patches are added, set num_patches: 0.
   - All patches used in the original case.py file removed:
     - patch_icpp(1) (all variables)
     - patch_icpp(2) (all variables)
     - patch_icpp(num_patches) (all variables)
   - Any new patches added, for example:
     - patch_icpp(1) (some variables of interest)
5. Run pre_process and simulation on restart_case.py:
   ./mfc.sh run restart_case.py -t pre_process simulation
To post-process a restarted run, keep in mind that, whichever case file is used, post_process generates output files over the [t_step_start, t_step_stop] range, with t_step_save as the spacing between files. One approach is to set t_step_stop to the restarting point t_s in case.py, then run post_process first on case.py and then on restart_case.py. The first run covers timesteps [t_i, t_s] and the second covers [t_s, t_{f2}], so the whole range [t_i, t_{f2}] is post-processed.

We have provided an example, case.py and restart_case.py, in /examples/1D_vacuum_restart/. This simulation is a duplicate of the 1D_vacuum case. It demonstrates stopping at timestep 7000, adding a new patch, and restarting the simulation. To test it, run pre_process and simulation on case.py and then on restart_case.py, followed by post_process on each.