MFC
Exascale flow solver
m_mpi_common.fpp.f90
1# 1 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2!>
3!! @file
4!! @brief Contains module m_mpi_common
5
6# 1 "/home/runner/work/MFC/MFC/src/common/include/case.fpp" 1
7! This file exists so that Fypp can be run without generating case.fpp files for
8! each target. This is useful when generating documentation, for example. This
9! should also let MFC be built with CMake directly, without invoking mfc.sh.
10
11! For pre-process.
12# 9 "/home/runner/work/MFC/MFC/src/common/include/case.fpp"
13
14! For moving immersed boundaries in simulation
15# 14 "/home/runner/work/MFC/MFC/src/common/include/case.fpp"
16# 6 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp" 2
17# 1 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp" 1
18# 1 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 1
19# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
20# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
21# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
22# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
23# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
24# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
25
26# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
27# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
28# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
29
30# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
31
32# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
33
34# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
35
36# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
37
38# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
39
40# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
41
42# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
43! New line at end of file is required for FYPP
44# 2 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
45# 1 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp" 1
46# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
47# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
48# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
49# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
50# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
51# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
52
53# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
54# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
55# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
56
57# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
58
59# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
60
61# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
62
63# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
64
65# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
66
67# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
68
69# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
70! New line at end of file is required for FYPP
71# 2 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp" 2
72
73# 4 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
74# 5 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
75# 6 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
76# 7 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
77# 8 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
78
79# 20 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
80
81# 43 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
82
83# 48 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
84
85# 53 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
86
87# 58 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
88
89# 63 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
90
91# 68 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
92
93# 76 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
94
95# 81 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
96
97# 86 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
98
99# 91 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
100
101# 96 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
102
103# 101 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
104
105# 106 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
106
107# 111 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
108
109# 116 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
110
111# 121 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
112
113# 151 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
114
115# 192 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
116
117# 206 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
118
119# 231 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
120
121# 242 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
122
123# 244 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
124# 255 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
125
126# 284 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
127
128# 294 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
129
130# 304 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
131
132# 313 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
133
134# 330 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
135
136# 340 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
137
138# 347 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
139
140# 353 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
141
142# 359 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
143
144# 365 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
145
146# 371 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
147
148# 377 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
149! New line at end of file is required for FYPP
150# 3 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
151# 1 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp" 1
152# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
153# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
154# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
155# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
156# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
157# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
158
159# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
160# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
161# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
162
163# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
164
165# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
166
167# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
168
169# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
170
171# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
172
173# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
174
175# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
176! New line at end of file is required for FYPP
177# 2 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp" 2
178
179# 7 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
180
181# 17 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
182
183# 22 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
184
185# 27 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
186
187# 32 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
188
189# 37 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
190
191# 42 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
192
193# 47 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
194
195# 52 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
196
197# 57 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
198
199# 62 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
200
201# 73 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
202
203# 78 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
204
205# 83 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
206
207# 88 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
208
209# 103 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
210
211# 131 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
212
213# 160 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
214
215# 175 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
216
217# 193 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
218
219# 215 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
220
221# 244 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
222
223# 259 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
224
225# 269 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
226
227# 278 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
228
229# 294 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
230
231# 304 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
232
233# 311 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
234! New line at end of file is required for FYPP
235# 4 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
236
237! GPU parallel region (scalar reductions, maxval/minval)
238# 23 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
239
240! GPU parallel loop over threads (most common GPU macro)
241# 43 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
242
243! Required closing for GPU_PARALLEL_LOOP
244# 55 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
245
246! Mark routine for device compilation
247# 112 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
248
249! Declare device-resident data
250# 130 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
251
252! Inner loop within a GPU parallel region
253# 145 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
254
255! Scoped GPU data region
256# 164 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
257
258! Host code with device pointers (for MPI with GPU buffers)
259# 193 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
260
261! Allocate device memory (unscoped)
262# 207 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
263
264! Free device memory
265# 219 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
266
267! Atomic operation on device
268# 231 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
269
270! End atomic capture block
271# 242 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
272
273! Copy data between host and device
274# 254 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
275
276! Synchronization barrier
277# 266 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
278
279! Import GPU library module (openacc or omp_lib)
280# 275 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
281
282! Emit code only for AMD compiler
283# 282 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
284
285! Emit code for non-Cray compilers
286# 289 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
287
288! Emit code only for Cray compiler
289# 296 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
290
291! Emit code for non-NVIDIA compilers
292# 303 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
293
294# 305 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
295# 306 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
296! New line at end of file is required for FYPP
297# 2 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp" 2
298
299# 14 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
300
301! Caution: This macro requires the use of a binding script to set CUDA_VISIBLE_DEVICES, such that we have one GPU device per MPI
302! rank. That's because for both cudaMemAdvise (preferred location) and cudaMemPrefetchAsync we use location = device_id = 0. For an
303! example see misc/nvidia_uvm/bind.sh. NVIDIA unified memory page placement hint
304# 57 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
305
306! Allocate and create GPU device memory
307# 77 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
308
309! Free GPU device memory and deallocate
310# 85 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
311
312! Cray-specific GPU pointer setup for vector fields
313# 109 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
314
315! Cray-specific GPU pointer setup for scalar fields
316# 125 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
317
318! Cray-specific GPU pointer setup for acoustic source spatials
319# 150 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
320
321# 156 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
322
323# 163 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
324! New line at end of file is required for FYPP
325# 7 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp" 2
326
!> @brief MPI communication layer: domain decomposition, halo exchange, reductions, and parallel I/O setup
module m_mpi_common

#ifdef MFC_MPI
    use mpi !< Message passing interface (MPI) module
#endif

    use m_derived_types !< Definitions of the derived types
    use m_global_parameters !< Definitions of the global parameters
    use m_helper
    use ieee_arithmetic
    use m_nvtx

    implicit none

    integer, private :: v_size

#if defined(MFC_OpenACC)
!$acc declare create(v_size)
#elif defined(MFC_OpenMP)
!$omp declare target (v_size)
#endif

    real(wp), private, allocatable, dimension(:) :: buff_send !< Primitive variable send buffer for halo exchange
    real(wp), private, allocatable, dimension(:) :: buff_recv !< Primitive variable receive buffer for halo exchange
#ifndef __NVCOMPILER_GPU_UNIFIED_MEM
#if defined(MFC_OpenACC)
!$acc declare create(buff_send, buff_recv)
#elif defined(MFC_OpenMP)
!$omp declare target (buff_send, buff_recv)
#endif
#endif

    integer(kind=8) :: halo_size

#if defined(MFC_OpenACC)
!$acc declare create(halo_size)
#elif defined(MFC_OpenMP)
!$omp declare target (halo_size)
#endif

contains

    !> Initialize the module.
    impure subroutine s_initialize_mpi_common_module

#ifdef MFC_MPI
        ! Allocate buff_send and buff_recv. Note that, for the sake of simplicity,
        ! both variables are provided sufficient storage to hold the largest
        ! buffer in the computational domain.

        if (qbmm .and. .not. polytropic) then
            v_size = sys_size + 2*nb*nnode
        else if (chemistry .and. chem_params%diffusion) then
            v_size = sys_size + 1
        else
            v_size = sys_size
        end if

        if (n > 0) then
            if (p > 0) then
                halo_size = nint(-1._wp + 1._wp*buff_size*(v_size)*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)*(p + 2*buff_size &
                                 & + 1)/(cells_bounds%mnp_min + 2*buff_size + 1))
            else
                halo_size = -1 + buff_size*(v_size)*(cells_bounds%mn_max + 2*buff_size + 1)
            end if
        else
            halo_size = -1 + buff_size*(v_size)
        end if
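
        ! Sizing note (an added reading of the expressions above, not original
        ! text): the 3D branch divides the product of the three ghost-extended
        ! extents by the smallest one (cells_bounds%mnp_min + 2*buff_size + 1),
        ! leaving the product of the two largest, i.e. the largest cross-section
        ! a halo exchange can present; the buffers are indexed 0:halo_size,
        ! hence the -1.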

#if defined(MFC_OpenACC)
!$acc update device(halo_size, v_size)
#elif defined(MFC_OpenMP)
!$omp target update to(halo_size, v_size)
#endif

#ifndef __NVCOMPILER_GPU_UNIFIED_MEM
#ifdef MFC_DEBUG
        block
            use iso_fortran_env, only: output_unit
            print *, 'm_mpi_common.fpp:66: ', '@:ALLOCATE(buff_send(0:halo_size), buff_recv(0:halo_size))'
            call flush (output_unit)
        end block
#endif
        allocate (buff_send(0:halo_size), buff_recv(0:halo_size))

#if defined(MFC_OpenACC)
!$acc enter data create(buff_send, buff_recv)
#elif defined(MFC_OpenMP)
!$omp target enter data map(always,alloc:buff_send, buff_recv)
#endif
#else
        allocate (buff_send(0:halo_size), buff_recv(0:halo_size))

#if defined(MFC_OpenACC)
!$acc enter data create(capture:buff_send)
#elif defined(MFC_OpenMP)
!$omp target enter data map(always,alloc:capture:buff_send)
#endif

#if defined(MFC_OpenACC)
!$acc enter data create(capture:buff_recv)
#elif defined(MFC_OpenMP)
!$omp target enter data map(always,alloc:capture:buff_recv)
#endif
#endif
#endif

    end subroutine s_initialize_mpi_common_module

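    ! Design note (added; a reading of the code above, not original text): the
    ! enter data directives keep buff_send/buff_recv resident on the device,
    ! so a GPU-aware MPI build can hand device pointers straight to the halo
    ! exchange without staging each message through host memory.
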
    !> Initialize the MPI execution environment and query the number of processors and local rank.
    impure subroutine s_mpi_initialize

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call mpi_init(ierr)

        if (ierr /= mpi_success) then
            print '(A)', 'Unable to initialize MPI environment. Exiting.'
            call mpi_abort(mpi_comm_world, 1, ierr)
        end if

        call mpi_comm_size(mpi_comm_world, num_procs, ierr)

        call mpi_comm_rank(mpi_comm_world, proc_rank, ierr)
#else
        num_procs = 1
        proc_rank = 0
#endif

    end subroutine s_mpi_initialize

    !> Set up MPI I/O data views and variable pointers for parallel file output.
    impure subroutine s_initialize_mpi_data(q_cons_vf, ib_markers, beta)

        type(scalar_field), dimension(sys_size), intent(in) :: q_cons_vf
        type(integer_field), optional, intent(in) :: ib_markers
        type(scalar_field), intent(in), optional :: beta
        integer, dimension(num_dims) :: sizes_glb, sizes_loc

#ifdef MFC_MPI
        integer :: i, j
        integer :: ierr !< Generic flag used to identify and report MPI errors
        integer :: alt_sys

        if (present(beta)) then
            alt_sys = sys_size + 1
        else
            alt_sys = sys_size
        end if

        do i = 1, sys_size
            mpi_io_data%var(i)%sf => q_cons_vf(i)%sf(0:m, 0:n, 0:p)
        end do

        if (present(beta)) then
            mpi_io_data%var(alt_sys)%sf => beta%sf(0:m, 0:n, 0:p)
        end if

        ! Additional variables pb and mv for non-polytropic qbmm
        if (qbmm .and. .not. polytropic) then
            do i = 1, nb
                do j = 1, nnode
#ifdef MFC_PRE_PROCESS
                    mpi_io_data%var(sys_size + (i - 1)*nnode + j)%sf => pb%sf(0:m, 0:n, 0:p, j, i)
                    mpi_io_data%var(sys_size + (i - 1)*nnode + j + nb*nnode)%sf => mv%sf(0:m, 0:n, 0:p, j, i)
#elif defined (MFC_SIMULATION)
                    mpi_io_data%var(sys_size + (i - 1)*nnode + j)%sf => pb_ts(1)%sf(0:m, 0:n, 0:p, j, i)
                    mpi_io_data%var(sys_size + (i - 1)*nnode + j + nb*nnode)%sf => mv_ts(1)%sf(0:m, 0:n, 0:p, j, i)
#endif
                end do
            end do
        end if

        ! Define the global (glb) and local (loc) sizes for the flow variables
        sizes_glb(1) = m_glb + 1; sizes_loc(1) = m + 1
        if (n > 0) then
            sizes_glb(2) = n_glb + 1; sizes_loc(2) = n + 1
            if (p > 0) then
                sizes_glb(num_dims) = p_glb + 1; sizes_loc(num_dims) = p + 1
            end if
        end if

        ! Define the view for each variable
        do i = 1, alt_sys
            call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_p, &
                                          & mpi_io_data%view(i), ierr)
            call mpi_type_commit(mpi_io_data%view(i), ierr)
        end do

#ifndef MFC_POST_PROCESS
        if (qbmm .and. .not. polytropic) then
            do i = sys_size + 1, sys_size + 2*nb*nnode
                call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_p, &
                                              & mpi_io_data%view(i), ierr)
                call mpi_type_commit(mpi_io_data%view(i), ierr)
            end do
        end if
#endif

#ifndef MFC_PRE_PROCESS
        if (present(ib_markers)) then
            mpi_io_ib_data%var%sf => ib_markers%sf(0:m, 0:n, 0:p)

            call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_integer, &
                                          & mpi_io_ib_data%view, ierr)
            call mpi_type_commit(mpi_io_ib_data%view, ierr)
        end if
#endif
#endif

    end subroutine s_initialize_mpi_data

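    ! Note (an added sketch of the MPI-IO mechanics above, not original text):
    ! each MPI_TYPE_CREATE_SUBARRAY call describes this rank's
    ! (m+1)x(n+1)x(p+1) block placed at offset start_idx inside the global
    ! (m_glb+1)x(n_glb+1)x(p_glb+1) array, so collective writes using these
    ! views assemble one shared file in global order.
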
    !> Set up MPI I/O data views for downsampled (coarsened) parallel file output.
    subroutine s_initialize_mpi_data_ds(q_cons_vf)

        type(scalar_field), dimension(sys_size), intent(in) :: q_cons_vf
        integer, dimension(num_dims) :: sizes_loc
        integer, dimension(3) :: sf_start_idx

#ifdef MFC_MPI
        integer :: i, m_ds, n_ds, p_ds, ierr

        sf_start_idx = (/0, 0, 0/)

#ifndef MFC_POST_PROCESS
        m_ds = int((m + 1)/3) - 1
        n_ds = int((n + 1)/3) - 1
        p_ds = int((p + 1)/3) - 1
#else
        m_ds = m
        n_ds = n
        p_ds = p
#endif

#ifdef MFC_POST_PROCESS
        do i = 1, sys_size
            mpi_io_data%var(i)%sf => q_cons_vf(i)%sf(-1:m_ds + 1, -1:n_ds + 1, -1:p_ds + 1)
        end do
#endif
        ! Define the local sizes for the downsampled flow variables
        sizes_loc(1) = m_ds + 3
        if (n > 0) then
            sizes_loc(2) = n_ds + 3
            if (p > 0) then
                sizes_loc(num_dims) = p_ds + 3
            end if
        end if

        ! Define the view for each variable
        do i = 1, sys_size
            call mpi_type_create_subarray(num_dims, sizes_loc, sizes_loc, sf_start_idx, mpi_order_fortran, mpi_p, &
                                          & mpi_io_data%view(i), ierr)
            call mpi_type_commit(mpi_io_data%view(i), ierr)
        end do
#endif

    end subroutine s_initialize_mpi_data_ds

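    ! Note (added reading of the code above): outside of post-process, the grid
    ! is coarsened by a factor of 3 per direction, so m_ds + 1 = (m + 1)/3
    ! cells; the local sizes of m_ds + 3, etc., cover the coarse cells plus one
    ! ghost layer on each side, matching the -1:m_ds + 1 pointer bounds.
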
    !> Gather variable-length real vectors from all MPI ranks onto the root process.
    impure subroutine s_mpi_gather_data(my_vector, counts, gathered_vector, root)

        integer, intent(in) :: counts !< Length of the vector on this process
        real(wp), intent(in), dimension(counts) :: my_vector !< Input vector on each process
        integer, intent(in) :: root !< Rank of the root process
        real(wp), allocatable, intent(out) :: gathered_vector(:) !< Gathered vector on the root process
        integer :: i
        integer :: ierr !< Generic flag used to identify and report MPI errors
        integer, allocatable :: recounts(:), displs(:)

#ifdef MFC_MPI
        allocate (recounts(num_procs))

        ! Collect every rank's vector length on the root
        call mpi_gather(counts, 1, mpi_integer, recounts, 1, mpi_integer, root, mpi_comm_world, ierr)

        ! Exclusive prefix sum of the lengths gives each rank's offset
        allocate (displs(size(recounts)))

        displs(1) = 0

        do i = 2, size(recounts)
            displs(i) = displs(i - 1) + recounts(i - 1)
        end do

        allocate (gathered_vector(sum(recounts)))
        call mpi_gatherv(my_vector, counts, mpi_p, gathered_vector, recounts, displs, mpi_p, root, mpi_comm_world, ierr)
#endif

    end subroutine s_mpi_gather_data

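    ! Example (illustrative, not from the original source): with three ranks
    ! contributing counts of 4, 2, and 3 elements, recounts = (/4, 2, 3/) and
    ! displs = (/0, 4, 6/), so the root receives a gathered_vector of length 9
    ! with rank 1's data starting at offset 4.
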
    !> Gather per-rank time step wall-clock times onto rank 0 for performance reporting.
    impure subroutine mpi_bcast_time_step_values(proc_time, time_avg)

        real(wp), dimension(0:num_procs - 1), intent(inout) :: proc_time
        real(wp), intent(inout) :: time_avg

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call mpi_gather(time_avg, 1, mpi_p, proc_time(0), 1, mpi_p, 0, mpi_comm_world, ierr)
#endif

    end subroutine mpi_bcast_time_step_values

    !> Print a case file error with the prohibited condition and message, then abort execution.
    impure subroutine s_prohibit_abort(condition, message)

        character(len=*), intent(in) :: condition, message

        print *, ""
        print *, "CASE FILE ERROR"
        print *, " - Prohibited condition: ", trim(condition)
        if (len_trim(message) > 0) then
            print *, " - Note: ", trim(message)
        end if
        print *, ""
        call s_mpi_abort(code=case_file_error_code)

    end subroutine s_prohibit_abort

    !> The goal of this subroutine is to determine the global extrema of the stability criteria in the computational domain. This is
    !! performed by sifting through the local extrema of each stability criterion. Note that each of the local extrema is from a
    !! single process, within its assigned section of the computational domain. Finally, note that the global extrema values are
    !! only bookkept on the rank 0 processor.
    impure subroutine s_mpi_reduce_stability_criteria_extrema(icfl_max_loc, vcfl_max_loc, Rc_min_loc, icfl_max_glb, vcfl_max_glb, &
                                                              & Rc_min_glb)

        real(wp), intent(in) :: icfl_max_loc
        real(wp), intent(in) :: vcfl_max_loc
        real(wp), intent(in) :: rc_min_loc
        real(wp), intent(out) :: icfl_max_glb
        real(wp), intent(out) :: vcfl_max_glb
        real(wp), intent(out) :: rc_min_glb

        icfl_max_glb = icfl_max_loc
        vcfl_max_glb = vcfl_max_loc
        rc_min_glb = rc_min_loc

#ifdef MFC_SIMULATION
#ifdef MFC_MPI
        block
            integer :: ierr

            call mpi_reduce(icfl_max_loc, icfl_max_glb, 1, mpi_p, mpi_max, 0, mpi_comm_world, ierr)

            if (viscous) then
                call mpi_reduce(vcfl_max_loc, vcfl_max_glb, 1, mpi_p, mpi_max, 0, mpi_comm_world, ierr)
                call mpi_reduce(rc_min_loc, rc_min_glb, 1, mpi_p, mpi_min, 0, mpi_comm_world, ierr)
            end if
        end block
#else
        icfl_max_glb = icfl_max_loc

        if (viscous) then
            vcfl_max_glb = vcfl_max_loc
            rc_min_glb = rc_min_loc
        end if
#endif
#endif

    end subroutine s_mpi_reduce_stability_criteria_extrema

    !> Reduce a local real value to its global sum across all MPI ranks.
    impure subroutine s_mpi_allreduce_sum(var_loc, var_glb)

        real(wp), intent(in) :: var_loc
        real(wp), intent(out) :: var_glb

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_sum, mpi_comm_world, ierr)
#endif

    end subroutine s_mpi_allreduce_sum

    !> Reduce an array of vectors to their global sums across all MPI ranks.
    impure subroutine s_mpi_allreduce_vectors_sum(var_loc, var_glb, num_vectors, vector_length)

        integer, intent(in) :: num_vectors, vector_length
        real(wp), dimension(:, :), intent(in) :: var_loc
        real(wp), dimension(:, :), intent(inout) :: var_glb

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        ! If the caller passed the same array as input and output, MPI requires
        ! MPI_IN_PLACE rather than aliased buffers
        if (loc(var_loc) == loc(var_glb)) then
            call mpi_allreduce(mpi_in_place, var_glb, num_vectors*vector_length, mpi_p, mpi_sum, mpi_comm_world, ierr)
        else
            call mpi_allreduce(var_loc, var_glb, num_vectors*vector_length, mpi_p, mpi_sum, mpi_comm_world, ierr)
        end if
#else
        var_glb(1:num_vectors, 1:vector_length) = var_loc(1:num_vectors, 1:vector_length)
#endif

    end subroutine s_mpi_allreduce_vectors_sum

    !> Reduce a local integer value to its global sum across all MPI ranks.
    impure subroutine s_mpi_allreduce_integer_sum(var_loc, var_glb)

        integer, intent(in) :: var_loc
        integer, intent(out) :: var_glb

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call mpi_allreduce(var_loc, var_glb, 1, mpi_integer, mpi_sum, mpi_comm_world, ierr)
#else
        var_glb = var_loc
#endif

    end subroutine s_mpi_allreduce_integer_sum

    !> Reduce a local real value to its global minimum across all MPI ranks.
    impure subroutine s_mpi_allreduce_min(var_loc, var_glb)

        real(wp), intent(in) :: var_loc
        real(wp), intent(out) :: var_glb

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_min, mpi_comm_world, ierr)
#endif

    end subroutine s_mpi_allreduce_min

    !> Reduce a local real value to its global maximum across all MPI ranks.
    impure subroutine s_mpi_allreduce_max(var_loc, var_glb)

        real(wp), intent(in) :: var_loc
        real(wp), intent(out) :: var_glb

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_max, mpi_comm_world, ierr)
#endif

    end subroutine s_mpi_allreduce_max

    !> Reduce a local real value to its global minimum across all ranks.
    impure subroutine s_mpi_reduce_min(var_loc)

        real(wp), intent(inout) :: var_loc

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors
        real(wp) :: var_glb

        call mpi_reduce(var_loc, var_glb, 1, mpi_p, mpi_min, 0, mpi_comm_world, ierr)

        call mpi_bcast(var_glb, 1, mpi_p, 0, mpi_comm_world, ierr)

        var_loc = var_glb
#endif

    end subroutine s_mpi_reduce_min

    !> Reduce a local 2-element (value, rank) pair to its global maximum value and the owning processor rank (MPI_MAXLOC), across all ranks.
    impure subroutine s_mpi_reduce_maxloc(var_loc)

        real(wp), dimension(2), intent(inout) :: var_loc

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors
        real(wp), dimension(2) :: var_glb !< Reduced (max value, rank) pair

        call mpi_reduce(var_loc, var_glb, 1, mpi_2p, mpi_maxloc, 0, mpi_comm_world, ierr)

        call mpi_bcast(var_glb, 1, mpi_2p, 0, mpi_comm_world, ierr)

        var_loc = var_glb
#endif

    end subroutine s_mpi_reduce_maxloc

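    ! Usage note (added; standard MPI_MAXLOC semantics, not original text):
    ! the pair is interpreted as (value, index), so callers put the local value
    ! in var_loc(1) and their rank, stored as a real, in var_loc(2); after the
    ! bcast every rank holds the global maximum and the rank that owns it.
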
    !> The subroutine terminates the MPI execution environment.
    impure subroutine s_mpi_abort(prnt, code)

        character(len=*), intent(in), optional :: prnt
        integer, intent(in), optional :: code

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors
#endif

        if (present(prnt)) then
            print *, prnt
            call flush (6)
        end if

#ifndef MFC_MPI
        if (present(code)) then
            stop code
        else
            stop 1
        end if
#else
        if (present(code)) then
            call mpi_abort(mpi_comm_world, code, ierr)
        else
            call mpi_abort(mpi_comm_world, 1, ierr)
        end if
#endif

    end subroutine s_mpi_abort

    !> Halts all processes until all have reached the barrier.
    impure subroutine s_mpi_barrier

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call mpi_barrier(mpi_comm_world, ierr)
#endif

    end subroutine s_mpi_barrier

    !> The subroutine finalizes the MPI execution environment.
    impure subroutine s_mpi_finalize

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call mpi_finalize(ierr)
#endif

    end subroutine s_mpi_finalize

    !> The goal of this procedure is to populate the buffers of the cell-average conservative variables by communicating with the
    !! neighboring processors.
    subroutine s_mpi_sendrecv_variables_buffers(q_comm, mpi_dir, pbc_loc, nVar, pb_in, mv_in, q_T_sf)

        type(scalar_field), dimension(1:), intent(inout) :: q_comm
        real(stp), optional, dimension(idwbuff(1)%beg:, idwbuff(2)%beg:, idwbuff(3)%beg:, 1:, 1:), intent(inout) :: pb_in, mv_in
        integer, intent(in) :: mpi_dir, pbc_loc, nVar
        type(scalar_field), optional, intent(inout) :: q_T_sf
        integer :: i, j, k, l, r, q !< Generic loop iterators
        integer :: buffer_counts(1:3), buffer_count
        type(int_bounds_info) :: boundary_conditions(1:3)
        integer :: beg_end(1:2), grid_dims(1:3)
        integer :: dst_proc, src_proc, recv_tag, send_tag
        logical :: beg_end_geq_0, qbmm_comm, chem_diff_comm
        integer :: pack_offset, unpack_offset

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

        call nvtxstartrange("RHS-COMM-PACKBUF")

        qbmm_comm = .false.
        chem_diff_comm = .false.

        if (present(pb_in) .and. present(mv_in) .and. qbmm .and. .not. polytropic) then
            qbmm_comm = .true.
            v_size = nvar + 2*nb*nnode
            buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
                              & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
        else if (present(q_t_sf) .and. chemistry .and. chem_params%diffusion) then
            chem_diff_comm = .true.
            v_size = nvar + 1
            buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
                              & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
        else
            v_size = nvar
            buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
                              & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
        end if
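
        ! Note (added reading of the code above): buffer_counts(1:3) are the
        ! reals exchanged per halo in the x, y, and z directions, i.e.
        ! buff_size layers of a cross-section times v_size packed variables.
        ! The y and z cross-sections span ghost-extended ranges such as
        ! (m + 2*buff_size + 1) because those exchanges also forward the corner
        ! and edge data already received in earlier directions.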

#if defined(MFC_OpenACC)
!$acc update device(v_size)
#elif defined(MFC_OpenMP)
!$omp target update to(v_size)
#endif

        buffer_count = buffer_counts(mpi_dir)
        boundary_conditions = (/bc_x, bc_y, bc_z/)
        beg_end = (/boundary_conditions(mpi_dir)%beg, boundary_conditions(mpi_dir)%end/)
        beg_end_geq_0 = beg_end(max(pbc_loc, 0) - pbc_loc + 1) >= 0

        ! Implements:
        !   pbc_loc   bc >= 0  ->  [send,recv]_tag  [dst,src]_proc
        !   -1 (beg)     0     ->      [1,0]          [beg,beg]
        !   -1 (beg)     1     ->      [0,0]          [end,beg]
        !   +1 (end)     0     ->      [0,1]          [end,end]
        !   +1 (end)     1     ->      [1,1]          [beg,end]

        send_tag = f_logical_to_int(.not. f_xor(beg_end_geq_0, pbc_loc == 1))
        recv_tag = f_logical_to_int(pbc_loc == 1)

        dst_proc = beg_end(1 + f_logical_to_int(f_xor(pbc_loc == 1, beg_end_geq_0)))
        src_proc = beg_end(1 + f_logical_to_int(pbc_loc == 1))

        grid_dims = (/m, n, p/)

        pack_offset = 0
        if (f_xor(pbc_loc == 1, beg_end_geq_0)) then
            pack_offset = grid_dims(mpi_dir) - buff_size + 1
        end if

        unpack_offset = 0
        if (pbc_loc == 1) then
            unpack_offset = grid_dims(mpi_dir) + buff_size + 1
        end if

        ! Pack Buffer to Send
        if (mpi_dir == 1) then

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(4) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
            do l = 0, p
                do k = 0, n
                    do j = 0, buff_size - 1
                        do i = 1, nvar
                            r = (i - 1) + v_size*(j + buff_size*(k + (n + 1)*l))
                            buff_send(r) = real(q_comm(i)%sf(j + pack_offset, k, l), kind=wp)
                        end do
                    end do
                end do
            end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

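            ! Indexing note (added reading of the loop above): r linearizes
            ! (variable, x-layer, y, z) as (i - 1) + v_size*(j + buff_size*(k +
            ! (n + 1)*l)), so buff_send stores all v_size slots of one cell
            ! contiguously before moving to the next cell; the chemistry and
            ! qbmm packs below fill the slots past nvar.
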
            if (chem_diff_comm) then

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(3) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = 0, p
                    do k = 0, n
                        do j = 0, buff_size - 1
                            r = nvar + v_size*(j + buff_size*(k + (n + 1)*l))
                            buff_send(r) = real(q_t_sf%sf(j + pack_offset, k, l), kind=wp)
                        end do
                    end do
                end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if

            if (qbmm_comm) then

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(4) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = 0, p
                    do k = 0, n
                        do j = 0, buff_size - 1
                            do i = nvar + 1, nvar + nnode
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + v_size*(j + buff_size*(k + (n + 1)*l))
                                    buff_send(r) = real(pb_in(j + pack_offset, k, l, i - nvar, q), kind=wp)
                                end do
                            end do
                        end do
                    end do
                end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = 0, p
                    do k = 0, n
                        do j = 0, buff_size - 1
                            do i = nvar + 1, nvar + nnode
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*(j + buff_size*(k + (n + 1)*l))
                                    buff_send(r) = real(mv_in(j + pack_offset, k, l, i - nvar, q), kind=wp)
                                end do
                            end do
                        end do
                    end do
                end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if
        end if

        if (mpi_dir == 2) then

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(4) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
            do i = 1, nvar
                do l = 0, p
                    do k = 0, buff_size - 1
                        do j = -buff_size, m + buff_size
                            r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k + buff_size*l))
                            buff_send(r) = real(q_comm(i)%sf(j, k + pack_offset, l), kind=wp)
                        end do
                    end do
                end do
            end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

            if (chem_diff_comm) then

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(3) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = 0, p
                    do k = 0, buff_size - 1
                        do j = -buff_size, m + buff_size
                            r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k + buff_size*l))
                            buff_send(r) = real(q_t_sf%sf(j, k + pack_offset, l), kind=wp)
                        end do
                    end do
                end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if

            if (qbmm_comm) then

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do i = nvar + 1, nvar + nnode
                    do l = 0, p
                        do k = 0, buff_size - 1
                            do j = -buff_size, m + buff_size
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k &
                                        & + buff_size*l))
                                    buff_send(r) = real(pb_in(j, k + pack_offset, l, i - nvar, q), kind=wp)
                                end do
                            end do
                        end do
                    end do
                end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do i = nvar + 1, nvar + nnode
                    do l = 0, p
                        do k = 0, buff_size - 1
                            do j = -buff_size, m + buff_size
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
                                        & + 1)*(k + buff_size*l))
                                    buff_send(r) = real(mv_in(j, k + pack_offset, l, i - nvar, q), kind=wp)
                                end do
                            end do
                        end do
                    end do
                end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if
        end if

        if (mpi_dir == 3) then

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(4) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
            do i = 1, nvar
                do l = 0, buff_size - 1
                    do k = -buff_size, n + buff_size
                        do j = -buff_size, m + buff_size
                            r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
                                & + 2*buff_size + 1)*l))
                            buff_send(r) = real(q_comm(i)%sf(j, k, l + pack_offset), kind=wp)
                        end do
                    end do
                end do
            end do

#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

                if (chem_diff_comm) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(3) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                    do l = 0, buff_size - 1
                        do k = -buff_size, n + buff_size
                            do j = -buff_size, m + buff_size
                                r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
                                    & + 2*buff_size + 1)*l))
                                buff_send(r) = real(q_t_sf%sf(j, k, l + pack_offset), kind=wp)
                            end do
                        end do
                    end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
                end if

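                ! With QBMM active, the per-cell block of v_size slots continues past
                ! the first nvar entries: the two loops below place the pb_in moments at
                ! offset (i - 1) + (q - 1)*nnode and the mv_in moments a further
                ! nb*nnode slots along, one value per quadrature node and bubble bin.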
                if (qbmm_comm) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                    do i = nvar + 1, nvar + nnode
                        do l = 0, buff_size - 1
                            do k = -buff_size, n + buff_size
                                do j = -buff_size, m + buff_size
                                    do q = 1, nb
                                        r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
                                            & + buff_size) + (n + 2*buff_size + 1)*l))
                                        buff_send(r) = real(pb_in(j, k, l + pack_offset, i - nvar, q), kind=wp)
                                    end do
                                end do
                            end do
                        end do
                    end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                    do i = nvar + 1, nvar + nnode
                        do l = 0, buff_size - 1
                            do k = -buff_size, n + buff_size
                                do j = -buff_size, m + buff_size
                                    do q = 1, nb
                                        r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
                                            & + 1)*((k + buff_size) + (n + 2*buff_size + 1)*l))
                                        buff_send(r) = real(mv_in(j, k, l + pack_offset, i - nvar, q), kind=wp)
                                    end do
                                end do
                            end do
                        end do
                    end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
                end if
            end if
        call nvtxendrange ! Packbuf

        ! Send/Recv
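        ! Two exchange strategies follow. Without RDMA-capable (GPU-aware) MPI, the
        ! send buffer is first copied device-to-host, exchanged with mpi_sendrecv,
        ! and the receive buffer is copied back host-to-device. With rdma_mpi, the
        ! device addresses of the buffers are handed to MPI directly via
        ! host_data/use_device_addr, avoiding both staging copies.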
#ifdef MFC_SIMULATION
        if (rdma_mpi .eqv. .false.) then
            call nvtxstartrange("RHS-COMM-DEV2HOST")
#if defined(MFC_OpenACC)
!$acc update host(buff_send)
#elif defined(MFC_OpenMP)
!$omp target update from(buff_send)
#endif
            call nvtxendrange
            call nvtxstartrange("RHS-COMM-SENDRECV-NO-RMDA")

            call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
                & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)

            call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA

            call nvtxstartrange("RHS-COMM-HOST2DEV")
#if defined(MFC_OpenACC)
!$acc update device(buff_recv)
#elif defined(MFC_OpenMP)
!$omp target update to(buff_recv)
#endif
            call nvtxendrange
        end if
        if (rdma_mpi .eqv. .true.) then
#if defined(MFC_OpenACC)
!$acc host_data use_device(buff_send, buff_recv)
            call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")

            call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
                & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)

            call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
!$acc end host_data
#elif defined(MFC_OpenMP)
!$omp target data use_device_addr(buff_send, buff_recv)
            call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")

            call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
                & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)

            call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
!$omp end target data
#else
            call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")

            call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
                & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)

            call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
#endif
#if defined(MFC_OpenACC)
!$acc wait
#elif defined(MFC_OpenMP)
!$omp barrier
#endif
        end if
#else
        call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, src_proc, recv_tag, &
            & mpi_comm_world, mpi_status_ignore, ierr)
#endif

        ! Unpack Received Buffer
        call nvtxstartrange("RHS-COMM-UNPACKBUF")
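        ! Unpacking mirrors packing: each direction recomputes the same flattened
        ! index r and scatters buff_recv into the ghost layers, converting from the
        ! transfer precision (wp) back to the storage precision (stp). Under the
        ! Intel compiler, received values are additionally checked for NaNs.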
        if (mpi_dir == 1) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(4) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
            do l = 0, p
                do k = 0, n
                    do j = -buff_size, -1
                        do i = 1, nvar
                            r = (i - 1) + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
                            q_comm(i)%sf(j + unpack_offset, k, l) = real(buff_recv(r), kind=stp)
#if defined(__INTEL_COMPILER)
                            if (ieee_is_nan(q_comm(i)%sf(j + unpack_offset, k, l))) then
                                print *, "Error", j, k, l, i
                                call s_mpi_abort("NaN(s) in recv")
                            end if
#endif
                        end do
                    end do
                end do
            end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

            if (chem_diff_comm) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(3) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = 0, p
                    do k = 0, n
                        do j = -buff_size, -1
                            r = nvar + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
                            q_t_sf%sf(j + unpack_offset, k, l) = real(buff_recv(r), kind=stp)
#if defined(__INTEL_COMPILER)
                            if (ieee_is_nan(q_t_sf%sf(j + unpack_offset, k, l))) then
                                print *, "Error", j, k, l
                                call s_mpi_abort("NaN(s) in recv")
                            end if
#endif
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if

            if (qbmm_comm) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = 0, p
                    do k = 0, n
                        do j = -buff_size, -1
                            do i = nvar + 1, nvar + nnode
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
                                    pb_in(j + unpack_offset, k, l, i - nvar, q) = real(buff_recv(r), kind=stp)
                                end do
                            end do
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = 0, p
                    do k = 0, n
                        do j = -buff_size, -1
                            do i = nvar + 1, nvar + nnode
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
                                    mv_in(j + unpack_offset, k, l, i - nvar, q) = real(buff_recv(r), kind=stp)
                                end do
                            end do
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if
        end if
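        ! y-direction unpack: only k is confined to the ghost band of width
        ! buff_size; j spans the full extended x-range, so edge and corner ghost
        ! cells are filled as part of the same exchange.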
        if (mpi_dir == 2) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(4) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
            do i = 1, nvar
                do l = 0, p
                    do k = -buff_size, -1
                        do j = -buff_size, m + buff_size
                            r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + buff_size*l))
                            q_comm(i)%sf(j, k + unpack_offset, l) = real(buff_recv(r), kind=stp)
#if defined(__INTEL_COMPILER)
                            if (ieee_is_nan(q_comm(i)%sf(j, k + unpack_offset, l))) then
                                print *, "Error", j, k, l, i
                                call s_mpi_abort("NaN(s) in recv")
                            end if
#endif
                        end do
                    end do
                end do
            end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

            if (chem_diff_comm) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(3) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = 0, p
                    do k = -buff_size, -1
                        do j = -buff_size, m + buff_size
                            r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + buff_size*l))
                            q_t_sf%sf(j, k + unpack_offset, l) = real(buff_recv(r), kind=stp)
#if defined(__INTEL_COMPILER)
                            if (ieee_is_nan(q_t_sf%sf(j, k + unpack_offset, l))) then
                                print *, "Error", j, k, l
                                call s_mpi_abort("NaN(s) in recv")
                            end if
#endif
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if

            if (qbmm_comm) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do i = nvar + 1, nvar + nnode
                    do l = 0, p
                        do k = -buff_size, -1
                            do j = -buff_size, m + buff_size
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
                                        & + buff_size) + buff_size*l))
                                    pb_in(j, k + unpack_offset, l, i - nvar, q) = real(buff_recv(r), kind=stp)
                                end do
                            end do
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do i = nvar + 1, nvar + nnode
                    do l = 0, p
                        do k = -buff_size, -1
                            do j = -buff_size, m + buff_size
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
                                        & + 1)*((k + buff_size) + buff_size*l))
                                    mv_in(j, k + unpack_offset, l, i - nvar, q) = real(buff_recv(r), kind=stp)
                                end do
                            end do
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if
        end if
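        ! z-direction unpack: whole extended x-y planes arrive per ghost layer,
        ! with only l confined to the ghost band, mirroring the pack loops at the
        ! top of this routine.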
        if (mpi_dir == 3) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(4) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
            do i = 1, nvar
                do l = -buff_size, -1
                    do k = -buff_size, n + buff_size
                        do j = -buff_size, m + buff_size
                            r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
                                & + 2*buff_size + 1)*(l + buff_size)))
                            q_comm(i)%sf(j, k, l + unpack_offset) = real(buff_recv(r), kind=stp)
#if defined(__INTEL_COMPILER)
                            if (ieee_is_nan(q_comm(i)%sf(j, k, l + unpack_offset))) then
                                print *, "Error", j, k, l, i
                                call s_mpi_abort("NaN(s) in recv")
                            end if
#endif
                        end do
                    end do
                end do
            end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

            if (chem_diff_comm) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(3) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do l = -buff_size, -1
                    do k = -buff_size, n + buff_size
                        do j = -buff_size, m + buff_size
                            r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
                                & + 2*buff_size + 1)*(l + buff_size)))
                            q_t_sf%sf(j, k, l + unpack_offset) = real(buff_recv(r), kind=stp)
#if defined(__INTEL_COMPILER)
                            if (ieee_is_nan(q_t_sf%sf(j, k, l + unpack_offset))) then
                                print *, "Error", j, k, l
                                call s_mpi_abort("NaN(s) in recv")
                            end if
#endif
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if

            if (qbmm_comm) then
#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do i = nvar + 1, nvar + nnode
                    do l = -buff_size, -1
                        do k = -buff_size, n + buff_size
                            do j = -buff_size, m + buff_size
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
                                        & + buff_size) + (n + 2*buff_size + 1)*(l + buff_size)))
                                    pb_in(j, k, l + unpack_offset, i - nvar, q) = real(buff_recv(r), kind=stp)
                                end do
                            end do
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif

#if defined(MFC_OpenACC)
!$acc parallel loop collapse(5) gang vector default(present) private(r)
#elif defined(MFC_OpenMP)
!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
#endif
                do i = nvar + 1, nvar + nnode
                    do l = -buff_size, -1
                        do k = -buff_size, n + buff_size
                            do j = -buff_size, m + buff_size
                                do q = 1, nb
                                    r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
                                        & + 1)*((k + buff_size) + (n + 2*buff_size + 1)*(l + buff_size)))
                                    mv_in(j, k, l + unpack_offset, i - nvar, q) = real(buff_recv(r), kind=stp)
                                end do
                            end do
                        end do
                    end do
                end do
#if defined(MFC_OpenACC)
!$acc end parallel loop
#elif defined(MFC_OpenMP)
!$omp end target teams loop
#endif
            end if
        end if
        call nvtxendrange
#endif

    end subroutine s_mpi_sendrecv_variables_buffers

    !> Decompose the computational domain among processors by balancing cells per rank in each coordinate direction.
    subroutine s_mpi_decompose_computational_domain
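    ! The search below enumerates divisors of num_procs and keeps the factorization
    ! that minimizes the mismatch in cells per rank between coordinate directions,
    ! subject to every rank retaining at least num_stcls_min*recon_order cells in
    ! each split direction. Illustrative case: on a large cubic grid with
    ! num_procs = 8, the balanced 2 x 2 x 2 topology gives zero mismatch and is
    ! kept over 8 x 1 x 1.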
#ifdef MFC_MPI
        integer :: num_procs_x, num_procs_y, num_procs_z !< Optimal number of processors in the x-, y- and z-directions
        !> Non-optimal number of processors in the x-, y- and z-directions
        real(wp) :: tmp_num_procs_x, tmp_num_procs_y, tmp_num_procs_z
        real(wp) :: fct_min !< Processor factorization (fct) minimization parameter
        integer :: MPI_COMM_CART !< Cartesian processor topology communicator
        integer :: rem_cells !< Remaining cells after distribution among processors
        integer :: recon_order !< WENO or MUSCL reconstruction order
        integer :: i, j !< Generic loop iterators
        integer :: ierr !< Generic flag used to identify and report MPI errors

        if (recon_type == weno_type) then
            recon_order = weno_order
        else
            recon_order = muscl_order
        end if

        if (num_procs == 1 .and. parallel_io) then
            do i = 1, num_dims
                start_idx(i) = 0
            end do
            return
        end if

        if (igr) then
            recon_order = igr_order
        end if
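        ! recon_order is the effective reconstruction order that bounds the minimum
        ! cells per rank: the WENO or MUSCL order by default, replaced by igr_order
        ! when igr is enabled.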

        ! 3D Cartesian Processor Topology
        if (n > 0) then
            if (p > 0) then
                if (fft_wrt) then
                    ! Initial estimate of optimal processor topology
                    num_procs_x = 1
                    num_procs_y = 1
                    num_procs_z = num_procs
                    ierr = -1

                    ! Benchmarking the quality of this initial guess
                    tmp_num_procs_y = num_procs_y
                    tmp_num_procs_z = num_procs_z
                    fct_min = 10._wp*abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z)

                    ! Optimization of the initial processor topology
                    do i = 1, num_procs
                        if (mod(num_procs, i) == 0 .and. (n + 1)/i >= num_stcls_min*recon_order) then
                            tmp_num_procs_y = i
                            tmp_num_procs_z = num_procs/i

                            if (fct_min >= abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z) .and. (p + 1) &
                                & /tmp_num_procs_z >= num_stcls_min*recon_order) then
                                num_procs_y = i
                                num_procs_z = num_procs/i
                                fct_min = abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
                                ierr = 0
                            end if
                        end if
                    end do
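                    ! fft_wrt keeps the x-extent unsplit (num_procs_x stays 1), so each
                    ! rank holds complete x-lines and only the y-z slab shape is tuned.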
                else
                    if (cyl_coord .and. p > 0) then
                        ! Pencil blocking for cylindrical coordinates (Fourier filter near axis)

                        ! Initial values of the processor factorization optimization
                        num_procs_x = 1
                        num_procs_y = num_procs
                        num_procs_z = 1
                        ierr = -1

                        ! Computing minimization variable for these initial values
                        tmp_num_procs_x = num_procs_x
                        tmp_num_procs_y = num_procs_y
                        tmp_num_procs_z = num_procs_z
                        fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)

                        ! Searching for optimal computational domain distribution
                        do i = 1, num_procs
                            if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
                                tmp_num_procs_x = i
                                tmp_num_procs_y = num_procs/i

                                if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) .and. (n + 1) &
                                    & /tmp_num_procs_y >= num_stcls_min*recon_order) then
                                    num_procs_x = i
                                    num_procs_y = num_procs/i
                                    fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
                                    ierr = 0
                                end if
                            end if
                        end do
                    else
                        ! Initial estimate of optimal processor topology
                        num_procs_x = 1
                        num_procs_y = 1
                        num_procs_z = num_procs
                        ierr = -1

                        ! Benchmarking the quality of this initial guess
                        tmp_num_procs_x = num_procs_x
                        tmp_num_procs_y = num_procs_y
                        tmp_num_procs_z = num_procs_z
                        fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + 10._wp*abs((n + 1) &
                            & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z)

                        ! Optimization of the initial processor topology
                        do i = 1, num_procs
                            if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
                                do j = 1, num_procs/i
                                    if (mod(num_procs/i, j) == 0 .and. (n + 1)/j >= num_stcls_min*recon_order) then
                                        tmp_num_procs_x = i
                                        tmp_num_procs_y = j
                                        tmp_num_procs_z = num_procs/(i*j)

                                        if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + abs((n + 1) &
                                            & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z) .and. (p + 1) &
                                            & /tmp_num_procs_z >= num_stcls_min*recon_order) then
                                            num_procs_x = i
                                            num_procs_y = j
                                            num_procs_z = num_procs/(i*j)
                                            fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + abs((n + 1) &
                                                & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
                                            ierr = 0
                                        end if
                                    end if
                                end do
                            end if
                        end do
                    end if
                end if

                ! Verifying that a valid decomposition of the computational domain has been established. If not, the simulation
                ! exits.
                if (proc_rank == 0 .and. ierr == -1) then
                    call s_mpi_abort('Unsupported combination of values ' // 'of num_procs, m, n, p and ' &
                        & // 'weno/muscl/igr_order. Exiting.')
                end if

                ! Creating new communicator using the Cartesian topology
                call mpi_cart_create(mpi_comm_world, 3, (/num_procs_x, num_procs_y, num_procs_z/), (/.true., .true., .true./), &
                    & .false., mpi_comm_cart, ierr)

                ! Finding the Cartesian coordinates of the local process
                call mpi_cart_coords(mpi_comm_cart, proc_rank, 3, proc_coords, ierr)

                ! Global Parameters for z-direction

                ! Number of remaining cells
                rem_cells = mod(p + 1, num_procs_z)

                ! Optimal number of cells per processor
                p = (p + 1)/num_procs_z - 1

                ! Distributing the remaining cells
                do i = 1, rem_cells
                    if (proc_coords(3) == i - 1) then
                        p = p + 1; exit
                    end if
                end do
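                ! Illustrative remainder split: with p + 1 = 10 cells and
                ! num_procs_z = 4, integer division leaves p = 1 (2 cells per rank)
                ! and rem_cells = 2, so ranks 0 and 1 take one extra cell each,
                ! giving local counts 3, 3, 2, 2.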

                ! Boundary condition at the beginning
                if (proc_coords(3) > 0 .or. (bc_z%beg == bc_periodic .and. num_procs_z > 1)) then
                    proc_coords(3) = proc_coords(3) - 1
                    call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_z%beg, ierr)
                    proc_coords(3) = proc_coords(3) + 1
                end if

                ! Boundary condition at the end
                if (proc_coords(3) < num_procs_z - 1 .or. (bc_z%end == bc_periodic .and. num_procs_z > 1)) then
                    proc_coords(3) = proc_coords(3) + 1
                    call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_z%end, ierr)
                    proc_coords(3) = proc_coords(3) - 1
                end if

#ifdef MFC_POST_PROCESS
                ! Ghost zone at the beginning
                if (proc_coords(3) > 0 .and. format == 1) then
                    offset_z%beg = 2
                else
                    offset_z%beg = 0
                end if

                ! Ghost zone at the end
                if (proc_coords(3) < num_procs_z - 1 .and. format == 1) then
                    offset_z%end = 2
                else
                    offset_z%end = 0
                end if
#endif

                ! Beginning and end sub-domain boundary locations
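                ! start_idx(3) records this rank's global offset of its first z-cell
                ! for parallel I/O: ranks that absorbed a remainder cell use the
                ! plain (p + 1)*rank offset, while later ranks shift by the
                ! rem_cells already handed out.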
                if (parallel_io) then
                    if (proc_coords(3) < rem_cells) then
                        start_idx(3) = (p + 1)*proc_coords(3)
                    else
                        start_idx(3) = (p + 1)*proc_coords(3) + rem_cells
                    end if
                else
#ifdef MFC_PRE_PROCESS
                    if (old_grid .neqv. .true.) then
                        dz = (z_domain%end - z_domain%beg)/real(p_glb + 1, wp)

                        if (proc_coords(3) < rem_cells) then
                            z_domain%beg = z_domain%beg + dz*real((p + 1)*proc_coords(3))
                            z_domain%end = z_domain%end - dz*real((p + 1)*(num_procs_z - proc_coords(3) - 1) - (num_procs_z &
                                & - rem_cells))
                        else
                            z_domain%beg = z_domain%beg + dz*real((p + 1)*proc_coords(3) + rem_cells)
                            z_domain%end = z_domain%end - dz*real((p + 1)*(num_procs_z - proc_coords(3) - 1))
                        end if
                    end if
#endif
                end if

            ! 2D Cartesian Processor Topology
            else
                ! Initial estimate of optimal processor topology
                num_procs_x = 1
                num_procs_y = num_procs
                ierr = -1

                ! Benchmarking the quality of this initial guess
                tmp_num_procs_x = num_procs_x
                tmp_num_procs_y = num_procs_y
                fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)

                ! Optimization of the initial processor topology
                do i = 1, num_procs
                    if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
                        tmp_num_procs_x = i
                        tmp_num_procs_y = num_procs/i

                        if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) .and. (n + 1) &
                            & /tmp_num_procs_y >= num_stcls_min*recon_order) then
                            num_procs_x = i
                            num_procs_y = num_procs/i
                            fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
                            ierr = 0
                        end if
                    end if
                end do

                ! Verifying that a valid decomposition of the computational domain has been established. If not, the simulation
                ! exits.
                if (proc_rank == 0 .and. ierr == -1) then
                    call s_mpi_abort('Unsupported combination of values ' // 'of num_procs, m, n and ' &
                        & // 'weno/muscl/igr_order. Exiting.')
                end if

                ! Creating new communicator using the Cartesian topology
                call mpi_cart_create(mpi_comm_world, 2, (/num_procs_x, num_procs_y/), (/.true., .true./), .false., mpi_comm_cart, &
                    & ierr)

                ! Finding the Cartesian coordinates of the local process
                call mpi_cart_coords(mpi_comm_cart, proc_rank, 2, proc_coords, ierr)
            end if

            ! Global Parameters for y-direction

            ! Number of remaining cells
            rem_cells = mod(n + 1, num_procs_y)

            ! Optimal number of cells per processor
            n = (n + 1)/num_procs_y - 1

            ! Distributing the remaining cells
            do i = 1, rem_cells
                if (proc_coords(2) == i - 1) then
                    n = n + 1; exit
                end if
            end do

            ! Boundary condition at the beginning
            if (proc_coords(2) > 0 .or. (bc_y%beg == bc_periodic .and. num_procs_y > 1)) then
                proc_coords(2) = proc_coords(2) - 1
                call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_y%beg, ierr)
                proc_coords(2) = proc_coords(2) + 1
            end if

            ! Boundary condition at the end
            if (proc_coords(2) < num_procs_y - 1 .or. (bc_y%end == bc_periodic .and. num_procs_y > 1)) then
                proc_coords(2) = proc_coords(2) + 1
                call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_y%end, ierr)
                proc_coords(2) = proc_coords(2) - 1
            end if

#ifdef MFC_POST_PROCESS
            ! Ghost zone at the beginning
            if (proc_coords(2) > 0 .and. format == 1) then
                offset_y%beg = 2
            else
                offset_y%beg = 0
            end if

            ! Ghost zone at the end
            if (proc_coords(2) < num_procs_y - 1 .and. format == 1) then
                offset_y%end = 2
            else
                offset_y%end = 0
            end if
#endif

            ! Beginning and end sub-domain boundary locations
            if (parallel_io) then
                if (proc_coords(2) < rem_cells) then
                    start_idx(2) = (n + 1)*proc_coords(2)
                else
                    start_idx(2) = (n + 1)*proc_coords(2) + rem_cells
                end if
            else
#ifdef MFC_PRE_PROCESS
                if (old_grid .neqv. .true.) then
                    dy = (y_domain%end - y_domain%beg)/real(n_glb + 1, wp)

                    if (proc_coords(2) < rem_cells) then
                        y_domain%beg = y_domain%beg + dy*real((n + 1)*proc_coords(2))
                        y_domain%end = y_domain%end - dy*real((n + 1)*(num_procs_y - proc_coords(2) - 1) - (num_procs_y &
                            & - rem_cells))
                    else
                        y_domain%beg = y_domain%beg + dy*real((n + 1)*proc_coords(2) + rem_cells)
                        y_domain%end = y_domain%end - dy*real((n + 1)*(num_procs_y - proc_coords(2) - 1))
                    end if
                end if
#endif
            end if

        ! 1D Cartesian Processor Topology
        else
            ! Optimal processor topology
            num_procs_x = num_procs

            ! Creating new communicator using the Cartesian topology
            call mpi_cart_create(mpi_comm_world, 1, (/num_procs_x/), (/.true./), .false., mpi_comm_cart, ierr)

            ! Finding the Cartesian coordinates of the local process
            call mpi_cart_coords(mpi_comm_cart, proc_rank, 1, proc_coords, ierr)
        end if

        ! Global Parameters for x-direction

        ! Number of remaining cells
        rem_cells = mod(m + 1, num_procs_x)

        ! Optimal number of cells per processor
        m = (m + 1)/num_procs_x - 1

        ! Distributing the remaining cells
        do i = 1, rem_cells
            if (proc_coords(1) == i - 1) then
                m = m + 1; exit
            end if
        end do

        call s_update_cell_bounds(cells_bounds, m, n, p)

        ! Boundary condition at the beginning
        if (proc_coords(1) > 0 .or. (bc_x%beg == bc_periodic .and. num_procs_x > 1)) then
            proc_coords(1) = proc_coords(1) - 1
            call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_x%beg, ierr)
            proc_coords(1) = proc_coords(1) + 1
        end if

        ! Boundary condition at the end
        if (proc_coords(1) < num_procs_x - 1 .or. (bc_x%end == bc_periodic .and. num_procs_x > 1)) then
            proc_coords(1) = proc_coords(1) + 1
            call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_x%end, ierr)
            proc_coords(1) = proc_coords(1) - 1
        end if

#ifdef MFC_POST_PROCESS
        ! Ghost zone at the beginning
        if (proc_coords(1) > 0 .and. format == 1) then
            offset_x%beg = 2
        else
            offset_x%beg = 0
        end if

        ! Ghost zone at the end
        if (proc_coords(1) < num_procs_x - 1 .and. format == 1) then
            offset_x%end = 2
        else
            offset_x%end = 0
        end if
#endif

        ! Beginning and end sub-domain boundary locations
        if (parallel_io) then
            if (proc_coords(1) < rem_cells) then
                start_idx(1) = (m + 1)*proc_coords(1)
            else
                start_idx(1) = (m + 1)*proc_coords(1) + rem_cells
            end if
        else
#ifdef MFC_PRE_PROCESS
            if (old_grid .neqv. .true.) then
                dx = (x_domain%end - x_domain%beg)/real(m_glb + 1, wp)

                if (proc_coords(1) < rem_cells) then
                    x_domain%beg = x_domain%beg + dx*real((m + 1)*proc_coords(1))
                    x_domain%end = x_domain%end - dx*real((m + 1)*(num_procs_x - proc_coords(1) - 1) - (num_procs_x - rem_cells))
                else
                    x_domain%beg = x_domain%beg + dx*real((m + 1)*proc_coords(1) + rem_cells)
                    x_domain%end = x_domain%end - dx*real((m + 1)*(num_procs_x - proc_coords(1) - 1))
                end if
            end if
#endif
        end if
#endif

    end subroutine s_mpi_decompose_computational_domain

    !> The goal of this procedure is to populate the buffers of the grid variables by communicating with the neighboring processors.
    !! Note that only the buffers of the cell-width distributions are handled in such a way. This is because the buffers of
    !! cell-boundary locations may be calculated directly from those of the cell-width distributions.
#ifndef MFC_PRE_PROCESS
    subroutine s_mpi_sendrecv_grid_variables_buffers(mpi_dir, pbc_loc)

        integer, intent(in) :: mpi_dir
        integer, intent(in) :: pbc_loc

#ifdef MFC_MPI
        integer :: ierr !< Generic flag used to identify and report MPI errors

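        ! Each direction below covers four periodic-exchange cases: the halo being
        ! filled lies at the beginning (pbc_loc == -1) or the end of the local
        ! domain, and the opposite face either also borders a neighbor rank
        ! (bc >= 0) or the exchange wraps onto the same neighbor. Tags 0 and 1
        ! distinguish slices sent from the end and the beginning of a domain.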
        if (mpi_dir == 1) then
            if (pbc_loc == -1) then ! PBC at the beginning

                if (bc_x%end >= 0) then ! PBC at the beginning and end
                    call mpi_sendrecv(dx(m - buff_size + 1), buff_size, mpi_p, bc_x%end, 0, dx(-buff_size), buff_size, mpi_p, &
                        & bc_x%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
                else ! PBC at the beginning only
                    call mpi_sendrecv(dx(0), buff_size, mpi_p, bc_x%beg, 1, dx(-buff_size), buff_size, mpi_p, bc_x%beg, 0, &
                        & mpi_comm_world, mpi_status_ignore, ierr)
                end if
            else ! PBC at the end
                if (bc_x%beg >= 0) then ! PBC at the end and beginning
                    call mpi_sendrecv(dx(0), buff_size, mpi_p, bc_x%beg, 1, dx(m + 1), buff_size, mpi_p, bc_x%end, 1, &
                        & mpi_comm_world, mpi_status_ignore, ierr)
                else ! PBC at the end only
                    call mpi_sendrecv(dx(m - buff_size + 1), buff_size, mpi_p, bc_x%end, 0, dx(m + 1), buff_size, mpi_p, &
                        & bc_x%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
                end if
            end if
        else if (mpi_dir == 2) then
            if (pbc_loc == -1) then ! PBC at the beginning

                if (bc_y%end >= 0) then ! PBC at the beginning and end
                    call mpi_sendrecv(dy(n - buff_size + 1), buff_size, mpi_p, bc_y%end, 0, dy(-buff_size), buff_size, mpi_p, &
                        & bc_y%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
                else ! PBC at the beginning only
                    call mpi_sendrecv(dy(0), buff_size, mpi_p, bc_y%beg, 1, dy(-buff_size), buff_size, mpi_p, bc_y%beg, 0, &
                        & mpi_comm_world, mpi_status_ignore, ierr)
                end if
            else ! PBC at the end
                if (bc_y%beg >= 0) then ! PBC at the end and beginning
                    call mpi_sendrecv(dy(0), buff_size, mpi_p, bc_y%beg, 1, dy(n + 1), buff_size, mpi_p, bc_y%end, 1, &
                        & mpi_comm_world, mpi_status_ignore, ierr)
                else ! PBC at the end only
                    call mpi_sendrecv(dy(n - buff_size + 1), buff_size, mpi_p, bc_y%end, 0, dy(n + 1), buff_size, mpi_p, &
                        & bc_y%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
                end if
            end if
        else
            if (pbc_loc == -1) then ! PBC at the beginning

                if (bc_z%end >= 0) then ! PBC at the beginning and end
                    call mpi_sendrecv(dz(p - buff_size + 1), buff_size, mpi_p, bc_z%end, 0, dz(-buff_size), buff_size, mpi_p, &
                        & bc_z%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
                else ! PBC at the beginning only
                    call mpi_sendrecv(dz(0), buff_size, mpi_p, bc_z%beg, 1, dz(-buff_size), buff_size, mpi_p, bc_z%beg, 0, &
                        & mpi_comm_world, mpi_status_ignore, ierr)
                end if
            else ! PBC at the end
                if (bc_z%beg >= 0) then ! PBC at the end and beginning
                    call mpi_sendrecv(dz(0), buff_size, mpi_p, bc_z%beg, 1, dz(p + 1), buff_size, mpi_p, bc_z%end, 1, &
                        & mpi_comm_world, mpi_status_ignore, ierr)
                else ! PBC at the end only
                    call mpi_sendrecv(dz(p - buff_size + 1), buff_size, mpi_p, bc_z%end, 0, dz(p + 1), buff_size, mpi_p, &
                        & bc_z%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
                end if
            end if
        end if
#endif

    end subroutine s_mpi_sendrecv_grid_variables_buffers
#endif

    !> Module deallocation and/or disassociation procedures
    impure subroutine s_finalize_mpi_common_module

#ifdef MFC_MPI
        deallocate (buff_send, buff_recv)
#endif

    end subroutine s_finalize_mpi_common_module

end module m_mpi_common