MFC
Exascale flow solver
Loading...
Searching...
No Matches
m_mpi_common.fpp.f90
Go to the documentation of this file.
1# 1 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2!>
3!! @file
4!! @brief Contains module m_mpi_common
5
6# 1 "/home/runner/work/MFC/MFC/src/common/include/case.fpp" 1
7! This file exists so that Fypp can be run without generating case.fpp files for
8! each target. This is useful when generating documentation, for example. This
9! should also let MFC be built with CMake directly, without invoking mfc.sh.
10
11! For pre-process.
12# 8 "/home/runner/work/MFC/MFC/src/common/include/case.fpp"
13
14! For moving immersed boundaries in simulation
15# 12 "/home/runner/work/MFC/MFC/src/common/include/case.fpp"
16# 6 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp" 2
17# 1 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp" 1
18# 1 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 1
19# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
20# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
21# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
22# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
23# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
24# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
25
26# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
27# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
28# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
29
30# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
31
32# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
33
34# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
35
36# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
37
38# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
39
40# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
41
42# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
43! New line at end of file is required for FYPP
44# 2 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
45# 1 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp" 1
46# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
47# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
48# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
49# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
50# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
51# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
52
53# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
54# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
55# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
56
57# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
58
59# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
60
61# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
62
63# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
64
65# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
66
67# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
68
69# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
70! New line at end of file is required for FYPP
71# 2 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp" 2
72
73# 4 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
74# 5 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
75# 6 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
76# 7 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
77# 8 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
78
79# 20 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
80
81# 43 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
82
83# 48 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
84
85# 53 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
86
87# 58 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
88
89# 63 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
90
91# 68 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
92
93# 76 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
94
95# 81 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
96
97# 86 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
98
99# 91 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
100
101# 96 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
102
103# 101 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
104
105# 106 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
106
107# 111 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
108
109# 116 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
110
111# 121 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
112
113# 151 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
114
115# 192 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
116
117# 206 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
118
119# 231 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
120
121# 242 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
122
123# 244 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
124# 255 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
125
126# 284 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
127
128# 294 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
129
130# 304 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
131
132# 313 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
133
134# 330 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
135
136# 340 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
137
138# 347 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
139
140# 353 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
141
142# 359 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
143
144# 365 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
145
146# 371 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
147
148# 377 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
149! New line at end of file is required for FYPP
150# 3 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
151# 1 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp" 1
152# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
153# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
154# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
155# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
156# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
157# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
158
159# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
160# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
161# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
162
163# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
164
165# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
166
167# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
168
169# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
170
171# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
172
173# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
174
175# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
176! New line at end of file is required for FYPP
177# 2 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp" 2
178
179# 7 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
180
181# 17 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
182
183# 22 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
184
185# 27 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
186
187# 32 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
188
189# 37 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
190
191# 42 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
192
193# 47 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
194
195# 52 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
196
197# 57 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
198
199# 62 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
200
201# 73 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
202
203# 78 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
204
205# 83 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
206
207# 88 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
208
209# 103 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
210
211# 131 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
212
213# 160 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
214
215# 175 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
216
217# 193 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
218
219# 215 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
220
221# 244 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
222
223# 259 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
224
225# 269 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
226
227# 278 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
228
229# 294 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
230
231# 304 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
232
233# 311 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
234! New line at end of file is required for FYPP
235# 4 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
236
237! GPU parallel region (scalar reductions, maxval/minval)
238# 23 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
239
240! GPU parallel loop over threads (most common GPU macro)
241# 43 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
242
243! Required closing for GPU_PARALLEL_LOOP
244# 55 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
245
246! Mark routine for device compilation
247# 112 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
248
249! Declare device-resident data
250# 130 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
251
252! Inner loop within a GPU parallel region
253# 145 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
254
255! Scoped GPU data region
256# 164 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
257
258! Host code with device pointers (for MPI with GPU buffers)
259# 193 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
260
261! Allocate device memory (unscoped)
262# 207 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
263
264! Free device memory
265# 219 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
266
267! Atomic operation on device
268# 231 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
269
270! End atomic capture block
271# 242 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
272
273! Copy data between host and device
274# 254 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
275
276! Synchronization barrier
277# 266 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
278
279! Import GPU library module (openacc or omp_lib)
280# 275 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
281
282! Emit code only for AMD compiler
283# 282 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
284
285! Emit code for non-Cray compilers
286# 289 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
287
288! Emit code only for Cray compiler
289# 296 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
290
291! Emit code for non-NVIDIA compilers
292# 303 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
293
294# 305 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
295# 306 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
296! New line at end of file is required for FYPP
297# 2 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp" 2
298
299# 14 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
300
301! Caution: This macro requires the use of a binding script to set CUDA_VISIBLE_DEVICES, such that we have one GPU device per MPI
302! rank. That's because for both cudaMemAdvise (preferred location) and cudaMemPrefetchAsync we use location = device_id = 0. For an
303! example see misc/nvidia_uvm/bind.sh. NVIDIA unified memory page placement hint
304# 57 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
305
306! Allocate and create GPU device memory
307# 77 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
308
309! Free GPU device memory and deallocate
310# 85 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
311
312! Cray-specific GPU pointer setup for vector fields
313# 109 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
314
315! Cray-specific GPU pointer setup for scalar fields
316# 125 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
317
318! Cray-specific GPU pointer setup for acoustic source spatials
319# 150 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
320
321# 156 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
322
323# 163 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
324! New line at end of file is required for FYPP
325# 7 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp" 2
326
327!> @brief MPI communication layer: domain decomposition, halo exchange, reductions, and parallel I/O setup
329
330#ifdef MFC_MPI
331 use mpi !< message passing interface (mpi) module
332#endif
333
336 use m_helper
337 use ieee_arithmetic
338 use m_nvtx
339
340 implicit none
341
342 integer, private :: v_size
343
344# 24 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
345#if defined(MFC_OpenACC)
346# 24 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
347!$acc declare create(v_size)
348# 24 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
349#elif defined(MFC_OpenMP)
350# 24 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
351!$omp declare target (v_size)
352# 24 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
353#endif
354
355 real(wp), private, allocatable, dimension(:) :: buff_send !< Primitive variable send buffer for halo exchange
356 real(wp), private, allocatable, dimension(:) :: buff_recv !< Primitive variable receive buffer for halo exchange
357#ifndef __NVCOMPILER_GPU_UNIFIED_MEM
358
359# 29 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
360#if defined(MFC_OpenACC)
361# 29 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
362!$acc declare create(buff_send, buff_recv)
363# 29 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
364#elif defined(MFC_OpenMP)
365# 29 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
366!$omp declare target (buff_send, buff_recv)
367# 29 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
368#endif
369#endif
370
371 integer(kind=8) :: halo_size
372
373# 33 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
374#if defined(MFC_OpenACC)
375# 33 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
376!$acc declare create(halo_size)
377# 33 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
378#elif defined(MFC_OpenMP)
379# 33 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
380!$omp declare target (halo_size)
381# 33 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
382#endif
383
384contains
385
386 !> Initialize the module.
388
389#ifdef MFC_MPI
390 ! Allocating buff_send/recv and. Please note that for the sake of simplicity, both variables are provided sufficient storage
391 ! to hold the largest buffer in the computational domain.
392
393 if (qbmm .and. .not. polytropic) then
394 v_size = sys_size + 2*nb*nnode
395 else if (chemistry .and. chem_params%diffusion) then
396 v_size = sys_size + 1
397 else
399 end if
400
401 if (n > 0) then
402 if (p > 0) then
403 halo_size = nint(-1._wp + 1._wp*buff_size*(v_size)*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)*(p + 2*buff_size &
404 & + 1)/(cells_bounds%mnp_min + 2*buff_size + 1))
405 else
406 halo_size = -1 + buff_size*(v_size)*(cells_bounds%mn_max + 2*buff_size + 1)
407 end if
408 else
409 halo_size = -1 + buff_size*(v_size)
410 end if
411
412
413# 63 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
414#if defined(MFC_OpenACC)
415# 63 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
416!$acc update device(halo_size, v_size)
417# 63 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
418#elif defined(MFC_OpenMP)
419# 63 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
420!$omp target update to(halo_size, v_size)
421# 63 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
422#endif
423
424#ifndef __NVCOMPILER_GPU_UNIFIED_MEM
425#ifdef MFC_DEBUG
426# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
427 block
428# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
429 use iso_fortran_env, only: output_unit
430# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
431
432# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
433 print *, 'm_mpi_common.fpp:66: ', '@:ALLOCATE(buff_send(0:halo_size), buff_recv(0:halo_size))'
434# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
435
436# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
437 call flush (output_unit)
438# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
439 end block
440# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
441#endif
442# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
443 allocate (buff_send(0:halo_size), buff_recv(0:halo_size))
444# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
445
446# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
447
448# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
449
450# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
451#if defined(MFC_OpenACC)
452# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
453!$acc enter data create(buff_send, buff_recv)
454# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
455#elif defined(MFC_OpenMP)
456# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
457!$omp target enter data map(always,alloc:buff_send, buff_recv)
458# 66 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
459#endif
460#else
461 allocate (buff_send(0:halo_size), buff_recv(0:halo_size))
462
463# 69 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
464#if defined(MFC_OpenACC)
465# 69 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
466!$acc enter data create(capture:buff_send)
467# 69 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
468#elif defined(MFC_OpenMP)
469# 69 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
470!$omp target enter data map(always,alloc:capture:buff_send)
471# 69 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
472#endif
473
474# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
475#if defined(MFC_OpenACC)
476# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
477!$acc enter data create(capture:buff_recv)
478# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
479#elif defined(MFC_OpenMP)
480# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
481!$omp target enter data map(always,alloc:capture:buff_recv)
482# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
483#endif
484#endif
485#endif
486
487 end subroutine s_initialize_mpi_common_module
488
489 !> Initialize the MPI execution environment and query the number of processors and local rank.
490 impure subroutine s_mpi_initialize
491
492#ifdef MFC_MPI
493 integer :: ierr !< Generic flag used to identify and report MPI errors
494
495 call mpi_init(ierr)
496
497 if (ierr /= mpi_success) then
498 print '(A)', 'Unable to initialize MPI environment. Exiting.'
499 call mpi_abort(mpi_comm_world, 1, ierr)
500 end if
501
502 call mpi_comm_size(mpi_comm_world, num_procs, ierr)
503
504 call mpi_comm_rank(mpi_comm_world, proc_rank, ierr)
505#else
506 num_procs = 1
507 proc_rank = 0
508#endif
509
510
511# 97 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
512#if defined(MFC_OpenACC)
513# 97 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
514!$acc update device(num_procs, proc_rank)
515# 97 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
516#elif defined(MFC_OpenMP)
517# 97 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
518!$omp target update to(num_procs, proc_rank)
519# 97 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
520#endif
521
522 end subroutine s_mpi_initialize
523
524 !> Set up MPI I/O data views and variable pointers for parallel file output.
525 impure subroutine s_initialize_mpi_data(q_cons_vf, ib_markers, beta)
526
527 type(scalar_field), dimension(sys_size), intent(in) :: q_cons_vf
528 type(integer_field), optional, intent(in) :: ib_markers
529 type(scalar_field), intent(in), optional :: beta
530 integer, dimension(num_dims) :: sizes_glb, sizes_loc
531
532#ifdef MFC_MPI
533 integer :: i, j
534 integer :: ierr !< Generic flag used to identify and report MPI errors
535 integer :: alt_sys
536
537 if (present(beta)) then
538 alt_sys = sys_size + 1
539 else
540 alt_sys = sys_size
541 end if
542
543 do i = 1, sys_size
544 mpi_io_data%var(i)%sf => q_cons_vf(i)%sf(0:m,0:n,0:p)
545 end do
546
547 if (present(beta)) then
548 mpi_io_data%var(alt_sys)%sf => beta%sf(0:m,0:n,0:p)
549 end if
550
551 ! Additional variables pb and mv for non-polytropic qbmm
552 if (qbmm .and. .not. polytropic) then
553 do i = 1, nb
554 do j = 1, nnode
555#ifdef MFC_PRE_PROCESS
556 mpi_io_data%var(sys_size + (i - 1)*nnode + j)%sf => pb%sf(0:m,0:n,0:p,j, i)
557 mpi_io_data%var(sys_size + (i - 1)*nnode + j + nb*nnode)%sf => mv%sf(0:m,0:n,0:p,j, i)
558#elif defined (MFC_SIMULATION)
559 mpi_io_data%var(sys_size + (i - 1)*nnode + j)%sf => pb_ts(1)%sf(0:m,0:n,0:p,j, i)
560 mpi_io_data%var(sys_size + (i - 1)*nnode + j + nb*nnode)%sf => mv_ts(1)%sf(0:m,0:n,0:p,j, i)
561#endif
562 end do
563 end do
564 end if
565
566 ! Define global(g) and local(l) sizes for flow variables
567 sizes_glb(1) = m_glb + 1; sizes_loc(1) = m + 1
568 if (n > 0) then
569 sizes_glb(2) = n_glb + 1; sizes_loc(2) = n + 1
570 if (p > 0) then
571 sizes_glb(num_dims) = p_glb + 1; sizes_loc(num_dims) = p + 1
572 end if
573 end if
574
575 ! Define the view for each variable
576 do i = 1, alt_sys
577 call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_p, &
578 & mpi_io_data%view(i), ierr)
579 call mpi_type_commit(mpi_io_data%view(i), ierr)
580 end do
581
582#ifndef MFC_POST_PROCESS
583 if (qbmm .and. .not. polytropic) then
584 do i = sys_size + 1, sys_size + 2*nb*nnode
585 call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_p, &
586 & mpi_io_data%view(i), ierr)
587 call mpi_type_commit(mpi_io_data%view(i), ierr)
588 end do
589 end if
590#endif
591
592#ifndef MFC_PRE_PROCESS
593 if (present(ib_markers)) then
594 mpi_io_ib_data%var%sf => ib_markers%sf(0:m,0:n,0:p)
595
596 call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_integer, &
597 & mpi_io_ib_data%view, ierr)
598 call mpi_type_commit(mpi_io_ib_data%view, ierr)
599 end if
600#endif
601#endif
602
603 end subroutine s_initialize_mpi_data
604
605 !> Set up MPI I/O data views for downsampled (coarsened) parallel file output.
606 subroutine s_initialize_mpi_data_ds(q_cons_vf)
607
608 type(scalar_field), dimension(sys_size), intent(in) :: q_cons_vf
609 integer, dimension(num_dims) :: sizes_loc
610 integer, dimension(3) :: sf_start_idx
611
612#ifdef MFC_MPI
613 integer :: i, m_ds, n_ds, p_ds, ierr
614
615 sf_start_idx = (/0, 0, 0/)
616
617#ifndef MFC_POST_PROCESS
618 m_ds = int((m + 1)/3) - 1
619 n_ds = int((n + 1)/3) - 1
620 p_ds = int((p + 1)/3) - 1
621#else
622 m_ds = m
623 n_ds = n
624 p_ds = p
625#endif
626
627#ifdef MFC_POST_PROCESS
628 do i = 1, sys_size
629 mpi_io_data%var(i)%sf => q_cons_vf(i)%sf(-1:m_ds + 1,-1:n_ds + 1,-1:p_ds + 1)
630 end do
631#endif
632 ! Define global(g) and local(l) sizes for flow variables
633 sizes_loc(1) = m_ds + 3
634 if (n > 0) then
635 sizes_loc(2) = n_ds + 3
636 if (p > 0) then
637 sizes_loc(num_dims) = p_ds + 3
638 end if
639 end if
640
641 ! Define the view for each variable
642 do i = 1, sys_size
643 call mpi_type_create_subarray(num_dims, sizes_loc, sizes_loc, sf_start_idx, mpi_order_fortran, mpi_p, &
644 & mpi_io_data%view(i), ierr)
645 call mpi_type_commit(mpi_io_data%view(i), ierr)
646 end do
647#endif
648
649 end subroutine s_initialize_mpi_data_ds
650
651 !> Gather variable-length real vectors from all MPI ranks onto the root process.
652 impure subroutine s_mpi_gather_data(my_vector, counts, gathered_vector, root)
653
654 integer, intent(in) :: counts !< Array of vector lengths for each process
655 real(wp), intent(in), dimension(counts) :: my_vector !< Input vector on each process
656 integer, intent(in) :: root !< Rank of the root process
657 real(wp), allocatable, intent(out) :: gathered_vector(:) !< Gathered vector on the root process
658 integer :: i
659 integer :: ierr !< Generic flag used to identify and report MPI errors
660 integer, allocatable :: recounts(:), displs(:)
661
662#ifdef MFC_MPI
663 allocate (recounts(num_procs))
664
665 call mpi_gather(counts, 1, mpi_integer, recounts, 1, mpi_integer, root, mpi_comm_world, ierr)
666
667 allocate (displs(size(recounts)))
668
669 displs(1) = 0
670
671 do i = 2, size(recounts)
672 displs(i) = displs(i - 1) + recounts(i - 1)
673 end do
674
675 allocate (gathered_vector(sum(recounts)))
676 call mpi_gatherv(my_vector, counts, mpi_p, gathered_vector, recounts, displs, mpi_p, root, mpi_comm_world, ierr)
677#endif
678
679 end subroutine s_mpi_gather_data
680
681 !> Gather per-rank time step wall-clock times onto rank 0 for performance reporting.
682 impure subroutine mpi_bcast_time_step_values(proc_time, time_avg)
683
684 real(wp), dimension(0:num_procs - 1), intent(inout) :: proc_time
685 real(wp), intent(inout) :: time_avg
686
687#ifdef MFC_MPI
688 integer :: ierr !< Generic flag used to identify and report MPI errors
689
690 call mpi_gather(time_avg, 1, mpi_p, proc_time(0), 1, mpi_p, 0, mpi_comm_world, ierr)
691#endif
692
693 end subroutine mpi_bcast_time_step_values
694
695 !> Print a case file error with the prohibited condition and message, then abort execution.
696 impure subroutine s_prohibit_abort(condition, message)
697
698 character(len=*), intent(in) :: condition, message
699
700 print *, ""
701 print *, "CASE FILE ERROR"
702 print *, " - Prohibited condition: ", trim(condition)
703 if (len_trim(message) > 0) then
704 print *, " - Note: ", trim(message)
705 end if
706 print *, ""
707 call s_mpi_abort(code=case_file_error_code)
708
709 end subroutine s_prohibit_abort
710
711 !> The goal of this subroutine is to determine the global extrema of the stability criteria in the computational domain. This is
712 !! performed by sifting through the local extrema of each stability criterion. Note that each of the local extrema is from a
713 !! single process, within its assigned section of the computational domain. Finally, note that the global extrema values are
714 !! only bookkeept on the rank 0 processor.
715 impure subroutine s_mpi_reduce_stability_criteria_extrema(icfl_max_loc, vcfl_max_loc, Rc_min_loc, icfl_max_glb, vcfl_max_glb, &
716
717 & Rc_min_glb)
718
719 real(wp), intent(in) :: icfl_max_loc
720 real(wp), intent(in) :: vcfl_max_loc
721 real(wp), intent(in) :: rc_min_loc
722 real(wp), intent(out) :: icfl_max_glb
723 real(wp), intent(out) :: vcfl_max_glb
724 real(wp), intent(out) :: rc_min_glb
725
726 icfl_max_glb = icfl_max_loc
727 vcfl_max_glb = vcfl_max_loc
728 rc_min_glb = rc_min_loc
729
730#ifdef MFC_SIMULATION
731#ifdef MFC_MPI
732 block
733 integer :: ierr
734
735 call mpi_reduce(icfl_max_loc, icfl_max_glb, 1, mpi_p, mpi_max, 0, mpi_comm_world, ierr)
736
737 if (viscous) then
738 call mpi_reduce(vcfl_max_loc, vcfl_max_glb, 1, mpi_p, mpi_max, 0, mpi_comm_world, ierr)
739 call mpi_reduce(rc_min_loc, rc_min_glb, 1, mpi_p, mpi_min, 0, mpi_comm_world, ierr)
740 end if
741 end block
742#else
743 icfl_max_glb = icfl_max_loc
744
745 if (viscous) then
746 vcfl_max_glb = vcfl_max_loc
747 rc_min_glb = rc_min_loc
748 end if
749#endif
750#endif
751
753
754 !> Reduce a local real value to its global sum across all MPI ranks.
755 impure subroutine s_mpi_allreduce_sum(var_loc, var_glb)
756
757 real(wp), intent(in) :: var_loc
758 real(wp), intent(out) :: var_glb
759
760#ifdef MFC_MPI
761 integer :: ierr !< Generic flag used to identify and report MPI errors
762
763 call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_sum, mpi_comm_world, ierr)
764#endif
765
766 end subroutine s_mpi_allreduce_sum
767
768 !> Reduce an array of vectors to their global sums across all MPI ranks.
769 impure subroutine s_mpi_allreduce_vectors_sum(var_loc, var_glb, num_vectors, vector_length)
770
771 integer, intent(in) :: num_vectors, vector_length
772 real(wp), dimension(:,:), intent(in) :: var_loc
773 real(wp), dimension(:,:), intent(inout) :: var_glb
774
775#ifdef MFC_MPI
776 integer :: ierr !< Generic flag used to identify and report MPI errors
777
778 if (loc(var_loc) == loc(var_glb)) then
779 call mpi_allreduce(mpi_in_place, var_glb, num_vectors*vector_length, mpi_p, mpi_sum, mpi_comm_world, ierr)
780 else
781 call mpi_allreduce(var_loc, var_glb, num_vectors*vector_length, mpi_p, mpi_sum, mpi_comm_world, ierr)
782 end if
783#else
784 var_glb(1:num_vectors,1:vector_length) = var_loc(1:num_vectors,1:vector_length)
785#endif
786
787 end subroutine s_mpi_allreduce_vectors_sum
788
789 !> Reduce a local integer value to its global sum across all MPI ranks.
790 impure subroutine s_mpi_allreduce_integer_sum(var_loc, var_glb)
791
792 integer, intent(in) :: var_loc
793 integer, intent(out) :: var_glb
794
795#ifdef MFC_MPI
796 integer :: ierr !< Generic flag used to identify and report MPI errors
797
798 call mpi_allreduce(var_loc, var_glb, 1, mpi_integer, mpi_sum, mpi_comm_world, ierr)
799#else
800 var_glb = var_loc
801#endif
802
803 end subroutine s_mpi_allreduce_integer_sum
804
805 !> Reduce a local real value to its global minimum across all MPI ranks.
806 impure subroutine s_mpi_allreduce_min(var_loc, var_glb)
807
808 real(wp), intent(in) :: var_loc
809 real(wp), intent(out) :: var_glb
810
811#ifdef MFC_MPI
812 integer :: ierr !< Generic flag used to identify and report MPI errors
813
814 call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_min, mpi_comm_world, ierr)
815#endif
816
817 end subroutine s_mpi_allreduce_min
818
819 !> Reduce a local real value to its global maximum across all MPI ranks.
820 impure subroutine s_mpi_allreduce_max(var_loc, var_glb)
821
822 real(wp), intent(in) :: var_loc
823 real(wp), intent(out) :: var_glb
824
825#ifdef MFC_MPI
826 integer :: ierr !< Generic flag used to identify and report MPI errors
827
828 call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_max, mpi_comm_world, ierr)
829#endif
830
831 end subroutine s_mpi_allreduce_max
832
833 !> Reduce a local real value to its global minimum across all ranks
834 impure subroutine s_mpi_reduce_min(var_loc)
835
836 real(wp), intent(inout) :: var_loc
837
838#ifdef MFC_MPI
839 integer :: ierr !< Generic flag used to identify and report MPI errors
840 real(wp) :: var_glb
841
842 call mpi_reduce(var_loc, var_glb, 1, mpi_p, mpi_min, 0, mpi_comm_world, ierr)
843
844 call mpi_bcast(var_glb, 1, mpi_p, 0, mpi_comm_world, ierr)
845
846 var_loc = var_glb
847#endif
848
849 end subroutine s_mpi_reduce_min
850
851 !> Reduce a 2-element variable to its global maximum value with the owning processor rank (MPI_MAXLOC).
852 !> Reduce a local value to its global maximum with location (rank) across all ranks
853 impure subroutine s_mpi_reduce_maxloc(var_loc)
854
855 real(wp), dimension(2), intent(inout) :: var_loc
856
857#ifdef MFC_MPI
858 integer :: ierr !< Generic flag used to identify and report MPI errors
859 real(wp), dimension(2) :: var_glb !< Reduced (max value, rank) pair
860 call mpi_reduce(var_loc, var_glb, 1, mpi_2p, mpi_maxloc, 0, mpi_comm_world, ierr)
861
862 call mpi_bcast(var_glb, 1, mpi_2p, 0, mpi_comm_world, ierr)
863
864 var_loc = var_glb
865#endif
866
867 end subroutine s_mpi_reduce_maxloc
868
869 !> The subroutine terminates the MPI execution environment.
870 impure subroutine s_mpi_abort(prnt, code)
871
872 character(len=*), intent(in), optional :: prnt
873 integer, intent(in), optional :: code
874
875#ifdef MFC_MPI
876 integer :: ierr !< Generic flag used to identify and report MPI errors
877#endif
878
879 if (present(prnt)) then
880 print *, prnt
881 call flush (6)
882 end if
883
884#ifndef MFC_MPI
885 if (present(code)) then
886 stop code
887 else
888 stop 1
889 end if
890#else
891 if (present(code)) then
892 call mpi_abort(mpi_comm_world, code, ierr)
893 else
894 call mpi_abort(mpi_comm_world, 1, ierr)
895 end if
896#endif
897
898 end subroutine s_mpi_abort
899
900 !> Halts all processes until all have reached barrier.
901 impure subroutine s_mpi_barrier
902
903#ifdef MFC_MPI
904 integer :: ierr !< Generic flag used to identify and report MPI errors
905
906 call mpi_barrier(mpi_comm_world, ierr)
907#endif
908
909 end subroutine s_mpi_barrier
910
911 !> The subroutine finalizes the MPI execution environment.
912 impure subroutine s_mpi_finalize
913
914#ifdef MFC_MPI
915 integer :: ierr !< Generic flag used to identify and report MPI errors
916
917 call mpi_finalize(ierr)
918#endif
919
920 end subroutine s_mpi_finalize
921
922 !> The goal of this procedure is to populate the buffers of the cell-average conservative variables by communicating with the
923 !! neighboring processors.
924 subroutine s_mpi_sendrecv_variables_buffers(q_comm, mpi_dir, pbc_loc, nVar, pb_in, mv_in, q_T_sf)
925
926 type(scalar_field), dimension(1:), intent(inout) :: q_comm
927 real(stp), optional, dimension(idwbuff(1)%beg:,idwbuff(2)%beg:,idwbuff(3)%beg:,1:,1:), intent(inout) :: pb_in, mv_in
928 integer, intent(in) :: mpi_dir, pbc_loc, nVar
929 integer :: i, j, k, l, r, q !< Generic loop iterators
930 integer :: buffer_counts(1:3), buffer_count
931 type(int_bounds_info) :: boundary_conditions(1:3)
932 integer :: beg_end(1:2), grid_dims(1:3)
933 integer :: dst_proc, src_proc, recv_tag, send_tag
934 logical :: beg_end_geq_0, qbmm_comm, chem_diff_comm
935 integer :: pack_offset, unpack_offset
936 type(scalar_field), optional, intent(inout) :: q_T_sf
937
938#ifdef MFC_MPI
939 integer :: ierr !< Generic flag used to identify and report MPI errors
940
941 call nvtxstartrange("RHS-COMM-PACKBUF")
942
943 qbmm_comm = .false.
944 chem_diff_comm = .false.
945
946 if (present(pb_in) .and. present(mv_in) .and. qbmm .and. .not. polytropic) then
947 qbmm_comm = .true.
948 v_size = nvar + 2*nb*nnode
949 buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
950 & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
951 else if (present(q_t_sf) .and. chemistry .and. chem_params%diffusion) then
952 chem_diff_comm = .true.
953 v_size = nvar + 1
954 buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
955 & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
956 else
957 v_size = nvar
958 buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
959 & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
960 end if
961
962
963# 539 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
964#if defined(MFC_OpenACC)
965# 539 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
966!$acc update device(v_size)
967# 539 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
968#elif defined(MFC_OpenMP)
969# 539 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
970!$omp target update to(v_size)
971# 539 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
972#endif
973
974 buffer_count = buffer_counts(mpi_dir)
975 boundary_conditions = (/bc_x, bc_y, bc_z/)
976 beg_end = (/boundary_conditions(mpi_dir)%beg, boundary_conditions(mpi_dir)%end/)
977 beg_end_geq_0 = beg_end(max(pbc_loc, 0) - pbc_loc + 1) >= 0
978
979 ! Implements: pbc_loc bc_x >= 0 -> [send/recv]_tag [dst/src]_proc -1 (=0) 0 -> [1,0] [0,0] | 0 0 [1,0] [beg,beg] -1 (=0) 1
980 ! -> [0,0] [1,0] | 0 1 [0,0] [end,beg] +1 (=1) 0 -> [0,1] [1,1] | 1 0 [0,1] [end,end] +1 (=1) 1 -> [1,1] [0,1] | 1 1 [1,1]
981 ! [beg,end]
982
983 send_tag = f_logical_to_int(.not. f_xor(beg_end_geq_0, pbc_loc == 1))
984 recv_tag = f_logical_to_int(pbc_loc == 1)
985
986 dst_proc = beg_end(1 + f_logical_to_int(f_xor(pbc_loc == 1, beg_end_geq_0)))
987 src_proc = beg_end(1 + f_logical_to_int(pbc_loc == 1))
988
989 grid_dims = (/m, n, p/)
990
991 pack_offset = 0
992 if (f_xor(pbc_loc == 1, beg_end_geq_0)) then
993 pack_offset = grid_dims(mpi_dir) - buff_size + 1
994 end if
995
996 unpack_offset = 0
997 if (pbc_loc == 1) then
998 unpack_offset = grid_dims(mpi_dir) + buff_size + 1
999 end if
1000
1001 ! Pack Buffer to Send
1002# 570 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1003 if (mpi_dir == 1) then
1004# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1005
1006# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1007
1008# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1009#if defined(MFC_OpenACC)
1010# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1011!$acc parallel loop collapse(4) gang vector default(present) private(r)
1012# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1013#elif defined(MFC_OpenMP)
1014# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1015
1016# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1017
1018# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1019
1020# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1021!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1022# 572 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1023#endif
1024 do l = 0, p
1025 do k = 0, n
1026 do j = 0, buff_size - 1
1027 do i = 1, nvar
1028 r = (i - 1) + v_size*(j + buff_size*(k + (n + 1)*l))
1029 buff_send(r) = real(q_comm(i)%sf(j + pack_offset, k, l), kind=wp)
1030 end do
1031 end do
1032 end do
1033 end do
1034
1035# 583 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1036#if defined(MFC_OpenACC)
1037# 583 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1038!$acc end parallel loop
1039# 583 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1040#elif defined(MFC_OpenMP)
1041# 583 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1042
1043# 583 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1044!$omp end target teams loop
1045# 583 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1046#endif
1047
1048 if (chem_diff_comm) then
1049
1050# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1051
1052# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1053#if defined(MFC_OpenACC)
1054# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1055!$acc parallel loop collapse(3) gang vector default(present) private(r)
1056# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1057#elif defined(MFC_OpenMP)
1058# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1059
1060# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1061
1062# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1063
1064# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1065!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1066# 586 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1067#endif
1068 do l = 0, p
1069 do k = 0, n
1070 do j = 0, buff_size - 1
1071 r = nvar + v_size*(j + buff_size*(k + (n + 1)*l))
1072 buff_send(r) = real(q_t_sf%sf(j + pack_offset, k, l), kind=wp)
1073 end do
1074 end do
1075 end do
1076
1077# 595 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1078#if defined(MFC_OpenACC)
1079# 595 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1080!$acc end parallel loop
1081# 595 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1082#elif defined(MFC_OpenMP)
1083# 595 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1084
1085# 595 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1086!$omp end target teams loop
1087# 595 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1088#endif
1089 end if
1090
1091 if (qbmm_comm) then
1092
1093# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1094
1095# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1096#if defined(MFC_OpenACC)
1097# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1098!$acc parallel loop collapse(4) gang vector default(present) private(r)
1099# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1100#elif defined(MFC_OpenMP)
1101# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1102
1103# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1104
1105# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1106
1107# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1108!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1109# 599 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1110#endif
1111 do l = 0, p
1112 do k = 0, n
1113 do j = 0, buff_size - 1
1114 do i = nvar + 1, nvar + nnode
1115 do q = 1, nb
1116 r = (i - 1) + (q - 1)*nnode + v_size*(j + buff_size*(k + (n + 1)*l))
1117 buff_send(r) = real(pb_in(j + pack_offset, k, l, i - nvar, q), kind=wp)
1118 end do
1119 end do
1120 end do
1121 end do
1122 end do
1123
1124# 612 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1125#if defined(MFC_OpenACC)
1126# 612 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1127!$acc end parallel loop
1128# 612 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1129#elif defined(MFC_OpenMP)
1130# 612 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1131
1132# 612 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1133!$omp end target teams loop
1134# 612 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1135#endif
1136
1137
1138# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1139
1140# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1141#if defined(MFC_OpenACC)
1142# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1143!$acc parallel loop collapse(5) gang vector default(present) private(r)
1144# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1145#elif defined(MFC_OpenMP)
1146# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1147
1148# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1149
1150# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1151
1152# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1153!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1154# 614 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1155#endif
1156 do l = 0, p
1157 do k = 0, n
1158 do j = 0, buff_size - 1
1159 do i = nvar + 1, nvar + nnode
1160 do q = 1, nb
1161 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*(j + buff_size*(k + (n + 1)*l))
1162 buff_send(r) = real(mv_in(j + pack_offset, k, l, i - nvar, q), kind=wp)
1163 end do
1164 end do
1165 end do
1166 end do
1167 end do
1168
1169# 627 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1170#if defined(MFC_OpenACC)
1171# 627 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1172!$acc end parallel loop
1173# 627 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1174#elif defined(MFC_OpenMP)
1175# 627 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1176
1177# 627 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1178!$omp end target teams loop
1179# 627 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1180#endif
1181 end if
1182# 752 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1183 end if
1184# 570 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1185 if (mpi_dir == 2) then
1186# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1187
1188# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1189
1190# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1191#if defined(MFC_OpenACC)
1192# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1193!$acc parallel loop collapse(4) gang vector default(present) private(r)
1194# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1195#elif defined(MFC_OpenMP)
1196# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1197
1198# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1199
1200# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1201
1202# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1203!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1204# 630 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1205#endif
1206 do i = 1, nvar
1207 do l = 0, p
1208 do k = 0, buff_size - 1
1209 do j = -buff_size, m + buff_size
1210 r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k + buff_size*l))
1211 buff_send(r) = real(q_comm(i)%sf(j, k + pack_offset, l), kind=wp)
1212 end do
1213 end do
1214 end do
1215 end do
1216
1217# 641 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1218#if defined(MFC_OpenACC)
1219# 641 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1220!$acc end parallel loop
1221# 641 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1222#elif defined(MFC_OpenMP)
1223# 641 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1224
1225# 641 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1226!$omp end target teams loop
1227# 641 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1228#endif
1229
1230 if (chem_diff_comm) then
1231
1232# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1233
1234# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1235#if defined(MFC_OpenACC)
1236# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1237!$acc parallel loop collapse(3) gang vector default(present) private(r)
1238# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1239#elif defined(MFC_OpenMP)
1240# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1241
1242# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1243
1244# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1245
1246# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1247!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1248# 644 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1249#endif
1250 do l = 0, p
1251 do k = 0, buff_size - 1
1252 do j = -buff_size, m + buff_size
1253 r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k + buff_size*l))
1254 buff_send(r) = real(q_t_sf%sf(j, k + pack_offset, l), kind=wp)
1255 end do
1256 end do
1257 end do
1258
1259# 653 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1260#if defined(MFC_OpenACC)
1261# 653 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1262!$acc end parallel loop
1263# 653 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1264#elif defined(MFC_OpenMP)
1265# 653 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1266
1267# 653 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1268!$omp end target teams loop
1269# 653 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1270#endif
1271 end if
1272
1273 if (qbmm_comm) then
1274
1275# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1276
1277# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1278#if defined(MFC_OpenACC)
1279# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1280!$acc parallel loop collapse(5) gang vector default(present) private(r)
1281# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1282#elif defined(MFC_OpenMP)
1283# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1284
1285# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1286
1287# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1288
1289# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1290!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1291# 657 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1292#endif
1293 do i = nvar + 1, nvar + nnode
1294 do l = 0, p
1295 do k = 0, buff_size - 1
1296 do j = -buff_size, m + buff_size
1297 do q = 1, nb
1298 r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k &
1299 & + buff_size*l))
1300 buff_send(r) = real(pb_in(j, k + pack_offset, l, i - nvar, q), kind=wp)
1301 end do
1302 end do
1303 end do
1304 end do
1305 end do
1306
1307# 671 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1308#if defined(MFC_OpenACC)
1309# 671 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1310!$acc end parallel loop
1311# 671 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1312#elif defined(MFC_OpenMP)
1313# 671 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1314
1315# 671 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1316!$omp end target teams loop
1317# 671 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1318#endif
1319
1320
1321# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1322
1323# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1324#if defined(MFC_OpenACC)
1325# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1326!$acc parallel loop collapse(5) gang vector default(present) private(r)
1327# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1328#elif defined(MFC_OpenMP)
1329# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1330
1331# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1332
1333# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1334
1335# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1336!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1337# 673 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1338#endif
1339 do i = nvar + 1, nvar + nnode
1340 do l = 0, p
1341 do k = 0, buff_size - 1
1342 do j = -buff_size, m + buff_size
1343 do q = 1, nb
1344 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
1345 & + 1)*(k + buff_size*l))
1346 buff_send(r) = real(mv_in(j, k + pack_offset, l, i - nvar, q), kind=wp)
1347 end do
1348 end do
1349 end do
1350 end do
1351 end do
1352
1353# 687 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1354#if defined(MFC_OpenACC)
1355# 687 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1356!$acc end parallel loop
1357# 687 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1358#elif defined(MFC_OpenMP)
1359# 687 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1360
1361# 687 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1362!$omp end target teams loop
1363# 687 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1364#endif
1365 end if
1366# 752 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1367 end if
1368# 570 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1369 if (mpi_dir == 3) then
1370# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1371
1372# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1373
1374# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1375#if defined(MFC_OpenACC)
1376# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1377!$acc parallel loop collapse(4) gang vector default(present) private(r)
1378# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1379#elif defined(MFC_OpenMP)
1380# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1381
1382# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1383
1384# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1385
1386# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1387!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1388# 690 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1389#endif
1390 do i = 1, nvar
1391 do l = 0, buff_size - 1
1392 do k = -buff_size, n + buff_size
1393 do j = -buff_size, m + buff_size
1394 r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
1395 & + 2*buff_size + 1)*l))
1396 buff_send(r) = real(q_comm(i)%sf(j, k, l + pack_offset), kind=wp)
1397 end do
1398 end do
1399 end do
1400 end do
1401
1402# 702 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1403#if defined(MFC_OpenACC)
1404# 702 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1405!$acc end parallel loop
1406# 702 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1407#elif defined(MFC_OpenMP)
1408# 702 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1409
1410# 702 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1411!$omp end target teams loop
1412# 702 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1413#endif
1414
1415 if (chem_diff_comm) then
1416
1417# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1418
1419# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1420#if defined(MFC_OpenACC)
1421# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1422!$acc parallel loop collapse(3) gang vector default(present) private(r)
1423# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1424#elif defined(MFC_OpenMP)
1425# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1426
1427# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1428
1429# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1430
1431# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1432!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1433# 705 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1434#endif
1435 do l = 0, buff_size - 1
1436 do k = -buff_size, n + buff_size
1437 do j = -buff_size, m + buff_size
1438 r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
1439 & + 2*buff_size + 1)*l))
1440 buff_send(r) = real(q_t_sf%sf(j, k, l + pack_offset), kind=wp)
1441 end do
1442 end do
1443 end do
1444
1445# 715 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1446#if defined(MFC_OpenACC)
1447# 715 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1448!$acc end parallel loop
1449# 715 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1450#elif defined(MFC_OpenMP)
1451# 715 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1452
1453# 715 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1454!$omp end target teams loop
1455# 715 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1456#endif
1457 end if
1458
1459 if (qbmm_comm) then
1460
1461# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1462
1463# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1464#if defined(MFC_OpenACC)
1465# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1466!$acc parallel loop collapse(5) gang vector default(present) private(r)
1467# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1468#elif defined(MFC_OpenMP)
1469# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1470
1471# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1472
1473# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1474
1475# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1476!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1477# 719 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1478#endif
1479 do i = nvar + 1, nvar + nnode
1480 do l = 0, buff_size - 1
1481 do k = -buff_size, n + buff_size
1482 do j = -buff_size, m + buff_size
1483 do q = 1, nb
1484 r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
1485 & + buff_size) + (n + 2*buff_size + 1)*l))
1486 buff_send(r) = real(pb_in(j, k, l + pack_offset, i - nvar, q), kind=wp)
1487 end do
1488 end do
1489 end do
1490 end do
1491 end do
1492
1493# 733 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1494#if defined(MFC_OpenACC)
1495# 733 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1496!$acc end parallel loop
1497# 733 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1498#elif defined(MFC_OpenMP)
1499# 733 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1500
1501# 733 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1502!$omp end target teams loop
1503# 733 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1504#endif
1505
1506
1507# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1508
1509# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1510#if defined(MFC_OpenACC)
1511# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1512!$acc parallel loop collapse(5) gang vector default(present) private(r)
1513# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1514#elif defined(MFC_OpenMP)
1515# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1516
1517# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1518
1519# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1520
1521# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1522!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1523# 735 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1524#endif
1525 do i = nvar + 1, nvar + nnode
1526 do l = 0, buff_size - 1
1527 do k = -buff_size, n + buff_size
1528 do j = -buff_size, m + buff_size
1529 do q = 1, nb
1530 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
1531 & + 1)*((k + buff_size) + (n + 2*buff_size + 1)*l))
1532 buff_send(r) = real(mv_in(j, k, l + pack_offset, i - nvar, q), kind=wp)
1533 end do
1534 end do
1535 end do
1536 end do
1537 end do
1538
1539# 749 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1540#if defined(MFC_OpenACC)
1541# 749 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1542!$acc end parallel loop
1543# 749 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1544#elif defined(MFC_OpenMP)
1545# 749 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1546
1547# 749 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1548!$omp end target teams loop
1549# 749 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1550#endif
1551 end if
1552# 752 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1553 end if
1554# 754 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1555 call nvtxendrange ! Packbuf
1556
1557 ! Send/Recv
1558#ifdef MFC_SIMULATION
1559# 759 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1560 if (rdma_mpi .eqv. .false.) then
1561# 771 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1562 call nvtxstartrange("RHS-COMM-DEV2HOST")
1563
1564# 772 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1565#if defined(MFC_OpenACC)
1566# 772 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1567!$acc update host(buff_send)
1568# 772 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1569#elif defined(MFC_OpenMP)
1570# 772 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1571!$omp target update from(buff_send)
1572# 772 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1573#endif
1574 call nvtxendrange
1575 call nvtxstartrange("RHS-COMM-SENDRECV-NO-RMDA")
1576
1577 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
1578 & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)
1579
1580 call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
1581
1582 call nvtxstartrange("RHS-COMM-HOST2DEV")
1583
1584# 782 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1585#if defined(MFC_OpenACC)
1586# 782 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1587!$acc update device(buff_recv)
1588# 782 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1589#elif defined(MFC_OpenMP)
1590# 782 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1591!$omp target update to(buff_recv)
1592# 782 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1593#endif
1594 call nvtxendrange
1595# 785 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1596 end if
1597# 759 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1598 if (rdma_mpi .eqv. .true.) then
1599# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1600
1601# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1602#if defined(MFC_OpenACC)
1603# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1604!$acc host_data use_device(buff_send, buff_recv)
1605# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1606 call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")
1607# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1608
1609# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1610 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
1611# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1612 & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)
1613# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1614
1615# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1616 call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
1617# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1618!$acc end host_data
1619# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1620#elif defined(MFC_OpenMP)
1621# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1622!$omp target data use_device_addr(buff_send, buff_recv)
1623# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1624 call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")
1625# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1626
1627# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1628 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
1629# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1630 & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)
1631# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1632
1633# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1634 call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
1635# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1636!$omp end target data
1637# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1638#else
1639# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1640 call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")
1641# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1642
1643# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1644 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
1645# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1646 & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)
1647# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1648
1649# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1650 call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
1651# 761 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1652#endif
1653# 769 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1654
1655# 769 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1656#if defined(MFC_OpenACC)
1657# 769 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1658!$acc wait
1659# 769 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1660#elif defined(MFC_OpenMP)
1661# 769 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1662!$omp barrier
1663# 769 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1664#endif
1665# 785 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1666 end if
1667# 787 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1668#else
1669 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, src_proc, recv_tag, &
1670 & mpi_comm_world, mpi_status_ignore, ierr)
1671#endif
1672
1673 ! Unpack Received Buffer
1674 call nvtxstartrange("RHS-COMM-UNPACKBUF")
1675# 795 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1676 if (mpi_dir == 1) then
1677# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1678
1679# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1680
1681# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1682#if defined(MFC_OpenACC)
1683# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1684!$acc parallel loop collapse(4) gang vector default(present) private(r)
1685# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1686#elif defined(MFC_OpenMP)
1687# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1688
1689# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1690
1691# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1692
1693# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1694!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1695# 797 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1696#endif
1697 do l = 0, p
1698 do k = 0, n
1699 do j = -buff_size, -1
1700 do i = 1, nvar
1701 r = (i - 1) + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
1702 q_comm(i)%sf(j + unpack_offset, k, l) = real(buff_recv(r), kind=stp)
1703#if defined(__INTEL_COMPILER)
1704 if (ieee_is_nan(q_comm(i)%sf(j + unpack_offset, k, l))) then
1705 print *, "Error", j, k, l, i
1706 call s_mpi_abort("NaN(s) in recv")
1707 end if
1708#endif
1709 end do
1710 end do
1711 end do
1712 end do
1713
1714# 814 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1715#if defined(MFC_OpenACC)
1716# 814 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1717!$acc end parallel loop
1718# 814 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1719#elif defined(MFC_OpenMP)
1720# 814 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1721
1722# 814 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1723!$omp end target teams loop
1724# 814 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1725#endif
1726
1727 if (chem_diff_comm) then
1728
1729# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1730
1731# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1732#if defined(MFC_OpenACC)
1733# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1734!$acc parallel loop collapse(3) gang vector default(present) private(r)
1735# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1736#elif defined(MFC_OpenMP)
1737# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1738
1739# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1740
1741# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1742
1743# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1744!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1745# 817 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1746#endif
1747 do l = 0, p
1748 do k = 0, n
1749 do j = -buff_size, -1
1750 r = nvar + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
1751 q_t_sf%sf(j + unpack_offset, k, l) = real(buff_recv(r), kind=stp)
1752#if defined(__INTEL_COMPILER)
1753 if (ieee_is_nan(q_t_sf%sf(j + unpack_offset, k, l))) then
1754 print *, "Error", j, k, l
1755 call s_mpi_abort("NaN(s) in recv")
1756 end if
1757#endif
1758 end do
1759 end do
1760 end do
1761
1762# 832 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1763#if defined(MFC_OpenACC)
1764# 832 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1765!$acc end parallel loop
1766# 832 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1767#elif defined(MFC_OpenMP)
1768# 832 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1769
1770# 832 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1771!$omp end target teams loop
1772# 832 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1773#endif
1774 end if
1775
1776 if (qbmm_comm) then
1777
1778# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1779
1780# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1781#if defined(MFC_OpenACC)
1782# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1783!$acc parallel loop collapse(5) gang vector default(present) private(r)
1784# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1785#elif defined(MFC_OpenMP)
1786# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1787
1788# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1789
1790# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1791
1792# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1793!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1794# 836 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1795#endif
1796 do l = 0, p
1797 do k = 0, n
1798 do j = -buff_size, -1
1799 do i = nvar + 1, nvar + nnode
1800 do q = 1, nb
1801 r = (i - 1) + (q - 1)*nnode + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
1802 pb_in(j + unpack_offset, k, l, i - nvar, q) = real(buff_recv(r), kind=stp)
1803 end do
1804 end do
1805 end do
1806 end do
1807 end do
1808
1809# 849 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1810#if defined(MFC_OpenACC)
1811# 849 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1812!$acc end parallel loop
1813# 849 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1814#elif defined(MFC_OpenMP)
1815# 849 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1816
1817# 849 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1818!$omp end target teams loop
1819# 849 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1820#endif
1821
1822
1823# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1824
1825# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1826#if defined(MFC_OpenACC)
1827# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1828!$acc parallel loop collapse(5) gang vector default(present) private(r)
1829# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1830#elif defined(MFC_OpenMP)
1831# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1832
1833# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1834
1835# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1836
1837# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1838!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1839# 851 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1840#endif
1841 do l = 0, p
1842 do k = 0, n
1843 do j = -buff_size, -1
1844 do i = nvar + 1, nvar + nnode
1845 do q = 1, nb
1846 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
1847 mv_in(j + unpack_offset, k, l, i - nvar, q) = real(buff_recv(r), kind=stp)
1848 end do
1849 end do
1850 end do
1851 end do
1852 end do
1853
1854# 864 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1855#if defined(MFC_OpenACC)
1856# 864 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1857!$acc end parallel loop
1858# 864 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1859#elif defined(MFC_OpenMP)
1860# 864 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1861
1862# 864 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1863!$omp end target teams loop
1864# 864 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1865#endif
1866 end if
1867# 1013 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1868 end if
1869# 795 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1870 if (mpi_dir == 2) then
1871# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1872
1873# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1874
1875# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1876#if defined(MFC_OpenACC)
1877# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1878!$acc parallel loop collapse(4) gang vector default(present) private(r)
1879# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1880#elif defined(MFC_OpenMP)
1881# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1882
1883# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1884
1885# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1886
1887# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1888!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1889# 867 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1890#endif
1891 do i = 1, nvar
1892 do l = 0, p
1893 do k = -buff_size, -1
1894 do j = -buff_size, m + buff_size
1895 r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + buff_size*l))
1896 q_comm(i)%sf(j, k + unpack_offset, l) = real(buff_recv(r), kind=stp)
1897#if defined(__INTEL_COMPILER)
1898 if (ieee_is_nan(q_comm(i)%sf(j, k + unpack_offset, l))) then
1899 print *, "Error", j, k, l, i
1900 call s_mpi_abort("NaN(s) in recv")
1901 end if
1902#endif
1903 end do
1904 end do
1905 end do
1906 end do
1907
1908# 884 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1909#if defined(MFC_OpenACC)
1910# 884 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1911!$acc end parallel loop
1912# 884 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1913#elif defined(MFC_OpenMP)
1914# 884 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1915
1916# 884 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1917!$omp end target teams loop
1918# 884 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1919#endif
1920
1921 if (chem_diff_comm) then
1922
1923# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1924
1925# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1926#if defined(MFC_OpenACC)
1927# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1928!$acc parallel loop collapse(3) gang vector default(present) private(r)
1929# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1930#elif defined(MFC_OpenMP)
1931# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1932
1933# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1934
1935# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1936
1937# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1938!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1939# 887 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1940#endif
1941 do l = 0, p
1942 do k = -buff_size, -1
1943 do j = -buff_size, m + buff_size
1944 r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + buff_size*l))
1945 q_t_sf%sf(j, k + unpack_offset, l) = real(buff_recv(r), kind=stp)
1946#if defined(__INTEL_COMPILER)
1947 if (ieee_is_nan(q_t_sf%sf(j, k + unpack_offset, l))) then
1948 print *, "Error", j, k, l
1949 call s_mpi_abort("NaN(s) in recv")
1950 end if
1951#endif
1952 end do
1953 end do
1954 end do
1955
1956# 902 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1957#if defined(MFC_OpenACC)
1958# 902 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1959!$acc end parallel loop
1960# 902 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1961#elif defined(MFC_OpenMP)
1962# 902 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1963
1964# 902 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1965!$omp end target teams loop
1966# 902 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1967#endif
1968 end if
1969
1970 if (qbmm_comm) then
1971
1972# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1973
1974# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1975#if defined(MFC_OpenACC)
1976# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1977!$acc parallel loop collapse(5) gang vector default(present) private(r)
1978# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1979#elif defined(MFC_OpenMP)
1980# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1981
1982# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1983
1984# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1985
1986# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1987!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1988# 906 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1989#endif
1990 do i = nvar + 1, nvar + nnode
1991 do l = 0, p
1992 do k = -buff_size, -1
1993 do j = -buff_size, m + buff_size
1994 do q = 1, nb
1995 r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
1996 & + buff_size) + buff_size*l))
1997 pb_in(j, k + unpack_offset, l, i - nvar, q) = real(buff_recv(r), kind=stp)
1998 end do
1999 end do
2000 end do
2001 end do
2002 end do
2003
2004# 920 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2005#if defined(MFC_OpenACC)
2006# 920 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2007!$acc end parallel loop
2008# 920 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2009#elif defined(MFC_OpenMP)
2010# 920 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2011
2012# 920 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2013!$omp end target teams loop
2014# 920 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2015#endif
2016
2017
2018# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2019
2020# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2021#if defined(MFC_OpenACC)
2022# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2023!$acc parallel loop collapse(5) gang vector default(present) private(r)
2024# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2025#elif defined(MFC_OpenMP)
2026# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2027
2028# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2029
2030# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2031
2032# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2033!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2034# 922 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2035#endif
2036 do i = nvar + 1, nvar + nnode
2037 do l = 0, p
2038 do k = -buff_size, -1
2039 do j = -buff_size, m + buff_size
2040 do q = 1, nb
2041 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
2042 & + 1)*((k + buff_size) + buff_size*l))
2043 mv_in(j, k + unpack_offset, l, i - nvar, q) = real(buff_recv(r), kind=stp)
2044 end do
2045 end do
2046 end do
2047 end do
2048 end do
2049
2050# 936 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2051#if defined(MFC_OpenACC)
2052# 936 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2053!$acc end parallel loop
2054# 936 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2055#elif defined(MFC_OpenMP)
2056# 936 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2057
2058# 936 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2059!$omp end target teams loop
2060# 936 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2061#endif
2062 end if
2063# 1013 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2064 end if
2065# 795 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2066 if (mpi_dir == 3) then
2067# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2068
2069# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2070
2071# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2072#if defined(MFC_OpenACC)
2073# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2074!$acc parallel loop collapse(4) gang vector default(present) private(r)
2075# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2076#elif defined(MFC_OpenMP)
2077# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2078
2079# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2080
2081# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2082
2083# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2084!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2085# 939 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2086#endif
2087 do i = 1, nvar
2088 do l = -buff_size, -1
2089 do k = -buff_size, n + buff_size
2090 do j = -buff_size, m + buff_size
2091 r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
2092 & + 2*buff_size + 1)*(l + buff_size)))
2093 q_comm(i)%sf(j, k, l + unpack_offset) = real(buff_recv(r), kind=stp)
2094#if defined(__INTEL_COMPILER)
2095 if (ieee_is_nan(q_comm(i)%sf(j, k, l + unpack_offset))) then
2096 print *, "Error", j, k, l, i
2097 call s_mpi_abort("NaN(s) in recv")
2098 end if
2099#endif
2100 end do
2101 end do
2102 end do
2103 end do
2104
2105# 957 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2106#if defined(MFC_OpenACC)
2107# 957 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2108!$acc end parallel loop
2109# 957 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2110#elif defined(MFC_OpenMP)
2111# 957 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2112
2113# 957 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2114!$omp end target teams loop
2115# 957 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2116#endif
2117
2118 if (chem_diff_comm) then
2119
2120# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2121
2122# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2123#if defined(MFC_OpenACC)
2124# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2125!$acc parallel loop collapse(3) gang vector default(present) private(r)
2126# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2127#elif defined(MFC_OpenMP)
2128# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2129
2130# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2131
2132# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2133
2134# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2135!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2136# 960 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2137#endif
2138 do l = -buff_size, -1
2139 do k = -buff_size, n + buff_size
2140 do j = -buff_size, m + buff_size
2141 r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
2142 & + 2*buff_size + 1)*(l + buff_size)))
2143 q_t_sf%sf(j, k, l + unpack_offset) = real(buff_recv(r), kind=stp)
2144#if defined(__INTEL_COMPILER)
2145 if (ieee_is_nan(q_t_sf%sf(j, k, l + unpack_offset))) then
2146 print *, "Error", j, k, l
2147 call s_mpi_abort("NaN(s) in recv")
2148 end if
2149#endif
2150 end do
2151 end do
2152 end do
2153
2154# 976 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2155#if defined(MFC_OpenACC)
2156# 976 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2157!$acc end parallel loop
2158# 976 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2159#elif defined(MFC_OpenMP)
2160# 976 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2161
2162# 976 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2163!$omp end target teams loop
2164# 976 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2165#endif
2166 end if
2167
2168 if (qbmm_comm) then
2169
2170# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2171
2172# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2173#if defined(MFC_OpenACC)
2174# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2175!$acc parallel loop collapse(5) gang vector default(present) private(r)
2176# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2177#elif defined(MFC_OpenMP)
2178# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2179
2180# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2181
2182# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2183
2184# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2185!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2186# 980 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2187#endif
2188 do i = nvar + 1, nvar + nnode
2189 do l = -buff_size, -1
2190 do k = -buff_size, n + buff_size
2191 do j = -buff_size, m + buff_size
2192 do q = 1, nb
2193 r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
2194 & + buff_size) + (n + 2*buff_size + 1)*(l + buff_size)))
2195 pb_in(j, k, l + unpack_offset, i - nvar, q) = real(buff_recv(r), kind=stp)
2196 end do
2197 end do
2198 end do
2199 end do
2200 end do
2201
2202# 994 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2203#if defined(MFC_OpenACC)
2204# 994 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2205!$acc end parallel loop
2206# 994 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2207#elif defined(MFC_OpenMP)
2208# 994 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2209
2210# 994 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2211!$omp end target teams loop
2212# 994 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2213#endif
2214
2215
2216# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2217
2218# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2219#if defined(MFC_OpenACC)
2220# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2221!$acc parallel loop collapse(5) gang vector default(present) private(r)
2222# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2223#elif defined(MFC_OpenMP)
2224# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2225
2226# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2227
2228# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2229
2230# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2231!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2232# 996 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2233#endif
2234 do i = nvar + 1, nvar + nnode
2235 do l = -buff_size, -1
2236 do k = -buff_size, n + buff_size
2237 do j = -buff_size, m + buff_size
2238 do q = 1, nb
2239 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
2240 & + 1)*((k + buff_size) + (n + 2*buff_size + 1)*(l + buff_size)))
2241 mv_in(j, k, l + unpack_offset, i - nvar, q) = real(buff_recv(r), kind=stp)
2242 end do
2243 end do
2244 end do
2245 end do
2246 end do
2247
2248# 1010 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2249#if defined(MFC_OpenACC)
2250# 1010 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2251!$acc end parallel loop
2252# 1010 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2253#elif defined(MFC_OpenMP)
2254# 1010 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2255
2256# 1010 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2257!$omp end target teams loop
2258# 1010 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2259#endif
2260 end if
2261# 1013 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2262 end if
2263# 1015 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2264 call nvtxendrange
2265#endif
2266
2268
2269 !> Decompose the computational domain among processors by balancing cells per rank in each coordinate direction.
2271
2272#ifdef MFC_MPI
2273 integer :: num_procs_x, num_procs_y, num_procs_z !< Optimal number of processors in the x-, y- and z-directions
2274 !> Non-optimal number of processors in the x-, y- and z-directions
2275 real(wp) :: tmp_num_procs_x, tmp_num_procs_y, tmp_num_procs_z
2276 real(wp) :: fct_min !< Processor factorization (fct) minimization parameter
2277 integer :: MPI_COMM_CART !< Cartesian processor topology communicator
2278 integer :: rem_cells !< Remaining cells after distribution among processors
2279 integer :: recon_order !< WENO or MUSCL reconstruction order
2280 integer :: i, j !< Generic loop iterators
2281 integer :: ierr !< Generic flag used to identify and report MPI errors
2282
2283 if (recon_type == weno_type) then
2284 recon_order = weno_order
2285 else
2286 recon_order = muscl_order
2287 end if
2288
2289 if (num_procs == 1 .and. parallel_io) then
2290 do i = 1, num_dims
2291 start_idx(i) = 0
2292 end do
2293 return
2294 end if
2295
2296 if (igr) then
2297 recon_order = igr_order
2298 end if
2299
2300 ! 3D Cartesian Processor Topology
2301 if (n > 0) then
2302 if (p > 0) then
2303 if (fft_wrt) then
2304 ! Initial estimate of optimal processor topology
2305 num_procs_x = 1
2306 num_procs_y = 1
2307 num_procs_z = num_procs
2308 ierr = -1
2309
2310 ! Benchmarking the quality of this initial guess
2311 tmp_num_procs_y = num_procs_y
2312 tmp_num_procs_z = num_procs_z
2313 fct_min = 10._wp*abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
2314
2315 ! Optimization of the initial processor topology
2316 do i = 1, num_procs
2317 if (mod(num_procs, i) == 0 .and. (n + 1)/i >= num_stcls_min*recon_order) then
2318 tmp_num_procs_y = i
2319 tmp_num_procs_z = num_procs/i
2320
2321 if (fct_min >= abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z) .and. (p + 1) &
2322 & /tmp_num_procs_z >= num_stcls_min*recon_order) then
2323 num_procs_y = i
2324 num_procs_z = num_procs/i
2325 fct_min = abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
2326 ierr = 0
2327 end if
2328 end if
2329 end do
2330 else
2331 if (cyl_coord .and. p > 0) then
2332 ! Pencil blocking for cylindrical coordinates (Fourier filter near axis)
2333
2334 ! Initial values of the processor factorization optimization
2335 num_procs_x = 1
2336 num_procs_y = num_procs
2337 num_procs_z = 1
2338 ierr = -1
2339
2340 ! Computing minimization variable for these initial values
2341 tmp_num_procs_x = num_procs_x
2342 tmp_num_procs_y = num_procs_y
2343 tmp_num_procs_z = num_procs_z
2344 fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
2345
2346 ! Searching for optimal computational domain distribution
2347 do i = 1, num_procs
2348 if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
2349 tmp_num_procs_x = i
2350 tmp_num_procs_y = num_procs/i
2351
2352 if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) .and. (n + 1) &
2353 & /tmp_num_procs_y >= num_stcls_min*recon_order) then
2354 num_procs_x = i
2355 num_procs_y = num_procs/i
2356 fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
2357 ierr = 0
2358 end if
2359 end if
2360 end do
2361 else
2362 ! Initial estimate of optimal processor topology
2363 num_procs_x = 1
2364 num_procs_y = 1
2365 num_procs_z = num_procs
2366 ierr = -1
2367
2368 ! Benchmarking the quality of this initial guess
2369 tmp_num_procs_x = num_procs_x
2370 tmp_num_procs_y = num_procs_y
2371 tmp_num_procs_z = num_procs_z
2372 fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + 10._wp*abs((n + 1) &
2373 & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
2374
2375 ! Optimization of the initial processor topology
2376 do i = 1, num_procs
2377 if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
2378 do j = 1, num_procs/i
2379 if (mod(num_procs/i, j) == 0 .and. (n + 1)/j >= num_stcls_min*recon_order) then
2380 tmp_num_procs_x = i
2381 tmp_num_procs_y = j
2382 tmp_num_procs_z = num_procs/(i*j)
2383
2384 if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + abs((n + 1) &
2385 & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z) .and. (p + 1) &
2386 & /tmp_num_procs_z >= num_stcls_min*recon_order) then
2387 num_procs_x = i
2388 num_procs_y = j
2389 num_procs_z = num_procs/(i*j)
2390 fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + abs((n + 1) &
2391 & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
2392 ierr = 0
2393 end if
2394 end if
2395 end do
2396 end if
2397 end do
2398 end if
2399 end if
2400
2401 ! Verifying that a valid decomposition of the computational domain has been established. If not, the simulation
2402 ! exits.
2403 if (proc_rank == 0 .and. ierr == -1) then
2404 call s_mpi_abort('Unsupported combination of values ' // 'of num_procs, m, n, p and ' &
2405 & // 'weno/muscl/igr_order. Exiting.')
2406 end if
2407
2408 ! Creating new communicator using the Cartesian topology
2409 call mpi_cart_create(mpi_comm_world, 3, (/num_procs_x, num_procs_y, num_procs_z/), (/.true., .true., .true./), &
2410 & .false., mpi_comm_cart, ierr)
2411
2412 ! Finding the Cartesian coordinates of the local process
2413 call mpi_cart_coords(mpi_comm_cart, proc_rank, 3, proc_coords, ierr)
2414
2415 ! Global Parameters for z-direction
2416
2417 ! Number of remaining cells
2418 rem_cells = mod(p + 1, num_procs_z)
2419
2420 ! Optimal number of cells per processor
2421 p = (p + 1)/num_procs_z - 1
2422
2423 ! Distributing the remaining cells
2424 do i = 1, rem_cells
2425 if (proc_coords(3) == i - 1) then
2426 p = p + 1; exit
2427 end if
2428 end do
2429
2430 ! Boundary condition at the beginning
2431 if (proc_coords(3) > 0 .or. (bc_z%beg == bc_periodic .and. num_procs_z > 1)) then
2432 proc_coords(3) = proc_coords(3) - 1
2433 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_z%beg, ierr)
2434 proc_coords(3) = proc_coords(3) + 1
2435 end if
2436
2437 ! Boundary condition at the end
2438 if (proc_coords(3) < num_procs_z - 1 .or. (bc_z%end == bc_periodic .and. num_procs_z > 1)) then
2439 proc_coords(3) = proc_coords(3) + 1
2440 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_z%end, ierr)
2441 proc_coords(3) = proc_coords(3) - 1
2442 end if
2443
2444#ifdef MFC_POST_PROCESS
2445 ! Ghost zone at the beginning
2446 if (proc_coords(3) > 0 .and. format == 1) then
2447 offset_z%beg = 2
2448 else
2449 offset_z%beg = 0
2450 end if
2451
2452 ! Ghost zone at the end
2453 if (proc_coords(3) < num_procs_z - 1 .and. format == 1) then
2454 offset_z%end = 2
2455 else
2456 offset_z%end = 0
2457 end if
2458#endif
2459
2460 ! Beginning and end sub-domain boundary locations
2461 if (parallel_io) then
2462 if (proc_coords(3) < rem_cells) then
2463 start_idx(3) = (p + 1)*proc_coords(3)
2464 else
2465 start_idx(3) = (p + 1)*proc_coords(3) + rem_cells
2466 end if
2467 else
2468#ifdef MFC_PRE_PROCESS
2469 if (old_grid .neqv. .true.) then
2470 dz = (z_domain%end - z_domain%beg)/real(p_glb + 1, wp)
2471
2472 if (proc_coords(3) < rem_cells) then
2473 z_domain%beg = z_domain%beg + dz*real((p + 1)*proc_coords(3))
2474 z_domain%end = z_domain%end - dz*real((p + 1)*(num_procs_z - proc_coords(3) - 1) - (num_procs_z &
2475 & - rem_cells))
2476 else
2477 z_domain%beg = z_domain%beg + dz*real((p + 1)*proc_coords(3) + rem_cells)
2478 z_domain%end = z_domain%end - dz*real((p + 1)*(num_procs_z - proc_coords(3) - 1))
2479 end if
2480 end if
2481#endif
2482 end if
2483
2484 ! 2D Cartesian Processor Topology
2485 else
2486 ! Initial estimate of optimal processor topology
2487 num_procs_x = 1
2488 num_procs_y = num_procs
2489 ierr = -1
2490
2491 ! Benchmarking the quality of this initial guess
2492 tmp_num_procs_x = num_procs_x
2493 tmp_num_procs_y = num_procs_y
2494 fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
2495
2496 ! Optimization of the initial processor topology
2497 do i = 1, num_procs
2498 if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
2499 tmp_num_procs_x = i
2500 tmp_num_procs_y = num_procs/i
2501
2502 if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) .and. (n + 1) &
2503 & /tmp_num_procs_y >= num_stcls_min*recon_order) then
2504 num_procs_x = i
2505 num_procs_y = num_procs/i
2506 fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
2507 ierr = 0
2508 end if
2509 end if
2510 end do
2511
2512 ! Verifying that a valid decomposition of the computational domain has been established. If not, the simulation
2513 ! exits.
2514 if (proc_rank == 0 .and. ierr == -1) then
2515 call s_mpi_abort('Unsupported combination of values ' // 'of num_procs, m, n and ' &
2516 & // 'weno/muscl/igr_order. Exiting.')
2517 end if
2518
2519 ! Creating new communicator using the Cartesian topology
2520 call mpi_cart_create(mpi_comm_world, 2, (/num_procs_x, num_procs_y/), (/.true., .true./), .false., mpi_comm_cart, &
2521 & ierr)
2522
2523 ! Finding the Cartesian coordinates of the local process
2524 call mpi_cart_coords(mpi_comm_cart, proc_rank, 2, proc_coords, ierr)
2525 end if
2526
2527 ! Global Parameters for y-direction
2528
2529 ! Number of remaining cells
2530 rem_cells = mod(n + 1, num_procs_y)
2531
2532 ! Optimal number of cells per processor
2533 n = (n + 1)/num_procs_y - 1
2534
2535 ! Distributing the remaining cells
2536 do i = 1, rem_cells
2537 if (proc_coords(2) == i - 1) then
2538 n = n + 1; exit
2539 end if
2540 end do
2541
2542 ! Boundary condition at the beginning
2543 if (proc_coords(2) > 0 .or. (bc_y%beg == bc_periodic .and. num_procs_y > 1)) then
2544 proc_coords(2) = proc_coords(2) - 1
2545 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_y%beg, ierr)
2546 proc_coords(2) = proc_coords(2) + 1
2547 end if
2548
2549 ! Boundary condition at the end
2550 if (proc_coords(2) < num_procs_y - 1 .or. (bc_y%end == bc_periodic .and. num_procs_y > 1)) then
2551 proc_coords(2) = proc_coords(2) + 1
2552 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_y%end, ierr)
2553 proc_coords(2) = proc_coords(2) - 1
2554 end if
2555
2556#ifdef MFC_POST_PROCESS
2557 ! Ghost zone at the beginning
2558 if (proc_coords(2) > 0 .and. format == 1) then
2559 offset_y%beg = 2
2560 else
2561 offset_y%beg = 0
2562 end if
2563
2564 ! Ghost zone at the end
2565 if (proc_coords(2) < num_procs_y - 1 .and. format == 1) then
2566 offset_y%end = 2
2567 else
2568 offset_y%end = 0
2569 end if
2570#endif
2571
2572 ! Beginning and end sub-domain boundary locations
2573 if (parallel_io) then
2574 if (proc_coords(2) < rem_cells) then
2575 start_idx(2) = (n + 1)*proc_coords(2)
2576 else
2577 start_idx(2) = (n + 1)*proc_coords(2) + rem_cells
2578 end if
2579 else
2580#ifdef MFC_PRE_PROCESS
2581 if (old_grid .neqv. .true.) then
2582 dy = (y_domain%end - y_domain%beg)/real(n_glb + 1, wp)
2583
2584 if (proc_coords(2) < rem_cells) then
2585 y_domain%beg = y_domain%beg + dy*real((n + 1)*proc_coords(2))
2586 y_domain%end = y_domain%end - dy*real((n + 1)*(num_procs_y - proc_coords(2) - 1) - (num_procs_y &
2587 & - rem_cells))
2588 else
2589 y_domain%beg = y_domain%beg + dy*real((n + 1)*proc_coords(2) + rem_cells)
2590 y_domain%end = y_domain%end - dy*real((n + 1)*(num_procs_y - proc_coords(2) - 1))
2591 end if
2592 end if
2593#endif
2594 end if
2595
2596 ! 1D Cartesian Processor Topology
2597 else
2598 ! Optimal processor topology
2599 num_procs_x = num_procs
2600
2601 ! Creating new communicator using the Cartesian topology
2602 call mpi_cart_create(mpi_comm_world, 1, (/num_procs_x/), (/.true./), .false., mpi_comm_cart, ierr)
2603
2604 ! Finding the Cartesian coordinates of the local process
2605 call mpi_cart_coords(mpi_comm_cart, proc_rank, 1, proc_coords, ierr)
2606 end if
2607
2608 ! Global Parameters for x-direction
2609
2610 ! Number of remaining cells
2611 rem_cells = mod(m + 1, num_procs_x)
2612
2613 ! Optimal number of cells per processor
2614 m = (m + 1)/num_procs_x - 1
2615
2616 ! Distributing the remaining cells
2617 do i = 1, rem_cells
2618 if (proc_coords(1) == i - 1) then
2619 m = m + 1; exit
2620 end if
2621 end do
2622
2623 call s_update_cell_bounds(cells_bounds, m, n, p)
2624
2625 ! Boundary condition at the beginning
2626 if (proc_coords(1) > 0 .or. (bc_x%beg == bc_periodic .and. num_procs_x > 1)) then
2627 proc_coords(1) = proc_coords(1) - 1
2628 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_x%beg, ierr)
2629 proc_coords(1) = proc_coords(1) + 1
2630 end if
2631
2632 ! Boundary condition at the end
2633 if (proc_coords(1) < num_procs_x - 1 .or. (bc_x%end == bc_periodic .and. num_procs_x > 1)) then
2634 proc_coords(1) = proc_coords(1) + 1
2635 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_x%end, ierr)
2636 proc_coords(1) = proc_coords(1) - 1
2637 end if
2638
2639#ifdef MFC_POST_PROCESS
2640 ! Ghost zone at the beginning
2641 if (proc_coords(1) > 0 .and. format == 1) then
2642 offset_x%beg = 2
2643 else
2644 offset_x%beg = 0
2645 end if
2646
2647 ! Ghost zone at the end
2648 if (proc_coords(1) < num_procs_x - 1 .and. format == 1) then
2649 offset_x%end = 2
2650 else
2651 offset_x%end = 0
2652 end if
2653#endif
2654
2655 ! Beginning and end sub-domain boundary locations
2656 if (parallel_io) then
2657 if (proc_coords(1) < rem_cells) then
2658 start_idx(1) = (m + 1)*proc_coords(1)
2659 else
2660 start_idx(1) = (m + 1)*proc_coords(1) + rem_cells
2661 end if
2662 else
2663#ifdef MFC_PRE_PROCESS
2664 if (old_grid .neqv. .true.) then
2665 dx = (x_domain%end - x_domain%beg)/real(m_glb + 1, wp)
2666
2667 if (proc_coords(1) < rem_cells) then
2668 x_domain%beg = x_domain%beg + dx*real((m + 1)*proc_coords(1))
2669 x_domain%end = x_domain%end - dx*real((m + 1)*(num_procs_x - proc_coords(1) - 1) - (num_procs_x - rem_cells))
2670 else
2671 x_domain%beg = x_domain%beg + dx*real((m + 1)*proc_coords(1) + rem_cells)
2672 x_domain%end = x_domain%end - dx*real((m + 1)*(num_procs_x - proc_coords(1) - 1))
2673 end if
2674 end if
2675#endif
2676 end if
2677#endif
2678
2680
2681 !> The goal of this procedure is to populate the buffers of the grid variables by communicating with the neighboring processors.
2682 !! Note that only the buffers of the cell-width distributions are handled in such a way. This is because the buffers of
2683 !! cell-boundary locations may be calculated directly from those of the cell-width distributions.
2684#ifndef MFC_PRE_PROCESS
2685 subroutine s_mpi_sendrecv_grid_variables_buffers(mpi_dir, pbc_loc)
2686
2687 integer, intent(in) :: mpi_dir
2688 integer, intent(in) :: pbc_loc
2689
2690#ifdef MFC_MPI
2691 integer :: ierr !< Generic flag used to identify and report MPI errors
2692
2693 if (mpi_dir == 1) then
2694 if (pbc_loc == -1) then ! PBC at the beginning
2695
2696 if (bc_x%end >= 0) then ! PBC at the beginning and end
2697 call mpi_sendrecv(dx(m - buff_size + 1), buff_size, mpi_p, bc_x%end, 0, dx(-buff_size), buff_size, mpi_p, &
2698 & bc_x%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
2699 else ! PBC at the beginning only
2700 call mpi_sendrecv(dx(0), buff_size, mpi_p, bc_x%beg, 1, dx(-buff_size), buff_size, mpi_p, bc_x%beg, 0, &
2701 & mpi_comm_world, mpi_status_ignore, ierr)
2702 end if
2703 else ! PBC at the end
2704 if (bc_x%beg >= 0) then ! PBC at the end and beginning
2705 call mpi_sendrecv(dx(0), buff_size, mpi_p, bc_x%beg, 1, dx(m + 1), buff_size, mpi_p, bc_x%end, 1, &
2706 & mpi_comm_world, mpi_status_ignore, ierr)
2707 else ! PBC at the end only
2708 call mpi_sendrecv(dx(m - buff_size + 1), buff_size, mpi_p, bc_x%end, 0, dx(m + 1), buff_size, mpi_p, &
2709 & bc_x%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
2710 end if
2711 end if
2712 else if (mpi_dir == 2) then
2713 if (pbc_loc == -1) then ! PBC at the beginning
2714
2715 if (bc_y%end >= 0) then ! PBC at the beginning and end
2716 call mpi_sendrecv(dy(n - buff_size + 1), buff_size, mpi_p, bc_y%end, 0, dy(-buff_size), buff_size, mpi_p, &
2717 & bc_y%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
2718 else ! PBC at the beginning only
2719 call mpi_sendrecv(dy(0), buff_size, mpi_p, bc_y%beg, 1, dy(-buff_size), buff_size, mpi_p, bc_y%beg, 0, &
2720 & mpi_comm_world, mpi_status_ignore, ierr)
2721 end if
2722 else ! PBC at the end
2723 if (bc_y%beg >= 0) then ! PBC at the end and beginning
2724 call mpi_sendrecv(dy(0), buff_size, mpi_p, bc_y%beg, 1, dy(n + 1), buff_size, mpi_p, bc_y%end, 1, &
2725 & mpi_comm_world, mpi_status_ignore, ierr)
2726 else ! PBC at the end only
2727 call mpi_sendrecv(dy(n - buff_size + 1), buff_size, mpi_p, bc_y%end, 0, dy(n + 1), buff_size, mpi_p, &
2728 & bc_y%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
2729 end if
2730 end if
2731 else
2732 if (pbc_loc == -1) then ! PBC at the beginning
2733
2734 if (bc_z%end >= 0) then ! PBC at the beginning and end
2735 call mpi_sendrecv(dz(p - buff_size + 1), buff_size, mpi_p, bc_z%end, 0, dz(-buff_size), buff_size, mpi_p, &
2736 & bc_z%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
2737 else ! PBC at the beginning only
2738 call mpi_sendrecv(dz(0), buff_size, mpi_p, bc_z%beg, 1, dz(-buff_size), buff_size, mpi_p, bc_z%beg, 0, &
2739 & mpi_comm_world, mpi_status_ignore, ierr)
2740 end if
2741 else ! PBC at the end
2742 if (bc_z%beg >= 0) then ! PBC at the end and beginning
2743 call mpi_sendrecv(dz(0), buff_size, mpi_p, bc_z%beg, 1, dz(p + 1), buff_size, mpi_p, bc_z%end, 1, &
2744 & mpi_comm_world, mpi_status_ignore, ierr)
2745 else ! PBC at the end only
2746 call mpi_sendrecv(dz(p - buff_size + 1), buff_size, mpi_p, bc_z%end, 0, dz(p + 1), buff_size, mpi_p, &
2747 & bc_z%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
2748 end if
2749 end if
2750 end if
2751#endif
2752
2754#endif
2755
2756 !> Module deallocation and/or disassociation procedures
2758
2759#ifdef MFC_MPI
2760 deallocate (buff_send, buff_recv)
2761#endif
2762
2763 end subroutine s_finalize_mpi_common_module
2764
2765end module m_mpi_common
type(scalar_field), dimension(sys_size), intent(inout) q_cons_vf
integer, intent(in) j
Shared derived types for field data, patch geometry, bubble dynamics, and MPI I/O structures.
Global parameters for the computational domain, fluid properties, and simulation algorithm configurat...
logical, parameter chemistry
Chemistry modeling.
integer sys_size
Number of unknowns in system of eqns.
integer buff_size
Number of ghost cells for boundary condition storage.
type(chemistry_parameters) chem_params
type(cell_num_bounds) cells_bounds
Utility routines for bubble model setup, coordinate transforms, array sampling, and special functions...
MPI communication layer: domain decomposition, halo exchange, reductions, and parallel I/O setup.
impure subroutine s_mpi_abort(prnt, code)
The subroutine terminates the MPI execution environment.
impure subroutine s_initialize_mpi_common_module
Initialize the module.
impure subroutine s_mpi_gather_data(my_vector, counts, gathered_vector, root)
Gather variable-length real vectors from all MPI ranks onto the root process.
impure subroutine s_mpi_barrier
Halts all processes until all have reached barrier.
impure subroutine s_mpi_initialize
Initialize the MPI execution environment and query the number of processors and local rank.
impure subroutine s_mpi_allreduce_vectors_sum(var_loc, var_glb, num_vectors, vector_length)
Reduce an array of vectors to their global sums across all MPI ranks.
real(wp), dimension(:), allocatable, private buff_recv
Primitive variable receive buffer for halo exchange.
impure subroutine s_initialize_mpi_data(q_cons_vf, ib_markers, beta)
Set up MPI I/O data views and variable pointers for parallel file output.
impure subroutine s_mpi_reduce_maxloc(var_loc)
Reduce a 2-element variable to its global maximum value with the owning processor rank (MPI_MAXLOC)....
subroutine s_mpi_sendrecv_grid_variables_buffers(mpi_dir, pbc_loc)
The goal of this procedure is to populate the buffers of the grid variables by communicating with the...
impure subroutine s_mpi_reduce_stability_criteria_extrema(icfl_max_loc, vcfl_max_loc, rc_min_loc, icfl_max_glb, vcfl_max_glb, rc_min_glb)
The goal of this subroutine is to determine the global extrema of the stability criteria in the compu...
impure subroutine s_mpi_allreduce_sum(var_loc, var_glb)
Reduce a local real value to its global sum across all MPI ranks.
real(wp), dimension(:), allocatable, private buff_send
Primitive variable send buffer for halo exchange.
impure subroutine s_mpi_allreduce_min(var_loc, var_glb)
Reduce a local real value to its global minimum across all MPI ranks.
impure subroutine s_prohibit_abort(condition, message)
Print a case file error with the prohibited condition and message, then abort execution.
impure subroutine s_mpi_finalize
The subroutine finalizes the MPI execution environment.
subroutine s_initialize_mpi_data_ds(q_cons_vf)
Set up MPI I/O data views for downsampled (coarsened) parallel file output.
impure subroutine s_mpi_allreduce_max(var_loc, var_glb)
Reduce a local real value to its global maximum across all MPI ranks.
impure subroutine s_mpi_allreduce_integer_sum(var_loc, var_glb)
Reduce a local integer value to its global sum across all MPI ranks.
subroutine s_mpi_sendrecv_variables_buffers(q_comm, mpi_dir, pbc_loc, nvar, pb_in, mv_in, q_t_sf)
The goal of this procedure is to populate the buffers of the cell-average conservative variables by c...
integer, private v_size
impure subroutine mpi_bcast_time_step_values(proc_time, time_avg)
Gather per-rank time step wall-clock times onto rank 0 for performance reporting.
impure subroutine s_mpi_reduce_min(var_loc)
Reduce a local real value to its global minimum across all ranks.
impure subroutine s_finalize_mpi_common_module
Module deallocation and/or disassociation procedures.
integer(kind=8) halo_size
subroutine s_mpi_decompose_computational_domain
Decompose the computational domain among processors by balancing cells per rank in each coordinate di...
NVIDIA NVTX profiling API bindings for GPU performance instrumentation.
Definition m_nvtx.f90:6