MFC
Exascale flow solver
Loading...
Searching...
No Matches
m_mpi_common.fpp.f90
Go to the documentation of this file.
1# 1 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2!>
3!! @file
4!! @brief Contains module m_mpi_common
5
6# 1 "/home/runner/work/MFC/MFC/src/common/include/case.fpp" 1
7! This file exists so that Fypp can be run without generating case.fpp files for
8! each target. This is useful when generating documentation, for example. This
9! should also let MFC be built with CMake directly, without invoking mfc.sh.
10
11! For pre-process.
12# 8 "/home/runner/work/MFC/MFC/src/common/include/case.fpp"
13
14! For moving immersed boundaries in simulation
15# 12 "/home/runner/work/MFC/MFC/src/common/include/case.fpp"
16# 6 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp" 2
17# 1 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp" 1
18# 1 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 1
19# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
20# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
21# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
22# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
23# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
24# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
25
26# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
27# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
28# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
29
30# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
31
32# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
33
34# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
35
36# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
37
38# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
39
40# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
41
42# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
43! New line at end of file is required for FYPP
44# 2 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
45# 1 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp" 1
46# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
47# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
48# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
49# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
50# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
51# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
52
53# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
54# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
55# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
56
57# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
58
59# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
60
61# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
62
63# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
64
65# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
66
67# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
68
69# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
70! New line at end of file is required for FYPP
71# 2 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp" 2
72
73# 4 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
74# 5 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
75# 6 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
76# 7 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
77# 8 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
78
79# 20 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
80
81# 43 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
82
83# 48 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
84
85# 53 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
86
87# 58 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
88
89# 63 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
90
91# 68 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
92
93# 76 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
94
95# 81 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
96
97# 86 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
98
99# 91 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
100
101# 96 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
102
103# 101 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
104
105# 106 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
106
107# 111 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
108
109# 116 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
110
111# 121 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
112
113# 151 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
114
115# 192 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
116
117# 206 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
118
119# 231 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
120
121# 242 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
122
123# 244 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
124# 255 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
125
126# 284 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
127
128# 294 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
129
130# 304 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
131
132# 313 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
133
134# 330 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
135
136# 340 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
137
138# 347 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
139
140# 353 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
141
142# 359 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
143
144# 365 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
145
146# 371 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
147
148# 377 "/home/runner/work/MFC/MFC/src/common/include/omp_macros.fpp"
149! New line at end of file is required for FYPP
150# 3 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
151# 1 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp" 1
152# 1 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp" 1
153# 2 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
154# 3 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
155# 4 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
156# 5 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
157# 6 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
158
159# 8 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
160# 9 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
161# 10 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
162
163# 17 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
164
165# 46 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
166
167# 58 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
168
169# 68 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
170
171# 98 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
172
173# 110 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
174
175# 120 "/home/runner/work/MFC/MFC/src/common/include/shared_parallel_macros.fpp"
176! New line at end of file is required for FYPP
177# 2 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp" 2
178
179# 7 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
180
181# 17 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
182
183# 22 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
184
185# 27 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
186
187# 32 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
188
189# 37 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
190
191# 42 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
192
193# 47 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
194
195# 52 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
196
197# 57 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
198
199# 62 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
200
201# 73 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
202
203# 78 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
204
205# 83 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
206
207# 88 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
208
209# 103 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
210
211# 131 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
212
213# 160 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
214
215# 175 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
216
217# 193 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
218
219# 215 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
220
221# 244 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
222
223# 259 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
224
225# 269 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
226
227# 278 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
228
229# 294 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
230
231# 304 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
232
233# 311 "/home/runner/work/MFC/MFC/src/common/include/acc_macros.fpp"
234! New line at end of file is required for FYPP
235# 4 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp" 2
236
237! GPU parallel region (scalar reductions, maxval/minval)
238# 23 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
239
240! GPU parallel loop over threads (most common GPU macro)
241# 43 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
242
243! Required closing for GPU_PARALLEL_LOOP
244# 55 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
245
246! Mark routine for device compilation
247# 112 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
248
249! Declare device-resident data
250# 130 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
251
252! Inner loop within a GPU parallel region
253# 145 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
254
255! Scoped GPU data region
256# 164 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
257
258! Host code with device pointers (for MPI with GPU buffers)
259# 193 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
260
261! Allocate device memory (unscoped)
262# 207 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
263
264! Free device memory
265# 219 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
266
267! Atomic operation on device
268# 231 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
269
270! End atomic capture block
271# 242 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
272
273! Copy data between host and device
274# 254 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
275
276! Synchronization barrier
277# 266 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
278
279! Import GPU library module (openacc or omp_lib)
280# 275 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
281
282! Emit code only for AMD compiler
283# 282 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
284
285! Emit code for non-Cray compilers
286# 289 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
287
288! Emit code only for Cray compiler
289# 296 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
290
291! Emit code for non-NVIDIA compilers
292# 303 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
293
294# 305 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
295# 306 "/home/runner/work/MFC/MFC/src/common/include/parallel_macros.fpp"
296! New line at end of file is required for FYPP
297# 2 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp" 2
298
299# 14 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
300
301! Caution: This macro requires the use of a binding script to set CUDA_VISIBLE_DEVICES, such that we have one GPU device per MPI
302! rank. That's because for both cudaMemAdvise (preferred location) and cudaMemPrefetchAsync we use location = device_id = 0. For an
303! example see misc/nvidia_uvm/bind.sh. NVIDIA unified memory page placement hint
304# 57 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
305
306! Allocate and create GPU device memory
307# 77 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
308
309! Free GPU device memory and deallocate
310# 85 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
311
312! Cray-specific GPU pointer setup for vector fields
313# 109 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
314
315! Cray-specific GPU pointer setup for scalar fields
316# 125 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
317
318! Cray-specific GPU pointer setup for acoustic source spatials
319# 150 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
320
321# 156 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
322
323# 163 "/home/runner/work/MFC/MFC/src/common/include/macros.fpp"
324! New line at end of file is required for FYPP
325# 7 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp" 2
326
327!> @brief MPI communication layer: domain decomposition, halo exchange, reductions, and parallel I/O setup
329
330#ifdef MFC_MPI
331 use mpi !< message passing interface (mpi) module
332#endif
333
336 use m_helper
337 use ieee_arithmetic
338 use m_nvtx
340
341 implicit none
342
343 integer, private :: v_size
344
345# 25 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
346#if defined(MFC_OpenACC)
347# 25 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
348!$acc declare create(v_size)
349# 25 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
350#elif defined(MFC_OpenMP)
351# 25 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
352!$omp declare target (v_size)
353# 25 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
354#endif
355
356 real(wp), private, allocatable, dimension(:) :: buff_send !< Primitive variable send buffer for halo exchange
357 real(wp), private, allocatable, dimension(:) :: buff_recv !< Primitive variable receive buffer for halo exchange
358#ifndef __NVCOMPILER_GPU_UNIFIED_MEM
359
360# 30 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
361#if defined(MFC_OpenACC)
362# 30 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
363!$acc declare create(buff_send, buff_recv)
364# 30 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
365#elif defined(MFC_OpenMP)
366# 30 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
367!$omp declare target (buff_send, buff_recv)
368# 30 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
369#endif
370#endif
371
372 integer(kind=8) :: halo_size
373
374# 34 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
375#if defined(MFC_OpenACC)
376# 34 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
377!$acc declare create(halo_size)
378# 34 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
379#elif defined(MFC_OpenMP)
380# 34 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
381!$omp declare target (halo_size)
382# 34 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
383#endif
384
385contains
386
387 !> Initialize the module.
389
390#ifdef MFC_MPI
391 ! Allocating buff_send/recv and. Please note that for the sake of simplicity, both variables are provided sufficient storage
392 ! to hold the largest buffer in the computational domain.
393
394 if (qbmm .and. .not. polytropic) then
395 v_size = sys_size + 2*nb*nnode
396 else if (chemistry .and. chem_params%diffusion) then
397 v_size = sys_size + 1
398 else
399 v_size = sys_size
400 end if
401
402 if (n > 0) then
403 if (p > 0) then
404 halo_size = nint(-1._wp + 1._wp*buff_size*(v_size)*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)*(p + 2*buff_size &
405 & + 1)/(cells_bounds%mnp_min + 2*buff_size + 1))
406 else
407 halo_size = -1 + buff_size*(v_size)*(cells_bounds%mn_max + 2*buff_size + 1)
408 end if
409 else
410 halo_size = -1 + buff_size*(v_size)
411 end if
412
413
414# 64 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
415#if defined(MFC_OpenACC)
416# 64 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
417!$acc update device(halo_size, v_size)
418# 64 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
419#elif defined(MFC_OpenMP)
420# 64 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
421!$omp target update to(halo_size, v_size)
422# 64 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
423#endif
424
425#ifndef __NVCOMPILER_GPU_UNIFIED_MEM
426#ifdef MFC_DEBUG
427# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
428 block
429# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
430 use iso_fortran_env, only: output_unit
431# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
432
433# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
434 print *, 'm_mpi_common.fpp:67: ', '@:ALLOCATE(buff_send(0:halo_size), buff_recv(0:halo_size))'
435# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
436
437# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
438 call flush (output_unit)
439# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
440 end block
441# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
442#endif
443# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
444 allocate (buff_send(0:halo_size), buff_recv(0:halo_size))
445# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
446
447# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
448
449# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
450
451# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
452#if defined(MFC_OpenACC)
453# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
454!$acc enter data create(buff_send, buff_recv)
455# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
456#elif defined(MFC_OpenMP)
457# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
458!$omp target enter data map(always,alloc:buff_send, buff_recv)
459# 67 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
460#endif
461#else
462 allocate (buff_send(0:halo_size), buff_recv(0:halo_size))
463
464# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
465#if defined(MFC_OpenACC)
466# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
467!$acc enter data create(capture:buff_send)
468# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
469#elif defined(MFC_OpenMP)
470# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
471!$omp target enter data map(always,alloc:capture:buff_send)
472# 70 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
473#endif
474
475# 71 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
476#if defined(MFC_OpenACC)
477# 71 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
478!$acc enter data create(capture:buff_recv)
479# 71 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
480#elif defined(MFC_OpenMP)
481# 71 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
482!$omp target enter data map(always,alloc:capture:buff_recv)
483# 71 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
484#endif
485#endif
486#endif
487
488 end subroutine s_initialize_mpi_common_module
489
490 !> Initialize the MPI execution environment and query the number of processors and local rank.
491 impure subroutine s_mpi_initialize
492
493#ifdef MFC_MPI
494 integer :: ierr !< Generic flag used to identify and report MPI errors
495
496 call mpi_init(ierr)
497
498 if (ierr /= mpi_success) then
499 print '(A)', 'Unable to initialize MPI environment. Exiting.'
500 call mpi_abort(mpi_comm_world, 1, ierr)
501 end if
502
503 call mpi_comm_size(mpi_comm_world, num_procs, ierr)
504
505 call mpi_comm_rank(mpi_comm_world, proc_rank, ierr)
506#else
507 num_procs = 1
508 proc_rank = 0
509#endif
510
511
512# 98 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
513#if defined(MFC_OpenACC)
514# 98 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
515!$acc update device(num_procs, proc_rank)
516# 98 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
517#elif defined(MFC_OpenMP)
518# 98 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
519!$omp target update to(num_procs, proc_rank)
520# 98 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
521#endif
522
523 end subroutine s_mpi_initialize
524
525 !> Set up MPI I/O data views and variable pointers for parallel file output.
526 impure subroutine s_initialize_mpi_data(q_cons_vf, ib_markers, beta)
527
528 type(scalar_field), dimension(sys_size), intent(in) :: q_cons_vf
529 type(integer_field), optional, intent(in) :: ib_markers
530 type(scalar_field), intent(in), optional :: beta
531 integer, dimension(num_dims) :: sizes_glb, sizes_loc
532
533#ifdef MFC_MPI
534 integer :: i, j
535 integer :: ierr !< Generic flag used to identify and report MPI errors
536 integer :: alt_sys
537
538 if (present(beta)) then
539 alt_sys = sys_size + 1
540 else
541 alt_sys = sys_size
542 end if
543
544 do i = 1, sys_size
545 mpi_io_data%var(i)%sf => q_cons_vf(i)%sf(0:m,0:n,0:p)
546 end do
547
548 if (present(beta)) then
549 mpi_io_data%var(alt_sys)%sf => beta%sf(0:m,0:n,0:p)
550 end if
551
552 ! Additional variables pb and mv for non-polytropic qbmm
553 if (qbmm .and. .not. polytropic) then
554 do i = 1, nb
555 do j = 1, nnode
556#ifdef MFC_PRE_PROCESS
557 mpi_io_data%var(sys_size + (i - 1)*nnode + j)%sf => pb%sf(0:m,0:n,0:p,j, i)
558 mpi_io_data%var(sys_size + (i - 1)*nnode + j + nb*nnode)%sf => mv%sf(0:m,0:n,0:p,j, i)
559#elif defined (MFC_SIMULATION)
560 mpi_io_data%var(sys_size + (i - 1)*nnode + j)%sf => pb_ts(1)%sf(0:m,0:n,0:p,j, i)
561 mpi_io_data%var(sys_size + (i - 1)*nnode + j + nb*nnode)%sf => mv_ts(1)%sf(0:m,0:n,0:p,j, i)
562#endif
563 end do
564 end do
565 end if
566
567 ! Define global(g) and local(l) sizes for flow variables
568 sizes_glb(1) = m_glb + 1; sizes_loc(1) = m + 1
569 if (n > 0) then
570 sizes_glb(2) = n_glb + 1; sizes_loc(2) = n + 1
571 if (p > 0) then
572 sizes_glb(num_dims) = p_glb + 1; sizes_loc(num_dims) = p + 1
573 end if
574 end if
575
576 ! Define the view for each variable
577 do i = 1, alt_sys
578 call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_p, &
579 & mpi_io_data%view(i), ierr)
580 call mpi_type_commit(mpi_io_data%view(i), ierr)
581 end do
582
583#ifndef MFC_POST_PROCESS
584 if (qbmm .and. .not. polytropic) then
585 do i = sys_size + 1, sys_size + 2*nb*nnode
586 call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_p, &
587 & mpi_io_data%view(i), ierr)
588 call mpi_type_commit(mpi_io_data%view(i), ierr)
589 end do
590 end if
591#endif
592
593#ifndef MFC_PRE_PROCESS
594 if (present(ib_markers)) then
595 mpi_io_ib_data%var%sf => ib_markers%sf(0:m,0:n,0:p)
596
597 call mpi_type_create_subarray(num_dims, sizes_glb, sizes_loc, start_idx, mpi_order_fortran, mpi_integer, &
598 & mpi_io_ib_data%view, ierr)
599 call mpi_type_commit(mpi_io_ib_data%view, ierr)
600 end if
601#endif
602#endif
603
604 end subroutine s_initialize_mpi_data
605
606 !> Set up MPI I/O data views for downsampled (coarsened) parallel file output.
607 subroutine s_initialize_mpi_data_ds(q_cons_vf)
608
609 type(scalar_field), dimension(sys_size), intent(in) :: q_cons_vf
610 integer, dimension(num_dims) :: sizes_loc
611 integer, dimension(3) :: sf_start_idx
612
613#ifdef MFC_MPI
614 integer :: i, m_ds, n_ds, p_ds, ierr
615
616 sf_start_idx = (/0, 0, 0/)
617
618#ifndef MFC_POST_PROCESS
619 m_ds = int((m + 1)/3) - 1
620 n_ds = int((n + 1)/3) - 1
621 p_ds = int((p + 1)/3) - 1
622#else
623 m_ds = m
624 n_ds = n
625 p_ds = p
626#endif
627
628#ifdef MFC_POST_PROCESS
629 do i = 1, sys_size
630 mpi_io_data%var(i)%sf => q_cons_vf(i)%sf(-1:m_ds + 1,-1:n_ds + 1,-1:p_ds + 1)
631 end do
632#endif
633 ! Define global(g) and local(l) sizes for flow variables
634 sizes_loc(1) = m_ds + 3
635 if (n > 0) then
636 sizes_loc(2) = n_ds + 3
637 if (p > 0) then
638 sizes_loc(num_dims) = p_ds + 3
639 end if
640 end if
641
642 ! Define the view for each variable
643 do i = 1, sys_size
644 call mpi_type_create_subarray(num_dims, sizes_loc, sizes_loc, sf_start_idx, mpi_order_fortran, mpi_p, &
645 & mpi_io_data%view(i), ierr)
646 call mpi_type_commit(mpi_io_data%view(i), ierr)
647 end do
648#endif
649
650 end subroutine s_initialize_mpi_data_ds
651
652 !> Gather variable-length real vectors from all MPI ranks onto the root process.
653 impure subroutine s_mpi_gather_data(my_vector, counts, gathered_vector, root)
654
655 integer, intent(in) :: counts !< Array of vector lengths for each process
656 real(wp), intent(in), dimension(counts) :: my_vector !< Input vector on each process
657 integer, intent(in) :: root !< Rank of the root process
658 real(wp), allocatable, intent(out) :: gathered_vector(:) !< Gathered vector on the root process
659 integer :: i
660 integer :: ierr !< Generic flag used to identify and report MPI errors
661 integer, allocatable :: recounts(:), displs(:)
662
663#ifdef MFC_MPI
664 allocate (recounts(num_procs))
665
666 call mpi_gather(counts, 1, mpi_integer, recounts, 1, mpi_integer, root, mpi_comm_world, ierr)
667
668 allocate (displs(size(recounts)))
669
670 displs(1) = 0
671
672 do i = 2, size(recounts)
673 displs(i) = displs(i - 1) + recounts(i - 1)
674 end do
675
676 allocate (gathered_vector(sum(recounts)))
677 call mpi_gatherv(my_vector, counts, mpi_p, gathered_vector, recounts, displs, mpi_p, root, mpi_comm_world, ierr)
678#endif
679
680 end subroutine s_mpi_gather_data
681
682 !> Gather per-rank time step wall-clock times onto rank 0 for performance reporting.
683 impure subroutine mpi_bcast_time_step_values(proc_time, time_avg)
684
685 real(wp), dimension(0:num_procs - 1), intent(inout) :: proc_time
686 real(wp), intent(inout) :: time_avg
687
688#ifdef MFC_MPI
689 integer :: ierr !< Generic flag used to identify and report MPI errors
690
691 call mpi_gather(time_avg, 1, mpi_p, proc_time(0), 1, mpi_p, 0, mpi_comm_world, ierr)
692#endif
693
694 end subroutine mpi_bcast_time_step_values
695
696 !> Print a case file error with the prohibited condition and message, then abort execution.
697 impure subroutine s_prohibit_abort(condition, message)
698
699 character(len=*), intent(in) :: condition, message
700
701 print *, ""
702 print *, "CASE FILE ERROR"
703 print *, " - Prohibited condition: ", trim(condition)
704 if (len_trim(message) > 0) then
705 print *, " - Note: ", trim(message)
706 end if
707 print *, ""
708 call s_mpi_abort(code=case_file_error_code)
709
710 end subroutine s_prohibit_abort
711
712 !> The goal of this subroutine is to determine the global extrema of the stability criteria in the computational domain. This is
713 !! performed by sifting through the local extrema of each stability criterion. Note that each of the local extrema is from a
714 !! single process, within its assigned section of the computational domain. Finally, note that the global extrema values are
715 !! only bookkeept on the rank 0 processor.
716 impure subroutine s_mpi_reduce_stability_criteria_extrema(icfl_max_loc, vcfl_max_loc, Rc_min_loc, icfl_max_glb, vcfl_max_glb, &
717
718 & Rc_min_glb)
719
720 real(wp), intent(in) :: icfl_max_loc
721 real(wp), intent(in) :: vcfl_max_loc
722 real(wp), intent(in) :: rc_min_loc
723 real(wp), intent(out) :: icfl_max_glb
724 real(wp), intent(out) :: vcfl_max_glb
725 real(wp), intent(out) :: rc_min_glb
726
727 icfl_max_glb = icfl_max_loc
728 vcfl_max_glb = vcfl_max_loc
729 rc_min_glb = rc_min_loc
730
731#ifdef MFC_SIMULATION
732#ifdef MFC_MPI
733 block
734 integer :: ierr
735
736 call mpi_reduce(icfl_max_loc, icfl_max_glb, 1, mpi_p, mpi_max, 0, mpi_comm_world, ierr)
737
738 if (viscous) then
739 call mpi_reduce(vcfl_max_loc, vcfl_max_glb, 1, mpi_p, mpi_max, 0, mpi_comm_world, ierr)
740 call mpi_reduce(rc_min_loc, rc_min_glb, 1, mpi_p, mpi_min, 0, mpi_comm_world, ierr)
741 end if
742 end block
743#else
744 icfl_max_glb = icfl_max_loc
745
746 if (viscous) then
747 vcfl_max_glb = vcfl_max_loc
748 rc_min_glb = rc_min_loc
749 end if
750#endif
751#endif
752
754
755 !> Reduce a local real value to its global sum across all MPI ranks.
756 impure subroutine s_mpi_allreduce_sum(var_loc, var_glb)
757
758 real(wp), intent(in) :: var_loc
759 real(wp), intent(out) :: var_glb
760
761#ifdef MFC_MPI
762 integer :: ierr !< Generic flag used to identify and report MPI errors
763
764 call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_sum, mpi_comm_world, ierr)
765#endif
766
767 end subroutine s_mpi_allreduce_sum
768
769 !> Reduce an array of vectors to their global sums across all MPI ranks.
770 impure subroutine s_mpi_allreduce_vectors_sum(var_loc, var_glb, num_vectors, vector_length)
771
772 integer, intent(in) :: num_vectors, vector_length
773 real(wp), dimension(:,:), intent(in) :: var_loc
774 real(wp), dimension(:,:), intent(inout) :: var_glb
775
776#ifdef MFC_MPI
777 integer :: ierr !< Generic flag used to identify and report MPI errors
778
779 if (loc(var_loc) == loc(var_glb)) then
780 call mpi_allreduce(mpi_in_place, var_glb, num_vectors*vector_length, mpi_p, mpi_sum, mpi_comm_world, ierr)
781 else
782 call mpi_allreduce(var_loc, var_glb, num_vectors*vector_length, mpi_p, mpi_sum, mpi_comm_world, ierr)
783 end if
784#else
785 var_glb(1:num_vectors,1:vector_length) = var_loc(1:num_vectors,1:vector_length)
786#endif
787
788 end subroutine s_mpi_allreduce_vectors_sum
789
790 !> Reduce a local integer value to its global sum across all MPI ranks.
791 impure subroutine s_mpi_allreduce_integer_sum(var_loc, var_glb)
792
793 integer, intent(in) :: var_loc
794 integer, intent(out) :: var_glb
795
796#ifdef MFC_MPI
797 integer :: ierr !< Generic flag used to identify and report MPI errors
798
799 call mpi_allreduce(var_loc, var_glb, 1, mpi_integer, mpi_sum, mpi_comm_world, ierr)
800#else
801 var_glb = var_loc
802#endif
803
804 end subroutine s_mpi_allreduce_integer_sum
805
806 !> Reduce a local real value to its global minimum across all MPI ranks.
807 impure subroutine s_mpi_allreduce_min(var_loc, var_glb)
808
809 real(wp), intent(in) :: var_loc
810 real(wp), intent(out) :: var_glb
811
812#ifdef MFC_MPI
813 integer :: ierr !< Generic flag used to identify and report MPI errors
814
815 call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_min, mpi_comm_world, ierr)
816#endif
817
818 end subroutine s_mpi_allreduce_min
819
820 !> Reduce a local real value to its global maximum across all MPI ranks.
821 impure subroutine s_mpi_allreduce_max(var_loc, var_glb)
822
823 real(wp), intent(in) :: var_loc
824 real(wp), intent(out) :: var_glb
825
826#ifdef MFC_MPI
827 integer :: ierr !< Generic flag used to identify and report MPI errors
828
829 call mpi_allreduce(var_loc, var_glb, 1, mpi_p, mpi_max, mpi_comm_world, ierr)
830#endif
831
832 end subroutine s_mpi_allreduce_max
833
834 !> Reduce a local real value to its global minimum across all ranks
835 impure subroutine s_mpi_reduce_min(var_loc)
836
837 real(wp), intent(inout) :: var_loc
838
839#ifdef MFC_MPI
840 integer :: ierr !< Generic flag used to identify and report MPI errors
841 real(wp) :: var_glb
842
843 call mpi_reduce(var_loc, var_glb, 1, mpi_p, mpi_min, 0, mpi_comm_world, ierr)
844
845 call mpi_bcast(var_glb, 1, mpi_p, 0, mpi_comm_world, ierr)
846
847 var_loc = var_glb
848#endif
849
850 end subroutine s_mpi_reduce_min
851
852 !> Reduce a 2-element variable to its global maximum value with the owning processor rank (MPI_MAXLOC).
853 !> Reduce a local value to its global maximum with location (rank) across all ranks
854 impure subroutine s_mpi_reduce_maxloc(var_loc)
855
856 real(wp), dimension(2), intent(inout) :: var_loc
857
858#ifdef MFC_MPI
859 integer :: ierr !< Generic flag used to identify and report MPI errors
860 real(wp), dimension(2) :: var_glb !< Reduced (max value, rank) pair
861 call mpi_reduce(var_loc, var_glb, 1, mpi_2p, mpi_maxloc, 0, mpi_comm_world, ierr)
862
863 call mpi_bcast(var_glb, 1, mpi_2p, 0, mpi_comm_world, ierr)
864
865 var_loc = var_glb
866#endif
867
868 end subroutine s_mpi_reduce_maxloc
869
870 !> The subroutine terminates the MPI execution environment.
871 impure subroutine s_mpi_abort(prnt, code)
872
873 character(len=*), intent(in), optional :: prnt
874 integer, intent(in), optional :: code
875
876#ifdef MFC_MPI
877 integer :: ierr !< Generic flag used to identify and report MPI errors
878#endif
879
880 if (present(prnt)) then
881 print *, prnt
882 call flush (6)
883 end if
884
885#ifndef MFC_MPI
886 if (present(code)) then
887 stop code
888 else
889 stop 1
890 end if
891#else
892 if (present(code)) then
893 call mpi_abort(mpi_comm_world, code, ierr)
894 else
895 call mpi_abort(mpi_comm_world, 1, ierr)
896 end if
897#endif
898
899 end subroutine s_mpi_abort
900
901 !> Halts all processes until all have reached barrier.
902 impure subroutine s_mpi_barrier
903
904#ifdef MFC_MPI
905 integer :: ierr !< Generic flag used to identify and report MPI errors
906
907 call mpi_barrier(mpi_comm_world, ierr)
908#endif
909
910 end subroutine s_mpi_barrier
911
912 !> The subroutine finalizes the MPI execution environment.
913 impure subroutine s_mpi_finalize
914
915#ifdef MFC_MPI
916 integer :: ierr !< Generic flag used to identify and report MPI errors
917
918 call mpi_finalize(ierr)
919#endif
920
921 end subroutine s_mpi_finalize
922
923 !> The goal of this procedure is to populate the buffers of the cell-average conservative variables by communicating with the
924 !! neighboring processors.
925 subroutine s_mpi_sendrecv_variables_buffers(q_comm, mpi_dir, pbc_loc, nVar, pb_in, mv_in, q_T_sf)
926
927 type(scalar_field), dimension(1:), intent(inout) :: q_comm
928 real(stp), optional, dimension(idwbuff(1)%beg:,idwbuff(2)%beg:,idwbuff(3)%beg:,1:,1:), intent(inout) :: pb_in, mv_in
929 integer, intent(in) :: mpi_dir, pbc_loc, nVar
930 integer :: i, j, k, l, r, q !< Generic loop iterators
931 integer :: buffer_counts(1:3), buffer_count
932 type(int_bounds_info) :: boundary_conditions(1:3)
933 integer :: beg_end(1:2), grid_dims(1:3)
934 integer :: dst_proc, src_proc, recv_tag, send_tag
935 logical :: beg_end_geq_0, qbmm_comm, chem_diff_comm
936 integer :: pack_offset, unpack_offset
937 type(scalar_field), optional, intent(inout) :: q_T_sf
938
939#ifdef MFC_MPI
940 integer :: ierr !< Generic flag used to identify and report MPI errors
941
942 call nvtxstartrange("RHS-COMM-PACKBUF")
943
944 qbmm_comm = .false.
945 chem_diff_comm = .false.
946
947 if (present(pb_in) .and. present(mv_in) .and. qbmm .and. .not. polytropic) then
948 qbmm_comm = .true.
949 v_size = nvar + 2*nb*nnode
950 buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
951 & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
952 else if (present(q_t_sf) .and. chemistry .and. chem_params%diffusion) then
953 chem_diff_comm = .true.
954 v_size = nvar + 1
955 buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
956 & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
957 else
958 v_size = nvar
959 buffer_counts = (/buff_size*v_size*(n + 1)*(p + 1), buff_size*v_size*(m + 2*buff_size + 1)*(p + 1), &
960 & buff_size*v_size*(m + 2*buff_size + 1)*(n + 2*buff_size + 1)/)
961 end if
962
963
964# 540 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
965#if defined(MFC_OpenACC)
966# 540 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
967!$acc update device(v_size)
968# 540 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
969#elif defined(MFC_OpenMP)
970# 540 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
971!$omp target update to(v_size)
972# 540 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
973#endif
974
975 buffer_count = buffer_counts(mpi_dir)
976 boundary_conditions = (/bc_x, bc_y, bc_z/)
977 beg_end = (/boundary_conditions(mpi_dir)%beg, boundary_conditions(mpi_dir)%end/)
978 beg_end_geq_0 = beg_end(max(pbc_loc, 0) - pbc_loc + 1) >= 0
979
980 ! Implements: pbc_loc bc_x >= 0 -> [send/recv]_tag [dst/src]_proc -1 (=0) 0 -> [1,0] [0,0] | 0 0 [1,0] [beg,beg] -1 (=0) 1
981 ! -> [0,0] [1,0] | 0 1 [0,0] [end,beg] +1 (=1) 0 -> [0,1] [1,1] | 1 0 [0,1] [end,end] +1 (=1) 1 -> [1,1] [0,1] | 1 1 [1,1]
982 ! [beg,end]
983
984 send_tag = f_logical_to_int(.not. f_xor(beg_end_geq_0, pbc_loc == 1))
985 recv_tag = f_logical_to_int(pbc_loc == 1)
986
987 dst_proc = beg_end(1 + f_logical_to_int(f_xor(pbc_loc == 1, beg_end_geq_0)))
988 src_proc = beg_end(1 + f_logical_to_int(pbc_loc == 1))
989
990 grid_dims = (/m, n, p/)
991
992 pack_offset = 0
993 if (f_xor(pbc_loc == 1, beg_end_geq_0)) then
994 pack_offset = grid_dims(mpi_dir) - buff_size + 1
995 end if
996
997 unpack_offset = 0
998 if (pbc_loc == 1) then
999 unpack_offset = grid_dims(mpi_dir) + buff_size + 1
1000 end if
1001
1002 ! Pack Buffer to Send
1003# 571 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1004 if (mpi_dir == 1) then
1005# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1006
1007# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1008
1009# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1010#if defined(MFC_OpenACC)
1011# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1012!$acc parallel loop collapse(4) gang vector default(present) private(r)
1013# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1014#elif defined(MFC_OpenMP)
1015# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1016
1017# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1018
1019# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1020
1021# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1022!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1023# 573 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1024#endif
1025 do l = 0, p
1026 do k = 0, n
1027 do j = 0, buff_size - 1
1028 do i = 1, nvar
1029 r = (i - 1) + v_size*(j + buff_size*(k + (n + 1)*l))
1030 buff_send(r) = real(q_comm(i)%sf(j + pack_offset, k, l), kind=wp)
1031 end do
1032 end do
1033 end do
1034 end do
1035
1036# 584 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1037#if defined(MFC_OpenACC)
1038# 584 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1039!$acc end parallel loop
1040# 584 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1041#elif defined(MFC_OpenMP)
1042# 584 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1043
1044# 584 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1045!$omp end target teams loop
1046# 584 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1047#endif
1048
1049 if (chem_diff_comm) then
1050
1051# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1052
1053# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1054#if defined(MFC_OpenACC)
1055# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1056!$acc parallel loop collapse(3) gang vector default(present) private(r)
1057# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1058#elif defined(MFC_OpenMP)
1059# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1060
1061# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1062
1063# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1064
1065# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1066!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1067# 587 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1068#endif
1069 do l = 0, p
1070 do k = 0, n
1071 do j = 0, buff_size - 1
1072 r = nvar + v_size*(j + buff_size*(k + (n + 1)*l))
1073 buff_send(r) = real(q_t_sf%sf(j + pack_offset, k, l), kind=wp)
1074 end do
1075 end do
1076 end do
1077
1078# 596 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1079#if defined(MFC_OpenACC)
1080# 596 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1081!$acc end parallel loop
1082# 596 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1083#elif defined(MFC_OpenMP)
1084# 596 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1085
1086# 596 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1087!$omp end target teams loop
1088# 596 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1089#endif
1090 end if
1091
1092 if (qbmm_comm) then
1093
1094# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1095
1096# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1097#if defined(MFC_OpenACC)
1098# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1099!$acc parallel loop collapse(4) gang vector default(present) private(r)
1100# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1101#elif defined(MFC_OpenMP)
1102# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1103
1104# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1105
1106# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1107
1108# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1109!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1110# 600 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1111#endif
1112 do l = 0, p
1113 do k = 0, n
1114 do j = 0, buff_size - 1
1115 do i = nvar + 1, nvar + nnode
1116 do q = 1, nb
1117 r = (i - 1) + (q - 1)*nnode + v_size*(j + buff_size*(k + (n + 1)*l))
1118 buff_send(r) = real(pb_in(j + pack_offset, k, l, i - nvar, q), kind=wp)
1119 end do
1120 end do
1121 end do
1122 end do
1123 end do
1124
1125# 613 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1126#if defined(MFC_OpenACC)
1127# 613 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1128!$acc end parallel loop
1129# 613 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1130#elif defined(MFC_OpenMP)
1131# 613 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1132
1133# 613 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1134!$omp end target teams loop
1135# 613 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1136#endif
1137
1138
1139# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1140
1141# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1142#if defined(MFC_OpenACC)
1143# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1144!$acc parallel loop collapse(5) gang vector default(present) private(r)
1145# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1146#elif defined(MFC_OpenMP)
1147# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1148
1149# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1150
1151# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1152
1153# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1154!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1155# 615 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1156#endif
1157 do l = 0, p
1158 do k = 0, n
1159 do j = 0, buff_size - 1
1160 do i = nvar + 1, nvar + nnode
1161 do q = 1, nb
1162 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*(j + buff_size*(k + (n + 1)*l))
1163 buff_send(r) = real(mv_in(j + pack_offset, k, l, i - nvar, q), kind=wp)
1164 end do
1165 end do
1166 end do
1167 end do
1168 end do
1169
1170# 628 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1171#if defined(MFC_OpenACC)
1172# 628 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1173!$acc end parallel loop
1174# 628 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1175#elif defined(MFC_OpenMP)
1176# 628 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1177
1178# 628 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1179!$omp end target teams loop
1180# 628 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1181#endif
1182 end if
1183# 753 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1184 end if
1185# 571 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1186 if (mpi_dir == 2) then
1187# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1188
1189# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1190
1191# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1192#if defined(MFC_OpenACC)
1193# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1194!$acc parallel loop collapse(4) gang vector default(present) private(r)
1195# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1196#elif defined(MFC_OpenMP)
1197# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1198
1199# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1200
1201# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1202
1203# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1204!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1205# 631 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1206#endif
1207 do i = 1, nvar
1208 do l = 0, p
1209 do k = 0, buff_size - 1
1210 do j = -buff_size, m + buff_size
1211 r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k + buff_size*l))
1212 buff_send(r) = real(q_comm(i)%sf(j, k + pack_offset, l), kind=wp)
1213 end do
1214 end do
1215 end do
1216 end do
1217
1218# 642 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1219#if defined(MFC_OpenACC)
1220# 642 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1221!$acc end parallel loop
1222# 642 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1223#elif defined(MFC_OpenMP)
1224# 642 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1225
1226# 642 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1227!$omp end target teams loop
1228# 642 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1229#endif
1230
1231 if (chem_diff_comm) then
1232
1233# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1234
1235# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1236#if defined(MFC_OpenACC)
1237# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1238!$acc parallel loop collapse(3) gang vector default(present) private(r)
1239# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1240#elif defined(MFC_OpenMP)
1241# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1242
1243# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1244
1245# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1246
1247# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1248!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1249# 645 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1250#endif
1251 do l = 0, p
1252 do k = 0, buff_size - 1
1253 do j = -buff_size, m + buff_size
1254 r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k + buff_size*l))
1255 buff_send(r) = real(q_t_sf%sf(j, k + pack_offset, l), kind=wp)
1256 end do
1257 end do
1258 end do
1259
1260# 654 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1261#if defined(MFC_OpenACC)
1262# 654 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1263!$acc end parallel loop
1264# 654 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1265#elif defined(MFC_OpenMP)
1266# 654 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1267
1268# 654 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1269!$omp end target teams loop
1270# 654 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1271#endif
1272 end if
1273
1274 if (qbmm_comm) then
1275
1276# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1277
1278# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1279#if defined(MFC_OpenACC)
1280# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1281!$acc parallel loop collapse(5) gang vector default(present) private(r)
1282# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1283#elif defined(MFC_OpenMP)
1284# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1285
1286# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1287
1288# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1289
1290# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1291!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1292# 658 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1293#endif
1294 do i = nvar + 1, nvar + nnode
1295 do l = 0, p
1296 do k = 0, buff_size - 1
1297 do j = -buff_size, m + buff_size
1298 do q = 1, nb
1299 r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*(k &
1300 & + buff_size*l))
1301 buff_send(r) = real(pb_in(j, k + pack_offset, l, i - nvar, q), kind=wp)
1302 end do
1303 end do
1304 end do
1305 end do
1306 end do
1307
1308# 672 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1309#if defined(MFC_OpenACC)
1310# 672 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1311!$acc end parallel loop
1312# 672 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1313#elif defined(MFC_OpenMP)
1314# 672 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1315
1316# 672 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1317!$omp end target teams loop
1318# 672 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1319#endif
1320
1321
1322# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1323
1324# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1325#if defined(MFC_OpenACC)
1326# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1327!$acc parallel loop collapse(5) gang vector default(present) private(r)
1328# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1329#elif defined(MFC_OpenMP)
1330# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1331
1332# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1333
1334# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1335
1336# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1337!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1338# 674 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1339#endif
1340 do i = nvar + 1, nvar + nnode
1341 do l = 0, p
1342 do k = 0, buff_size - 1
1343 do j = -buff_size, m + buff_size
1344 do q = 1, nb
1345 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
1346 & + 1)*(k + buff_size*l))
1347 buff_send(r) = real(mv_in(j, k + pack_offset, l, i - nvar, q), kind=wp)
1348 end do
1349 end do
1350 end do
1351 end do
1352 end do
1353
1354# 688 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1355#if defined(MFC_OpenACC)
1356# 688 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1357!$acc end parallel loop
1358# 688 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1359#elif defined(MFC_OpenMP)
1360# 688 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1361
1362# 688 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1363!$omp end target teams loop
1364# 688 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1365#endif
1366 end if
1367# 753 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1368 end if
1369# 571 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1370 if (mpi_dir == 3) then
1371# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1372
1373# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1374
1375# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1376#if defined(MFC_OpenACC)
1377# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1378!$acc parallel loop collapse(4) gang vector default(present) private(r)
1379# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1380#elif defined(MFC_OpenMP)
1381# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1382
1383# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1384
1385# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1386
1387# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1388!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1389# 691 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1390#endif
1391 do i = 1, nvar
1392 do l = 0, buff_size - 1
1393 do k = -buff_size, n + buff_size
1394 do j = -buff_size, m + buff_size
1395 r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
1396 & + 2*buff_size + 1)*l))
1397 buff_send(r) = real(q_comm(i)%sf(j, k, l + pack_offset), kind=wp)
1398 end do
1399 end do
1400 end do
1401 end do
1402
1403# 703 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1404#if defined(MFC_OpenACC)
1405# 703 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1406!$acc end parallel loop
1407# 703 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1408#elif defined(MFC_OpenMP)
1409# 703 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1410
1411# 703 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1412!$omp end target teams loop
1413# 703 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1414#endif
1415
1416 if (chem_diff_comm) then
1417
1418# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1419
1420# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1421#if defined(MFC_OpenACC)
1422# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1423!$acc parallel loop collapse(3) gang vector default(present) private(r)
1424# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1425#elif defined(MFC_OpenMP)
1426# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1427
1428# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1429
1430# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1431
1432# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1433!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1434# 706 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1435#endif
1436 do l = 0, buff_size - 1
1437 do k = -buff_size, n + buff_size
1438 do j = -buff_size, m + buff_size
1439 r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
1440 & + 2*buff_size + 1)*l))
1441 buff_send(r) = real(q_t_sf%sf(j, k, l + pack_offset), kind=wp)
1442 end do
1443 end do
1444 end do
1445
1446# 716 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1447#if defined(MFC_OpenACC)
1448# 716 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1449!$acc end parallel loop
1450# 716 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1451#elif defined(MFC_OpenMP)
1452# 716 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1453
1454# 716 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1455!$omp end target teams loop
1456# 716 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1457#endif
1458 end if
1459
1460 if (qbmm_comm) then
1461
1462# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1463
1464# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1465#if defined(MFC_OpenACC)
1466# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1467!$acc parallel loop collapse(5) gang vector default(present) private(r)
1468# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1469#elif defined(MFC_OpenMP)
1470# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1471
1472# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1473
1474# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1475
1476# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1477!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1478# 720 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1479#endif
1480 do i = nvar + 1, nvar + nnode
1481 do l = 0, buff_size - 1
1482 do k = -buff_size, n + buff_size
1483 do j = -buff_size, m + buff_size
1484 do q = 1, nb
1485 r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
1486 & + buff_size) + (n + 2*buff_size + 1)*l))
1487 buff_send(r) = real(pb_in(j, k, l + pack_offset, i - nvar, q), kind=wp)
1488 end do
1489 end do
1490 end do
1491 end do
1492 end do
1493
1494# 734 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1495#if defined(MFC_OpenACC)
1496# 734 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1497!$acc end parallel loop
1498# 734 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1499#elif defined(MFC_OpenMP)
1500# 734 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1501
1502# 734 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1503!$omp end target teams loop
1504# 734 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1505#endif
1506
1507
1508# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1509
1510# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1511#if defined(MFC_OpenACC)
1512# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1513!$acc parallel loop collapse(5) gang vector default(present) private(r)
1514# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1515#elif defined(MFC_OpenMP)
1516# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1517
1518# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1519
1520# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1521
1522# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1523!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1524# 736 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1525#endif
1526 do i = nvar + 1, nvar + nnode
1527 do l = 0, buff_size - 1
1528 do k = -buff_size, n + buff_size
1529 do j = -buff_size, m + buff_size
1530 do q = 1, nb
1531 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
1532 & + 1)*((k + buff_size) + (n + 2*buff_size + 1)*l))
1533 buff_send(r) = real(mv_in(j, k, l + pack_offset, i - nvar, q), kind=wp)
1534 end do
1535 end do
1536 end do
1537 end do
1538 end do
1539
1540# 750 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1541#if defined(MFC_OpenACC)
1542# 750 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1543!$acc end parallel loop
1544# 750 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1545#elif defined(MFC_OpenMP)
1546# 750 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1547
1548# 750 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1549!$omp end target teams loop
1550# 750 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1551#endif
1552 end if
1553# 753 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1554 end if
1555# 755 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1556 call nvtxendrange ! Packbuf
1557
1558 ! Send/Recv
1559#ifdef MFC_SIMULATION
1560# 760 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1561 if (rdma_mpi .eqv. .false.) then
1562# 772 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1563 call nvtxstartrange("RHS-COMM-DEV2HOST")
1564
1565# 773 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1566#if defined(MFC_OpenACC)
1567# 773 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1568!$acc update host(buff_send)
1569# 773 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1570#elif defined(MFC_OpenMP)
1571# 773 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1572!$omp target update from(buff_send)
1573# 773 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1574#endif
1575 call nvtxendrange
1576 call nvtxstartrange("RHS-COMM-SENDRECV-NO-RMDA")
1577
1578 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
1579 & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)
1580
1581 call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
1582
1583 call nvtxstartrange("RHS-COMM-HOST2DEV")
1584
1585# 783 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1586#if defined(MFC_OpenACC)
1587# 783 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1588!$acc update device(buff_recv)
1589# 783 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1590#elif defined(MFC_OpenMP)
1591# 783 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1592!$omp target update to(buff_recv)
1593# 783 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1594#endif
1595 call nvtxendrange
1596# 786 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1597 end if
1598# 760 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1599 if (rdma_mpi .eqv. .true.) then
1600# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1601
1602# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1603#if defined(MFC_OpenACC)
1604# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1605!$acc host_data use_device(buff_send, buff_recv)
1606# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1607 call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")
1608# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1609
1610# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1611 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
1612# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1613 & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)
1614# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1615
1616# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1617 call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
1618# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1619!$acc end host_data
1620# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1621#elif defined(MFC_OpenMP)
1622# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1623!$omp target data use_device_addr(buff_send, buff_recv)
1624# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1625 call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")
1626# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1627
1628# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1629 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
1630# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1631 & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)
1632# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1633
1634# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1635 call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
1636# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1637!$omp end target data
1638# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1639#else
1640# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1641 call nvtxstartrange("RHS-COMM-SENDRECV-RDMA")
1642# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1643
1644# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1645 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, &
1646# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1647 & src_proc, recv_tag, mpi_comm_world, mpi_status_ignore, ierr)
1648# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1649
1650# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1651 call nvtxendrange ! RHS-MPI-SENDRECV-(NO)-RDMA
1652# 762 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1653#endif
1654# 770 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1655
1656# 770 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1657#if defined(MFC_OpenACC)
1658# 770 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1659!$acc wait
1660# 770 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1661#elif defined(MFC_OpenMP)
1662# 770 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1663!$omp barrier
1664# 770 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1665#endif
1666# 786 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1667 end if
1668# 788 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1669#else
1670 call mpi_sendrecv(buff_send, buffer_count, mpi_p, dst_proc, send_tag, buff_recv, buffer_count, mpi_p, src_proc, recv_tag, &
1671 & mpi_comm_world, mpi_status_ignore, ierr)
1672#endif
1673
1674 ! Unpack Received Buffer
1675 call nvtxstartrange("RHS-COMM-UNPACKBUF")
1676# 796 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1677 if (mpi_dir == 1) then
1678# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1679
1680# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1681
1682# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1683#if defined(MFC_OpenACC)
1684# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1685!$acc parallel loop collapse(4) gang vector default(present) private(r)
1686# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1687#elif defined(MFC_OpenMP)
1688# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1689
1690# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1691
1692# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1693
1694# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1695!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1696# 798 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1697#endif
1698 do l = 0, p
1699 do k = 0, n
1700 do j = -buff_size, -1
1701 do i = 1, nvar
1702 r = (i - 1) + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
1703 q_comm(i)%sf(j + unpack_offset, k, l) = real(buff_recv(r), kind=stp)
1704#if defined(__INTEL_COMPILER)
1705 if (ieee_is_nan(q_comm(i)%sf(j + unpack_offset, k, l))) then
1706 print *, "Error", j, k, l, i
1707 call s_mpi_abort("NaN(s) in recv")
1708 end if
1709#endif
1710 end do
1711 end do
1712 end do
1713 end do
1714
1715# 815 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1716#if defined(MFC_OpenACC)
1717# 815 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1718!$acc end parallel loop
1719# 815 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1720#elif defined(MFC_OpenMP)
1721# 815 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1722
1723# 815 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1724!$omp end target teams loop
1725# 815 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1726#endif
1727
1728 if (chem_diff_comm) then
1729
1730# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1731
1732# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1733#if defined(MFC_OpenACC)
1734# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1735!$acc parallel loop collapse(3) gang vector default(present) private(r)
1736# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1737#elif defined(MFC_OpenMP)
1738# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1739
1740# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1741
1742# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1743
1744# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1745!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1746# 818 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1747#endif
1748 do l = 0, p
1749 do k = 0, n
1750 do j = -buff_size, -1
1751 r = nvar + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
1752 q_t_sf%sf(j + unpack_offset, k, l) = real(buff_recv(r), kind=stp)
1753#if defined(__INTEL_COMPILER)
1754 if (ieee_is_nan(q_t_sf%sf(j + unpack_offset, k, l))) then
1755 print *, "Error", j, k, l
1756 call s_mpi_abort("NaN(s) in recv")
1757 end if
1758#endif
1759 end do
1760 end do
1761 end do
1762
1763# 833 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1764#if defined(MFC_OpenACC)
1765# 833 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1766!$acc end parallel loop
1767# 833 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1768#elif defined(MFC_OpenMP)
1769# 833 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1770
1771# 833 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1772!$omp end target teams loop
1773# 833 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1774#endif
1775 end if
1776
1777 if (qbmm_comm) then
1778
1779# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1780
1781# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1782#if defined(MFC_OpenACC)
1783# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1784!$acc parallel loop collapse(5) gang vector default(present) private(r)
1785# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1786#elif defined(MFC_OpenMP)
1787# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1788
1789# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1790
1791# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1792
1793# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1794!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1795# 837 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1796#endif
1797 do l = 0, p
1798 do k = 0, n
1799 do j = -buff_size, -1
1800 do i = nvar + 1, nvar + nnode
1801 do q = 1, nb
1802 r = (i - 1) + (q - 1)*nnode + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
1803 pb_in(j + unpack_offset, k, l, i - nvar, q) = real(buff_recv(r), kind=stp)
1804 end do
1805 end do
1806 end do
1807 end do
1808 end do
1809
1810# 850 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1811#if defined(MFC_OpenACC)
1812# 850 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1813!$acc end parallel loop
1814# 850 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1815#elif defined(MFC_OpenMP)
1816# 850 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1817
1818# 850 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1819!$omp end target teams loop
1820# 850 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1821#endif
1822
1823
1824# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1825
1826# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1827#if defined(MFC_OpenACC)
1828# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1829!$acc parallel loop collapse(5) gang vector default(present) private(r)
1830# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1831#elif defined(MFC_OpenMP)
1832# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1833
1834# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1835
1836# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1837
1838# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1839!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1840# 852 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1841#endif
1842 do l = 0, p
1843 do k = 0, n
1844 do j = -buff_size, -1
1845 do i = nvar + 1, nvar + nnode
1846 do q = 1, nb
1847 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*(j + buff_size*((k + 1) + (n + 1)*l))
1848 mv_in(j + unpack_offset, k, l, i - nvar, q) = real(buff_recv(r), kind=stp)
1849 end do
1850 end do
1851 end do
1852 end do
1853 end do
1854
1855# 865 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1856#if defined(MFC_OpenACC)
1857# 865 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1858!$acc end parallel loop
1859# 865 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1860#elif defined(MFC_OpenMP)
1861# 865 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1862
1863# 865 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1864!$omp end target teams loop
1865# 865 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1866#endif
1867 end if
1868# 1014 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1869 end if
1870# 796 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1871 if (mpi_dir == 2) then
1872# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1873
1874# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1875
1876# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1877#if defined(MFC_OpenACC)
1878# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1879!$acc parallel loop collapse(4) gang vector default(present) private(r)
1880# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1881#elif defined(MFC_OpenMP)
1882# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1883
1884# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1885
1886# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1887
1888# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1889!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1890# 868 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1891#endif
1892 do i = 1, nvar
1893 do l = 0, p
1894 do k = -buff_size, -1
1895 do j = -buff_size, m + buff_size
1896 r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + buff_size*l))
1897 q_comm(i)%sf(j, k + unpack_offset, l) = real(buff_recv(r), kind=stp)
1898#if defined(__INTEL_COMPILER)
1899 if (ieee_is_nan(q_comm(i)%sf(j, k + unpack_offset, l))) then
1900 print *, "Error", j, k, l, i
1901 call s_mpi_abort("NaN(s) in recv")
1902 end if
1903#endif
1904 end do
1905 end do
1906 end do
1907 end do
1908
1909# 885 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1910#if defined(MFC_OpenACC)
1911# 885 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1912!$acc end parallel loop
1913# 885 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1914#elif defined(MFC_OpenMP)
1915# 885 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1916
1917# 885 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1918!$omp end target teams loop
1919# 885 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1920#endif
1921
1922 if (chem_diff_comm) then
1923
1924# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1925
1926# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1927#if defined(MFC_OpenACC)
1928# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1929!$acc parallel loop collapse(3) gang vector default(present) private(r)
1930# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1931#elif defined(MFC_OpenMP)
1932# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1933
1934# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1935
1936# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1937
1938# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1939!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1940# 888 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1941#endif
1942 do l = 0, p
1943 do k = -buff_size, -1
1944 do j = -buff_size, m + buff_size
1945 r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + buff_size*l))
1946 q_t_sf%sf(j, k + unpack_offset, l) = real(buff_recv(r), kind=stp)
1947#if defined(__INTEL_COMPILER)
1948 if (ieee_is_nan(q_t_sf%sf(j, k + unpack_offset, l))) then
1949 print *, "Error", j, k, l
1950 call s_mpi_abort("NaN(s) in recv")
1951 end if
1952#endif
1953 end do
1954 end do
1955 end do
1956
1957# 903 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1958#if defined(MFC_OpenACC)
1959# 903 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1960!$acc end parallel loop
1961# 903 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1962#elif defined(MFC_OpenMP)
1963# 903 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1964
1965# 903 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1966!$omp end target teams loop
1967# 903 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1968#endif
1969 end if
1970
1971 if (qbmm_comm) then
1972
1973# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1974
1975# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1976#if defined(MFC_OpenACC)
1977# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1978!$acc parallel loop collapse(5) gang vector default(present) private(r)
1979# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1980#elif defined(MFC_OpenMP)
1981# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1982
1983# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1984
1985# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1986
1987# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1988!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
1989# 907 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
1990#endif
1991 do i = nvar + 1, nvar + nnode
1992 do l = 0, p
1993 do k = -buff_size, -1
1994 do j = -buff_size, m + buff_size
1995 do q = 1, nb
1996 r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
1997 & + buff_size) + buff_size*l))
1998 pb_in(j, k + unpack_offset, l, i - nvar, q) = real(buff_recv(r), kind=stp)
1999 end do
2000 end do
2001 end do
2002 end do
2003 end do
2004
2005# 921 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2006#if defined(MFC_OpenACC)
2007# 921 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2008!$acc end parallel loop
2009# 921 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2010#elif defined(MFC_OpenMP)
2011# 921 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2012
2013# 921 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2014!$omp end target teams loop
2015# 921 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2016#endif
2017
2018
2019# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2020
2021# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2022#if defined(MFC_OpenACC)
2023# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2024!$acc parallel loop collapse(5) gang vector default(present) private(r)
2025# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2026#elif defined(MFC_OpenMP)
2027# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2028
2029# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2030
2031# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2032
2033# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2034!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2035# 923 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2036#endif
2037 do i = nvar + 1, nvar + nnode
2038 do l = 0, p
2039 do k = -buff_size, -1
2040 do j = -buff_size, m + buff_size
2041 do q = 1, nb
2042 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
2043 & + 1)*((k + buff_size) + buff_size*l))
2044 mv_in(j, k + unpack_offset, l, i - nvar, q) = real(buff_recv(r), kind=stp)
2045 end do
2046 end do
2047 end do
2048 end do
2049 end do
2050
2051# 937 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2052#if defined(MFC_OpenACC)
2053# 937 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2054!$acc end parallel loop
2055# 937 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2056#elif defined(MFC_OpenMP)
2057# 937 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2058
2059# 937 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2060!$omp end target teams loop
2061# 937 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2062#endif
2063 end if
2064# 1014 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2065 end if
2066# 796 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2067 if (mpi_dir == 3) then
2068# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2069
2070# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2071
2072# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2073#if defined(MFC_OpenACC)
2074# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2075!$acc parallel loop collapse(4) gang vector default(present) private(r)
2076# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2077#elif defined(MFC_OpenMP)
2078# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2079
2080# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2081
2082# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2083
2084# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2085!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(4) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2086# 940 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2087#endif
2088 do i = 1, nvar
2089 do l = -buff_size, -1
2090 do k = -buff_size, n + buff_size
2091 do j = -buff_size, m + buff_size
2092 r = (i - 1) + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
2093 & + 2*buff_size + 1)*(l + buff_size)))
2094 q_comm(i)%sf(j, k, l + unpack_offset) = real(buff_recv(r), kind=stp)
2095#if defined(__INTEL_COMPILER)
2096 if (ieee_is_nan(q_comm(i)%sf(j, k, l + unpack_offset))) then
2097 print *, "Error", j, k, l, i
2098 call s_mpi_abort("NaN(s) in recv")
2099 end if
2100#endif
2101 end do
2102 end do
2103 end do
2104 end do
2105
2106# 958 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2107#if defined(MFC_OpenACC)
2108# 958 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2109!$acc end parallel loop
2110# 958 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2111#elif defined(MFC_OpenMP)
2112# 958 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2113
2114# 958 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2115!$omp end target teams loop
2116# 958 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2117#endif
2118
2119 if (chem_diff_comm) then
2120
2121# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2122
2123# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2124#if defined(MFC_OpenACC)
2125# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2126!$acc parallel loop collapse(3) gang vector default(present) private(r)
2127# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2128#elif defined(MFC_OpenMP)
2129# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2130
2131# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2132
2133# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2134
2135# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2136!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(3) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2137# 961 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2138#endif
2139 do l = -buff_size, -1
2140 do k = -buff_size, n + buff_size
2141 do j = -buff_size, m + buff_size
2142 r = nvar + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k + buff_size) + (n &
2143 & + 2*buff_size + 1)*(l + buff_size)))
2144 q_t_sf%sf(j, k, l + unpack_offset) = real(buff_recv(r), kind=stp)
2145#if defined(__INTEL_COMPILER)
2146 if (ieee_is_nan(q_t_sf%sf(j, k, l + unpack_offset))) then
2147 print *, "Error", j, k, l
2148 call s_mpi_abort("NaN(s) in recv")
2149 end if
2150#endif
2151 end do
2152 end do
2153 end do
2154
2155# 977 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2156#if defined(MFC_OpenACC)
2157# 977 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2158!$acc end parallel loop
2159# 977 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2160#elif defined(MFC_OpenMP)
2161# 977 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2162
2163# 977 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2164!$omp end target teams loop
2165# 977 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2166#endif
2167 end if
2168
2169 if (qbmm_comm) then
2170
2171# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2172
2173# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2174#if defined(MFC_OpenACC)
2175# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2176!$acc parallel loop collapse(5) gang vector default(present) private(r)
2177# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2178#elif defined(MFC_OpenMP)
2179# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2180
2181# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2182
2183# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2184
2185# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2186!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2187# 981 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2188#endif
2189 do i = nvar + 1, nvar + nnode
2190 do l = -buff_size, -1
2191 do k = -buff_size, n + buff_size
2192 do j = -buff_size, m + buff_size
2193 do q = 1, nb
2194 r = (i - 1) + (q - 1)*nnode + v_size*((j + buff_size) + (m + 2*buff_size + 1)*((k &
2195 & + buff_size) + (n + 2*buff_size + 1)*(l + buff_size)))
2196 pb_in(j, k, l + unpack_offset, i - nvar, q) = real(buff_recv(r), kind=stp)
2197 end do
2198 end do
2199 end do
2200 end do
2201 end do
2202
2203# 995 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2204#if defined(MFC_OpenACC)
2205# 995 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2206!$acc end parallel loop
2207# 995 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2208#elif defined(MFC_OpenMP)
2209# 995 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2210
2211# 995 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2212!$omp end target teams loop
2213# 995 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2214#endif
2215
2216
2217# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2218
2219# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2220#if defined(MFC_OpenACC)
2221# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2222!$acc parallel loop collapse(5) gang vector default(present) private(r)
2223# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2224#elif defined(MFC_OpenMP)
2225# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2226
2227# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2228
2229# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2230
2231# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2232!$omp target teams loop defaultmap(firstprivate:scalar) bind(teams,parallel) collapse(5) defaultmap(tofrom:aggregate) defaultmap(tofrom:allocatable) defaultmap(tofrom:pointer) private(r)
2233# 997 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2234#endif
2235 do i = nvar + 1, nvar + nnode
2236 do l = -buff_size, -1
2237 do k = -buff_size, n + buff_size
2238 do j = -buff_size, m + buff_size
2239 do q = 1, nb
2240 r = (i - 1) + (q - 1)*nnode + nb*nnode + v_size*((j + buff_size) + (m + 2*buff_size &
2241 & + 1)*((k + buff_size) + (n + 2*buff_size + 1)*(l + buff_size)))
2242 mv_in(j, k, l + unpack_offset, i - nvar, q) = real(buff_recv(r), kind=stp)
2243 end do
2244 end do
2245 end do
2246 end do
2247 end do
2248
2249# 1011 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2250#if defined(MFC_OpenACC)
2251# 1011 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2252!$acc end parallel loop
2253# 1011 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2254#elif defined(MFC_OpenMP)
2255# 1011 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2256
2257# 1011 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2258!$omp end target teams loop
2259# 1011 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2260#endif
2261 end if
2262# 1014 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2263 end if
2264# 1016 "/home/runner/work/MFC/MFC/src/common/m_mpi_common.fpp"
2265 call nvtxendrange
2266#endif
2267
2269
2270 !> Decompose the computational domain among processors by balancing cells per rank in each coordinate direction.
2272
2273#ifdef MFC_MPI
2274 integer :: num_procs_x, num_procs_y, num_procs_z !< Optimal number of processors in the x-, y- and z-directions
2275 !> Non-optimal number of processors in the x-, y- and z-directions
2276 real(wp) :: tmp_num_procs_x, tmp_num_procs_y, tmp_num_procs_z
2277 real(wp) :: fct_min !< Processor factorization (fct) minimization parameter
2278 integer :: MPI_COMM_CART !< Cartesian processor topology communicator
2279 integer :: rem_cells !< Remaining cells after distribution among processors
2280 integer :: recon_order !< WENO or MUSCL reconstruction order
2281 integer :: i, j !< Generic loop iterators
2282 integer :: ierr !< Generic flag used to identify and report MPI errors
2283
2284 if (recon_type == recon_type_weno) then
2285 recon_order = weno_order
2286 else
2287 recon_order = muscl_order
2288 end if
2289
2290 if (num_procs == 1 .and. parallel_io) then
2291 do i = 1, num_dims
2292 start_idx(i) = 0
2293 end do
2294 return
2295 end if
2296
2297 if (igr) then
2298 recon_order = igr_order
2299 end if
2300
2301 ! 3D Cartesian Processor Topology
2302 if (n > 0) then
2303 if (p > 0) then
2304 if (fft_wrt) then
2305 ! Initial estimate of optimal processor topology
2306 num_procs_x = 1
2307 num_procs_y = 1
2308 num_procs_z = num_procs
2309 ierr = -1
2310
2311 ! Benchmarking the quality of this initial guess
2312 tmp_num_procs_y = num_procs_y
2313 tmp_num_procs_z = num_procs_z
2314 fct_min = 10._wp*abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
2315
2316 ! Optimization of the initial processor topology
2317 do i = 1, num_procs
2318 if (mod(num_procs, i) == 0 .and. (n + 1)/i >= num_stcls_min*recon_order) then
2319 tmp_num_procs_y = i
2320 tmp_num_procs_z = num_procs/i
2321
2322 if (fct_min >= abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z) .and. (p + 1) &
2323 & /tmp_num_procs_z >= num_stcls_min*recon_order) then
2324 num_procs_y = i
2325 num_procs_z = num_procs/i
2326 fct_min = abs((n + 1)/tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
2327 ierr = 0
2328 end if
2329 end if
2330 end do
2331 else
2332 if (cyl_coord .and. p > 0) then
2333 ! Pencil blocking for cylindrical coordinates (Fourier filter near axis)
2334
2335 ! Initial values of the processor factorization optimization
2336 num_procs_x = 1
2337 num_procs_y = num_procs
2338 num_procs_z = 1
2339 ierr = -1
2340
2341 ! Computing minimization variable for these initial values
2342 tmp_num_procs_x = num_procs_x
2343 tmp_num_procs_y = num_procs_y
2344 tmp_num_procs_z = num_procs_z
2345 fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
2346
2347 ! Searching for optimal computational domain distribution
2348 do i = 1, num_procs
2349 if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
2350 tmp_num_procs_x = i
2351 tmp_num_procs_y = num_procs/i
2352
2353 if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) .and. (n + 1) &
2354 & /tmp_num_procs_y >= num_stcls_min*recon_order) then
2355 num_procs_x = i
2356 num_procs_y = num_procs/i
2357 fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
2358 ierr = 0
2359 end if
2360 end if
2361 end do
2362 else
2363 ! Initial estimate of optimal processor topology
2364 num_procs_x = 1
2365 num_procs_y = 1
2366 num_procs_z = num_procs
2367 ierr = -1
2368
2369 ! Benchmarking the quality of this initial guess
2370 tmp_num_procs_x = num_procs_x
2371 tmp_num_procs_y = num_procs_y
2372 tmp_num_procs_z = num_procs_z
2373 fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + 10._wp*abs((n + 1) &
2374 & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
2375
2376 ! Optimization of the initial processor topology
2377 do i = 1, num_procs
2378 if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
2379 do j = 1, num_procs/i
2380 if (mod(num_procs/i, j) == 0 .and. (n + 1)/j >= num_stcls_min*recon_order) then
2381 tmp_num_procs_x = i
2382 tmp_num_procs_y = j
2383 tmp_num_procs_z = num_procs/(i*j)
2384
2385 if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + abs((n + 1) &
2386 & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z) .and. (p + 1) &
2387 & /tmp_num_procs_z >= num_stcls_min*recon_order) then
2388 num_procs_x = i
2389 num_procs_y = j
2390 num_procs_z = num_procs/(i*j)
2391 fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) + abs((n + 1) &
2392 & /tmp_num_procs_y - (p + 1)/tmp_num_procs_z)
2393 ierr = 0
2394 end if
2395 end if
2396 end do
2397 end if
2398 end do
2399 end if
2400 end if
2401
2402 ! Verifying that a valid decomposition of the computational domain has been established. If not, the simulation
2403 ! exits.
2404 if (proc_rank == 0 .and. ierr == -1) then
2405 call s_mpi_abort('Unsupported combination of values ' // 'of num_procs, m, n, p and ' &
2406 & // 'weno/muscl/igr_order. Exiting.')
2407 end if
2408
2409 ! Creating new communicator using the Cartesian topology
2410 call mpi_cart_create(mpi_comm_world, 3, (/num_procs_x, num_procs_y, num_procs_z/), (/.true., .true., .true./), &
2411 & .false., mpi_comm_cart, ierr)
2412
2413 ! Finding the Cartesian coordinates of the local process
2414 call mpi_cart_coords(mpi_comm_cart, proc_rank, 3, proc_coords, ierr)
2415
2416 ! Global Parameters for z-direction
2417
2418 ! Number of remaining cells
2419 rem_cells = mod(p + 1, num_procs_z)
2420
2421 ! Optimal number of cells per processor
2422 p = (p + 1)/num_procs_z - 1
2423
2424 ! Distributing the remaining cells
2425 do i = 1, rem_cells
2426 if (proc_coords(3) == i - 1) then
2427 p = p + 1; exit
2428 end if
2429 end do
2430
2431 ! Boundary condition at the beginning
2432 if (proc_coords(3) > 0 .or. (bc_z%beg == bc_periodic .and. num_procs_z > 1)) then
2433 proc_coords(3) = proc_coords(3) - 1
2434 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_z%beg, ierr)
2435 proc_coords(3) = proc_coords(3) + 1
2436 end if
2437
2438 ! Boundary condition at the end
2439 if (proc_coords(3) < num_procs_z - 1 .or. (bc_z%end == bc_periodic .and. num_procs_z > 1)) then
2440 proc_coords(3) = proc_coords(3) + 1
2441 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_z%end, ierr)
2442 proc_coords(3) = proc_coords(3) - 1
2443 end if
2444
2445#ifdef MFC_POST_PROCESS
2446 ! Ghost zone at the beginning
2447 if (proc_coords(3) > 0 .and. format == format_silo) then
2448 offset_z%beg = 2
2449 else
2450 offset_z%beg = 0
2451 end if
2452
2453 ! Ghost zone at the end
2454 if (proc_coords(3) < num_procs_z - 1 .and. format == format_silo) then
2455 offset_z%end = 2
2456 else
2457 offset_z%end = 0
2458 end if
2459#endif
2460
2461 ! Beginning and end sub-domain boundary locations
2462 if (parallel_io) then
2463 if (proc_coords(3) < rem_cells) then
2464 start_idx(3) = (p + 1)*proc_coords(3)
2465 else
2466 start_idx(3) = (p + 1)*proc_coords(3) + rem_cells
2467 end if
2468 else
2469#ifdef MFC_PRE_PROCESS
2470 if (old_grid .neqv. .true.) then
2471 dz = (z_domain%end - z_domain%beg)/real(p_glb + 1, wp)
2472
2473 if (proc_coords(3) < rem_cells) then
2474 z_domain%beg = z_domain%beg + dz*real((p + 1)*proc_coords(3))
2475 z_domain%end = z_domain%end - dz*real((p + 1)*(num_procs_z - proc_coords(3) - 1) - (num_procs_z &
2476 & - rem_cells))
2477 else
2478 z_domain%beg = z_domain%beg + dz*real((p + 1)*proc_coords(3) + rem_cells)
2479 z_domain%end = z_domain%end - dz*real((p + 1)*(num_procs_z - proc_coords(3) - 1))
2480 end if
2481 end if
2482#endif
2483 end if
2484
2485 ! 2D Cartesian Processor Topology
2486 else
2487 ! Initial estimate of optimal processor topology
2488 num_procs_x = 1
2489 num_procs_y = num_procs
2490 ierr = -1
2491
2492 ! Benchmarking the quality of this initial guess
2493 tmp_num_procs_x = num_procs_x
2494 tmp_num_procs_y = num_procs_y
2495 fct_min = 10._wp*abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
2496
2497 ! Optimization of the initial processor topology
2498 do i = 1, num_procs
2499 if (mod(num_procs, i) == 0 .and. (m + 1)/i >= num_stcls_min*recon_order) then
2500 tmp_num_procs_x = i
2501 tmp_num_procs_y = num_procs/i
2502
2503 if (fct_min >= abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y) .and. (n + 1) &
2504 & /tmp_num_procs_y >= num_stcls_min*recon_order) then
2505 num_procs_x = i
2506 num_procs_y = num_procs/i
2507 fct_min = abs((m + 1)/tmp_num_procs_x - (n + 1)/tmp_num_procs_y)
2508 ierr = 0
2509 end if
2510 end if
2511 end do
2512
2513 ! Verifying that a valid decomposition of the computational domain has been established. If not, the simulation
2514 ! exits.
2515 if (proc_rank == 0 .and. ierr == -1) then
2516 call s_mpi_abort('Unsupported combination of values ' // 'of num_procs, m, n and ' &
2517 & // 'weno/muscl/igr_order. Exiting.')
2518 end if
2519
2520 ! Creating new communicator using the Cartesian topology
2521 call mpi_cart_create(mpi_comm_world, 2, (/num_procs_x, num_procs_y/), (/.true., .true./), .false., mpi_comm_cart, &
2522 & ierr)
2523
2524 ! Finding the Cartesian coordinates of the local process
2525 call mpi_cart_coords(mpi_comm_cart, proc_rank, 2, proc_coords, ierr)
2526 end if
2527
2528 ! Global Parameters for y-direction
2529
2530 ! Number of remaining cells
2531 rem_cells = mod(n + 1, num_procs_y)
2532
2533 ! Optimal number of cells per processor
2534 n = (n + 1)/num_procs_y - 1
2535
2536 ! Distributing the remaining cells
2537 do i = 1, rem_cells
2538 if (proc_coords(2) == i - 1) then
2539 n = n + 1; exit
2540 end if
2541 end do
2542
2543 ! Boundary condition at the beginning
2544 if (proc_coords(2) > 0 .or. (bc_y%beg == bc_periodic .and. num_procs_y > 1)) then
2545 proc_coords(2) = proc_coords(2) - 1
2546 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_y%beg, ierr)
2547 proc_coords(2) = proc_coords(2) + 1
2548 end if
2549
2550 ! Boundary condition at the end
2551 if (proc_coords(2) < num_procs_y - 1 .or. (bc_y%end == bc_periodic .and. num_procs_y > 1)) then
2552 proc_coords(2) = proc_coords(2) + 1
2553 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_y%end, ierr)
2554 proc_coords(2) = proc_coords(2) - 1
2555 end if
2556
2557#ifdef MFC_POST_PROCESS
2558 ! Ghost zone at the beginning
2559 if (proc_coords(2) > 0 .and. format == format_silo) then
2560 offset_y%beg = 2
2561 else
2562 offset_y%beg = 0
2563 end if
2564
2565 ! Ghost zone at the end
2566 if (proc_coords(2) < num_procs_y - 1 .and. format == format_silo) then
2567 offset_y%end = 2
2568 else
2569 offset_y%end = 0
2570 end if
2571#endif
2572
2573 ! Beginning and end sub-domain boundary locations
2574 if (parallel_io) then
2575 if (proc_coords(2) < rem_cells) then
2576 start_idx(2) = (n + 1)*proc_coords(2)
2577 else
2578 start_idx(2) = (n + 1)*proc_coords(2) + rem_cells
2579 end if
2580 else
2581#ifdef MFC_PRE_PROCESS
2582 if (old_grid .neqv. .true.) then
2583 dy = (y_domain%end - y_domain%beg)/real(n_glb + 1, wp)
2584
2585 if (proc_coords(2) < rem_cells) then
2586 y_domain%beg = y_domain%beg + dy*real((n + 1)*proc_coords(2))
2587 y_domain%end = y_domain%end - dy*real((n + 1)*(num_procs_y - proc_coords(2) - 1) - (num_procs_y &
2588 & - rem_cells))
2589 else
2590 y_domain%beg = y_domain%beg + dy*real((n + 1)*proc_coords(2) + rem_cells)
2591 y_domain%end = y_domain%end - dy*real((n + 1)*(num_procs_y - proc_coords(2) - 1))
2592 end if
2593 end if
2594#endif
2595 end if
2596
2597 ! 1D Cartesian Processor Topology
2598 else
2599 ! Optimal processor topology
2600 num_procs_x = num_procs
2601
2602 ! Creating new communicator using the Cartesian topology
2603 call mpi_cart_create(mpi_comm_world, 1, (/num_procs_x/), (/.true./), .false., mpi_comm_cart, ierr)
2604
2605 ! Finding the Cartesian coordinates of the local process
2606 call mpi_cart_coords(mpi_comm_cart, proc_rank, 1, proc_coords, ierr)
2607 end if
2608
2609 ! Global Parameters for x-direction
2610
2611 ! Number of remaining cells
2612 rem_cells = mod(m + 1, num_procs_x)
2613
2614 ! Optimal number of cells per processor
2615 m = (m + 1)/num_procs_x - 1
2616
2617 ! Distributing the remaining cells
2618 do i = 1, rem_cells
2619 if (proc_coords(1) == i - 1) then
2620 m = m + 1; exit
2621 end if
2622 end do
2623
2624 call s_update_cell_bounds(cells_bounds, m, n, p)
2625
2626 ! Boundary condition at the beginning
2627 if (proc_coords(1) > 0 .or. (bc_x%beg == bc_periodic .and. num_procs_x > 1)) then
2628 proc_coords(1) = proc_coords(1) - 1
2629 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_x%beg, ierr)
2630 proc_coords(1) = proc_coords(1) + 1
2631 end if
2632
2633 ! Boundary condition at the end
2634 if (proc_coords(1) < num_procs_x - 1 .or. (bc_x%end == bc_periodic .and. num_procs_x > 1)) then
2635 proc_coords(1) = proc_coords(1) + 1
2636 call mpi_cart_rank(mpi_comm_cart, proc_coords, bc_x%end, ierr)
2637 proc_coords(1) = proc_coords(1) - 1
2638 end if
2639
2640#ifdef MFC_POST_PROCESS
2641 ! Ghost zone at the beginning
2642 if (proc_coords(1) > 0 .and. format == format_silo) then
2643 offset_x%beg = 2
2644 else
2645 offset_x%beg = 0
2646 end if
2647
2648 ! Ghost zone at the end
2649 if (proc_coords(1) < num_procs_x - 1 .and. format == format_silo) then
2650 offset_x%end = 2
2651 else
2652 offset_x%end = 0
2653 end if
2654#endif
2655
2656 ! Beginning and end sub-domain boundary locations
2657 if (parallel_io) then
2658 if (proc_coords(1) < rem_cells) then
2659 start_idx(1) = (m + 1)*proc_coords(1)
2660 else
2661 start_idx(1) = (m + 1)*proc_coords(1) + rem_cells
2662 end if
2663 else
2664#ifdef MFC_PRE_PROCESS
2665 if (old_grid .neqv. .true.) then
2666 dx = (x_domain%end - x_domain%beg)/real(m_glb + 1, wp)
2667
2668 if (proc_coords(1) < rem_cells) then
2669 x_domain%beg = x_domain%beg + dx*real((m + 1)*proc_coords(1))
2670 x_domain%end = x_domain%end - dx*real((m + 1)*(num_procs_x - proc_coords(1) - 1) - (num_procs_x - rem_cells))
2671 else
2672 x_domain%beg = x_domain%beg + dx*real((m + 1)*proc_coords(1) + rem_cells)
2673 x_domain%end = x_domain%end - dx*real((m + 1)*(num_procs_x - proc_coords(1) - 1))
2674 end if
2675 end if
2676#endif
2677 end if
2678#endif
2679
2681
2682 !> The goal of this procedure is to populate the buffers of the grid variables by communicating with the neighboring processors.
2683 !! Note that only the buffers of the cell-width distributions are handled in such a way. This is because the buffers of
2684 !! cell-boundary locations may be calculated directly from those of the cell-width distributions.
2685#ifndef MFC_PRE_PROCESS
2686 subroutine s_mpi_sendrecv_grid_variables_buffers(mpi_dir, pbc_loc)
2687
2688 integer, intent(in) :: mpi_dir
2689 integer, intent(in) :: pbc_loc
2690
2691#ifdef MFC_MPI
2692 integer :: ierr !< Generic flag used to identify and report MPI errors
2693
2694 if (mpi_dir == 1) then
2695 if (pbc_loc == -1) then ! PBC at the beginning
2696
2697 if (bc_x%end >= 0) then ! PBC at the beginning and end
2698 call mpi_sendrecv(dx(m - buff_size + 1), buff_size, mpi_p, bc_x%end, 0, dx(-buff_size), buff_size, mpi_p, &
2699 & bc_x%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
2700 else ! PBC at the beginning only
2701 call mpi_sendrecv(dx(0), buff_size, mpi_p, bc_x%beg, 1, dx(-buff_size), buff_size, mpi_p, bc_x%beg, 0, &
2702 & mpi_comm_world, mpi_status_ignore, ierr)
2703 end if
2704 else ! PBC at the end
2705 if (bc_x%beg >= 0) then ! PBC at the end and beginning
2706 call mpi_sendrecv(dx(0), buff_size, mpi_p, bc_x%beg, 1, dx(m + 1), buff_size, mpi_p, bc_x%end, 1, &
2707 & mpi_comm_world, mpi_status_ignore, ierr)
2708 else ! PBC at the end only
2709 call mpi_sendrecv(dx(m - buff_size + 1), buff_size, mpi_p, bc_x%end, 0, dx(m + 1), buff_size, mpi_p, &
2710 & bc_x%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
2711 end if
2712 end if
2713 else if (mpi_dir == 2) then
2714 if (pbc_loc == -1) then ! PBC at the beginning
2715
2716 if (bc_y%end >= 0) then ! PBC at the beginning and end
2717 call mpi_sendrecv(dy(n - buff_size + 1), buff_size, mpi_p, bc_y%end, 0, dy(-buff_size), buff_size, mpi_p, &
2718 & bc_y%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
2719 else ! PBC at the beginning only
2720 call mpi_sendrecv(dy(0), buff_size, mpi_p, bc_y%beg, 1, dy(-buff_size), buff_size, mpi_p, bc_y%beg, 0, &
2721 & mpi_comm_world, mpi_status_ignore, ierr)
2722 end if
2723 else ! PBC at the end
2724 if (bc_y%beg >= 0) then ! PBC at the end and beginning
2725 call mpi_sendrecv(dy(0), buff_size, mpi_p, bc_y%beg, 1, dy(n + 1), buff_size, mpi_p, bc_y%end, 1, &
2726 & mpi_comm_world, mpi_status_ignore, ierr)
2727 else ! PBC at the end only
2728 call mpi_sendrecv(dy(n - buff_size + 1), buff_size, mpi_p, bc_y%end, 0, dy(n + 1), buff_size, mpi_p, &
2729 & bc_y%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
2730 end if
2731 end if
2732 else
2733 if (pbc_loc == -1) then ! PBC at the beginning
2734
2735 if (bc_z%end >= 0) then ! PBC at the beginning and end
2736 call mpi_sendrecv(dz(p - buff_size + 1), buff_size, mpi_p, bc_z%end, 0, dz(-buff_size), buff_size, mpi_p, &
2737 & bc_z%beg, 0, mpi_comm_world, mpi_status_ignore, ierr)
2738 else ! PBC at the beginning only
2739 call mpi_sendrecv(dz(0), buff_size, mpi_p, bc_z%beg, 1, dz(-buff_size), buff_size, mpi_p, bc_z%beg, 0, &
2740 & mpi_comm_world, mpi_status_ignore, ierr)
2741 end if
2742 else ! PBC at the end
2743 if (bc_z%beg >= 0) then ! PBC at the end and beginning
2744 call mpi_sendrecv(dz(0), buff_size, mpi_p, bc_z%beg, 1, dz(p + 1), buff_size, mpi_p, bc_z%end, 1, &
2745 & mpi_comm_world, mpi_status_ignore, ierr)
2746 else ! PBC at the end only
2747 call mpi_sendrecv(dz(p - buff_size + 1), buff_size, mpi_p, bc_z%end, 0, dz(p + 1), buff_size, mpi_p, &
2748 & bc_z%end, 1, mpi_comm_world, mpi_status_ignore, ierr)
2749 end if
2750 end if
2751 end if
2752#endif
2753
2755#endif
2756
2757 !> Module deallocation and/or disassociation procedures
2759
2760#ifdef MFC_MPI
2761 deallocate (buff_send, buff_recv)
2762#endif
2763
2764 end subroutine s_finalize_mpi_common_module
2765
2766end module m_mpi_common
type(scalar_field), dimension(sys_size), intent(inout) q_cons_vf
integer, intent(in) j
Compile-time constant parameters: default values, tolerances, and physical constants.
integer, parameter format_silo
integer, parameter nnode
Number of QBMM nodes.
integer, parameter recon_type_weno
Shared derived types for field data, patch geometry, bubble dynamics, and MPI I/O structures.
Global parameters for the post-process: domain geometry, equation of state, and output database setti...
integer buff_size
Number of ghost cells for boundary condition storage.
type(chemistry_parameters) chem_params
type(cell_num_bounds) cells_bounds
Utility routines for bubble model setup, coordinate transforms, array sampling, and special functions...
MPI communication layer: domain decomposition, halo exchange, reductions, and parallel I/O setup.
impure subroutine s_mpi_abort(prnt, code)
The subroutine terminates the MPI execution environment.
impure subroutine s_initialize_mpi_common_module
Initialize the module.
impure subroutine s_mpi_gather_data(my_vector, counts, gathered_vector, root)
Gather variable-length real vectors from all MPI ranks onto the root process.
impure subroutine s_mpi_barrier
Halts all processes until all have reached barrier.
impure subroutine s_mpi_initialize
Initialize the MPI execution environment and query the number of processors and local rank.
impure subroutine s_mpi_allreduce_vectors_sum(var_loc, var_glb, num_vectors, vector_length)
Reduce an array of vectors to their global sums across all MPI ranks.
real(wp), dimension(:), allocatable, private buff_recv
Primitive variable receive buffer for halo exchange.
impure subroutine s_initialize_mpi_data(q_cons_vf, ib_markers, beta)
Set up MPI I/O data views and variable pointers for parallel file output.
impure subroutine s_mpi_reduce_maxloc(var_loc)
Reduce a 2-element variable to its global maximum value with the owning processor rank (MPI_MAXLOC)....
subroutine s_mpi_sendrecv_grid_variables_buffers(mpi_dir, pbc_loc)
The goal of this procedure is to populate the buffers of the grid variables by communicating with the...
impure subroutine s_mpi_reduce_stability_criteria_extrema(icfl_max_loc, vcfl_max_loc, rc_min_loc, icfl_max_glb, vcfl_max_glb, rc_min_glb)
The goal of this subroutine is to determine the global extrema of the stability criteria in the compu...
impure subroutine s_mpi_allreduce_sum(var_loc, var_glb)
Reduce a local real value to its global sum across all MPI ranks.
real(wp), dimension(:), allocatable, private buff_send
Primitive variable send buffer for halo exchange.
impure subroutine s_mpi_allreduce_min(var_loc, var_glb)
Reduce a local real value to its global minimum across all MPI ranks.
impure subroutine s_prohibit_abort(condition, message)
Print a case file error with the prohibited condition and message, then abort execution.
impure subroutine s_mpi_finalize
The subroutine finalizes the MPI execution environment.
subroutine s_initialize_mpi_data_ds(q_cons_vf)
Set up MPI I/O data views for downsampled (coarsened) parallel file output.
impure subroutine s_mpi_allreduce_max(var_loc, var_glb)
Reduce a local real value to its global maximum across all MPI ranks.
impure subroutine s_mpi_allreduce_integer_sum(var_loc, var_glb)
Reduce a local integer value to its global sum across all MPI ranks.
subroutine s_mpi_sendrecv_variables_buffers(q_comm, mpi_dir, pbc_loc, nvar, pb_in, mv_in, q_t_sf)
The goal of this procedure is to populate the buffers of the cell-average conservative variables by c...
integer, private v_size
impure subroutine mpi_bcast_time_step_values(proc_time, time_avg)
Gather per-rank time step wall-clock times onto rank 0 for performance reporting.
impure subroutine s_mpi_reduce_min(var_loc)
Reduce a local real value to its global minimum across all ranks.
impure subroutine s_finalize_mpi_common_module
Module deallocation and/or disassociation procedures.
integer(kind=8) halo_size
subroutine s_mpi_decompose_computational_domain
Decompose the computational domain among processors by balancing cells per rank in each coordinate di...
NVIDIA NVTX profiling API bindings for GPU performance instrumentation.
Definition m_nvtx.f90:6