Parallel I/O in Flexible Modelling System (FMS) and Modular Ocean Model 5 (MOM5)

by: R. Yang, M. Ward, B. Evans

Format: Article
Published: Copernicus Publications 2020-04-01

Description

We present an implementation of parallel I/O in the Modular Ocean Model (MOM), a numerical ocean model used for climate forecasting, and determine its optimal performance over a range of tuning parameters. Our implementation uses the parallel API of the netCDF library, and we investigate the potential bottlenecks associated with the model configuration, the netCDF implementation, the underlying MPI-IO library implementations, and the Lustre filesystem. We investigate the performance of a global 0.25° resolution model using 240 and 960 CPUs. The best performance is observed when we limit the number of contiguous I/O domains on each compute node and assign one MPI rank to aggregate and write the data from each node, while ensuring that all nodes participate in writing this data to our Lustre filesystem. These best-performance configurations are then applied to a higher-resolution 0.1° global model using 720 and 1440 CPUs, where we observe even greater performance improvements. In all cases, the tuned parallel I/O implementation achieves much faster write speeds than serial single-file I/O, with write speeds up to 60 times faster at higher resolutions. Under the constraints outlined above, we observe that performance scales as the number of compute nodes and I/O aggregators increases, ensuring the continued scalability of I/O-intensive MOM5 model runs that will be used in our next-generation higher-resolution simulations.
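The "one aggregating MPI rank per node" layout described above can be illustrated with a minimal sketch. It is not the authors' FMS code. The helper names, the figure of 16 ranks per node, the node-by-node rank placement, and the choice to match the Lustre stripe count to the number of aggregators are all assumptions made for illustration. The hint keys (`romio_cb_write`, `cb_config_list`, `cb_nodes`, `striping_factor`) are standard ROMIO MPI-IO hints, but the values the paper actually used are not stated in the abstract.

```python
def node_aggregators(n_ranks, ranks_per_node):
    """Return the lowest MPI rank on each node.

    Assumes ranks are packed node by node (ranks 0..k-1 on the first
    node, k..2k-1 on the second, and so on), which is a common but not
    universal placement.
    """
    if n_ranks <= 0 or ranks_per_node <= 0:
        raise ValueError("rank counts must be positive")
    return list(range(0, n_ranks, ranks_per_node))


def collective_write_hints(n_ranks, ranks_per_node, stripe_count=None):
    """Build a dictionary of MPI-IO hints for one aggregator per node.

    stripe_count is a hypothetical parameter; when omitted, the stripe
    count is set equal to the number of aggregators (an assumption,
    not a result from the paper).
    """
    n_nodes = len(node_aggregators(n_ranks, ranks_per_node))
    if stripe_count is None:
        stripe_count = n_nodes
    return {
        "romio_cb_write": "enable",   # force collective buffering on writes
        "cb_config_list": "*:1",      # at most one aggregator on each node
        "cb_nodes": str(n_nodes),     # total number of aggregators
        "striping_factor": str(stripe_count),  # Lustre stripe count for new files
    }


if __name__ == "__main__":
    # 240 and 960 ranks are the 0.25 degree cases from the abstract;
    # 16 ranks per node is a hypothetical node size.
    for n_ranks in (240, 960):
        print(n_ranks, collective_write_hints(n_ranks, 16))
```

In a real run, a dictionary like this would typically be copied into an MPI info object and passed to the parallel netCDF file-creation call, so that the MPI-IO layer, not the model, performs the per-node aggregation.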