From WikiROMS
Jump to: navigation, search

ROMS uses NetCDF for all its input and output data management. The NetCDF files can be processed using the standard library developed by UNIDATA, the Parallel-IO (PIO) library developed at NCAR, or the Software for Cashing Output and Reads for Parallel I/O (SCORPIO) library intended for the DOE's Energy Exascale Earth Model System (E3SM). The SCORPIO library was forked from the PIO library several years ago and evolved separately. The generic interface for parallel I/O in ROMS works for both the PIO or SCORPIO libraries and available by activating the PIO_LIB CPP option. However, we recommend using the PIO library because it is more efficient in processing I/O in our benchmark tests. Furthermore, another parallel I/O strategy has been available in ROMS for several years with the NetCDF4/HDF5 libraries by activating the PARALLEL_IO and HDF5 CPP options.

Serial I/O

Serial I/O is the standard option that has been in ROMS since the beginning. In this setup, all input and output data flows through the master parallel process. All data for output is collected from all processes by the master process and written to disk. Likewise, all data for input is read by the master process and distributed to the rest of the processes. When using serial I/O, files can be written in NetCDF classic/64-bit offset (NetCDF-3) or NetCDF-4/HDF5 (HDF5 CPP option) formats. File compression is available in the NetCDF-4/HDF5 library.

Parallel I/O with NetCDF-4

Parallel I/O using parallel HDF5 and NetCDF-4 has been available in ROMS for many years. This I/O option requires parallel enabled HDF5 and NetCDF-4 and is activated by defining PARALLEL_IO and HDF5 CPP options. Each parallel tile reads and writes the decomposed data. This approach does not scale well because it requires every process to participate in reading and writing, which quickly overloads the file system with requests as the number of tiles (NtileI x NtileJ) increases.

Parallel I/O with PIO or SCORPIO

PIO and SCORPIO take a different approach by allowing the user to specify which processes will handle I/O. For example, in an HPC cluster environment with Parallel I/O hardware, the user can instruct PIO or SCORPIO to designate which processes per node to perform I/O. This is a much more reasonable approach for larger applications running on hundreds of processors. To use this Parallel I/O strategy, the PIO or SCORPIO library must be linked to ROMS at compile time by defining the PIO_LIB CPP option. It is only available in distributed-memory applications since it uses MPI-IO.

The Parallel I/O configuration options are set in

  • Choose the input and output NetCDF library to use. For example, the user could choose to use the PIO library for writing but still use the standard library for reading.
    ! [1] Standard NetCDF-3 or NetCDF-4 library
    ! [2] Parallel-IO from PIO or SCORPIO library (MPI, MPI-IO applications)

    INP_LIB = 2
    OUT_LIB = 2
  • PIO and SCORPIO offer several methods for reading/writing NetCDF files. SCORPIO also offers ADIOS but that is not implemented in ROMS. Depending on the build of the PIO or SCORPIO libraries, not all the I/O types are available. If the NetCDF library does not support parallel I/O, methods 3 and 4 are not available. Currently, NetCDF4/HDF5 data compression is possible with method 3 during serial write.
    ! [0] parallel read and parallel write of PnetCDF (CDF-5 type files, not recommended because of post-processing)
    ! [1] parallel read and parallel write of NetCDF3 (64-bit offset)
    ! [2] serial read and serial write of NetCDF3 (64-bit offset)
    ! [3] parallel read and serial write of NetCDF4/HDF5
    ! [4] parallel read and parallel write of NETCDF4/HDF5

    PIO_METHOD = 2
  • Parallel-IO tasks control parameters. Typically, it is advantageous and highly recommended to define the I/O decomposition in smaller number of processes for efficiency and to avoid MPI communications bottlenecks.
    PIO_IOTASKS = 1  ! number of I/O processes to define
    PIO_STRIDE = 1  ! stride in the MPI-rank between I/O processes
    PIO_BASE = 0  ! offset for the first I/O process
    PIO_AGGREG = 1  ! number of MPI-aggregators to use
  • Parallel-IO (PIO or SCORPIO) rearranger methods for moving data between computational and I/O processes. It provides the ability to rearrange data between computational and parallel I/O decompositions. Usually the Box rearrangement is more efficient.
    ! [1] Box rearrangement
    ! [2] Subset rearrangement

    PIO_REARR = 1
  • Parallel-IO (PIO or SCORPIO) rearranger flag for MPI communication between computational and I/O processes. In some systems, the Point-to-Point communications is more efficient.
    ! [0] Point-to-Point communications
    ! [1] Collective communications

  • Parallel-IO (PIO or SCORPIO) rearranger flow-control direction flag for MPI communications between computational and I/O processes. Optimally, the MPI communications should be designed to send a modest number of messages evenly distributed across a number of processes. An excessive number of messages to a single MPI process can exhaust the buffer space which can affect efficiency or failure due to the slowdown in the retransmitting of dropped messages. It only sends messages (Isend) when the receiver is ready and has sufficient resources.
    ! [0] Enable computational to I/O processes, and vice versa
    ! [1] Enable computational to I/O processes only
    ! [2] Enable I/O to computational processes only
    ! [3] Disable flow control

  • Parallel-IO (PIO or SCORPIO) rearranger options for MPI communications from computational to I/O processes (C2I).
    PIO_C2I_HS = T  ! Enable C2I handshake (T/F)
    PIO_C2I_Send = F  ! Enable C2I Isends (T/F)
    PIO_C2I_Preq = 64  ! Maximum pending C2I requests
  • Parallel-IO (PIO or SCORPIO) rearranger options for MPI communications from I/O to computational processes (I2C).
    PIO_I2C_HS = F  ! Enable I2C handshake (T/F)
    PIO_I2C_Send = T  ! Enable I2C Isends (T/F)
    PIO_I2C_Preq = 65  ! Maximum pending I2C requests