Difference between revisions of "IO"

From WikiROMS

Revision as of 17:12, 29 April 2021

ROMS I/O

ROMS uses NetCDF for all its input and output data management. The NetCDF files can be processed using the standard library developed by UNIDATA, the Parallel-IO (PIO) library developed at NCAR, or the Software for Caching Output and Reads for Parallel I/O (SCORPIO) library intended for the DOE's Energy Exascale Earth System Model (E3SM). The SCORPIO library was forked from the PIO library several years ago and has evolved separately. The generic interface for parallel I/O in ROMS works with both the PIO and SCORPIO libraries and is available by activating the PIO_LIB CPP option. However, we recommend using the PIO library because it processed I/O more efficiently in our benchmark tests. In addition, parallel I/O has been available in ROMS for several years with the NetCDF4/HDF5 libraries by activating the PARALLEL_IO and HDF5 CPP options.

Serial I/O

Serial I/O is the standard option that has been in ROMS since the beginning. In this setup, all input and output data flows through the master MPI process. For output, the master process collects the data from all processes and writes it to disk. Likewise, for input, the master process reads the data and distributes it to the rest of the processes. When using serial I/O, files can be written in NetCDF classic/64-bit offset (NetCDF-3) or NetCDF-4/HDF5 (HDF5 CPP option) formats.
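
The gather-and-scatter pattern described above can be sketched without MPI. The following is a pure-Python stand-in, not ROMS code: the function names, the dictionary of tiles keyed by (i, j), and the assumption that the grid divides evenly among tiles are all hypothetical and chosen only for illustration.

```python
def scatter_from_master(field, ntile_i, ntile_j):
    """Split a global 2D field (list of rows) into ntile_i x ntile_j tiles.

    Stand-in for the master process distributing input data.
    Assumes (hypothetically) that the grid divides evenly among tiles.
    """
    nrows = len(field) // ntile_j      # rows per tile
    ncols = len(field[0]) // ntile_i   # columns per tile
    return {
        (i, j): [row[i * ncols:(i + 1) * ncols]
                 for row in field[j * nrows:(j + 1) * nrows]]
        for j in range(ntile_j)
        for i in range(ntile_i)
    }


def gather_to_master(tiles, ntile_i, ntile_j):
    """Reassemble the global field from all tiles.

    Stand-in for the master process collecting output data before writing.
    """
    global_rows = []
    for j in range(ntile_j):
        for r in range(len(tiles[(0, j)])):
            row = []
            for i in range(ntile_i):
                row.extend(tiles[(i, j)][r])
            global_rows.append(row)
    return global_rows


# A 4x4 global field split into 2x2 tiles and reassembled.
field = [[4 * r + c for c in range(4)] for r in range(4)]
tiles = scatter_from_master(field, 2, 2)
assert gather_to_master(tiles, 2, 2) == field
```

In this picture every byte passes through a single process, which is why serial I/O is simple but can become a bottleneck as the grid grows.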

Parallel I/O with NetCDF-4

Parallel I/O using parallel HDF5 and NetCDF-4 has been available in ROMS for many years. This I/O option requires parallel-enabled HDF5 and NetCDF-4 libraries and is activated by defining the PARALLEL_IO and HDF5 CPP options. This approach does not scale well because it requires every process to participate in reading and writing, which quickly overloads the file system with requests as the number of tiles (NtileI x NtileJ) increases.
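
To make the scaling concern concrete, the sketch below counts how many processes access the file system under serial I/O and under NetCDF-4 parallel I/O. It assumes one MPI process per tile, so the process count equals NtileI x NtileJ; the function name and the example tile counts are hypothetical.

```python
def io_participants(ntile_i, ntile_j):
    """Count the processes that touch the file system under each scheme.

    Assumes one MPI process per tile (nprocs = NtileI * NtileJ).
    """
    nprocs = ntile_i * ntile_j
    return {
        "serial": 1,                 # only the master process reads/writes
        "parallel_netcdf4": nprocs,  # every process participates
    }


# Hypothetical tile partitions, from small to large.
for ntile_i, ntile_j in [(2, 2), (8, 8), (32, 32)]:
    print(ntile_i, ntile_j, io_participants(ntile_i, ntile_j))
```

The number of simultaneous requests grows with the product of the tile counts, which is the behavior the paragraph above warns about.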

Parallel I/O with PIO/SCORPIO

PIO and SCORPIO take a different approach by allowing the user to specify which processes will handle I/O. For example, in an HPC cluster environment, the user can instruct PIO to designate one process per node to perform I/O. This approach is better suited to larger applications running on many processors. To use PIO, the library must be linked to ROMS at compile time by defining the PIO_LIB CPP option. PIO configuration options are set in roms.in and repeated below.

  • Choose the input and output NetCDF library to use. For example, the user could choose to use the PIO library for reading but still use the standard library for writing.
    ! [1] Standard NetCDF-3 or NetCDF-4 library
    ! [2] Parallel-IO from PIO or SCORPIO library (MPI, MPI-IO applications)

    INP_LIB = 2
    OUT_LIB = 2
  • PIO and SCORPIO offer several methods for reading/writing NetCDF files. SCORPIO also offers ADIOS but that is not implemented in ROMS.
    ! [0] parallel read and write of PnetCDF (CDF-5, not recommended)
    ! [1] parallel read and write of NetCDF3 (64-bit offset)
    ! [2] serial read and write of NetCDF3 (64-bit offset)
    ! [3] parallel read and serial write of NetCDF4/HDF5
    ! [4] parallel read and write of NetCDF4/HDF5

    PIO_METHOD = 2
  • Parallel-IO task control parameters.
    PIO_IOTASKS = 1  ! number of I/O tasks to define
    PIO_STRIDE = 1  ! stride in the MPI-rank between I/O tasks
    PIO_BASE = 0  ! offset for the first I/O task
    PIO_AGGREG = 1  ! number of MPI-aggregators to use
  • Parallel-IO (PIO/SCORPIO) rearranger methods for moving data between computational and I/O processes. Box rearrangement is recommended.
    ! [1] Box rearrangement
    ! [2] Subset rearrangement

    PIO_REARR = 1
  • Parallel-IO (PIO/SCORPIO) rearranger communication between computational and I/O processes. Point-to-point is recommended.
    ! [0] Point-to-Point communications
    ! [1] Collective communications

    PIO_REARRCOM = 0
  • Parallel-IO (PIO/SCORPIO) rearranger flow control direction for communications between computational and I/O processes.
    ! [0] Enable computational to I/O processes, and vice versa
    ! [1] Enable computational to I/O processes only
    ! [2] Enable I/O to computational processes only
    ! [3] Disable flow control

    PIO_REARRDIR = 0
  • Parallel-IO (PIO/SCORPIO) rearranger options for communications from computational to I/O processes (C2I).
    PIO_C2I_HS = T  ! Enable C2I handshake (T/F)
    PIO_C2I_Send = F  ! Enable C2I Isends (T/F)
    PIO_C2I_Preq = 64  ! Maximum pending C2I requests
  • Parallel-IO (PIO/SCORPIO) rearranger options for communications from I/O to computational processes (I2C).
    PIO_I2C_HS = F  ! Enable I2C handshake (T/F)
    PIO_I2C_Send = T  ! Enable I2C Isends (T/F)
    PIO_I2C_Preq = 64  ! Maximum pending I2C requests
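
As an illustration of how the task control parameters interact, the sketch below lists which MPI ranks would act as I/O tasks for given values of PIO_IOTASKS, PIO_STRIDE, and PIO_BASE. It assumes the placement rule rank = base + k * stride, which matches the parameter descriptions above but is not taken from the library source, so the actual PIO or SCORPIO behavior may differ (for example, in how out-of-range ranks are handled). The function name and the 64-process, 16-processes-per-node layout are hypothetical.

```python
def io_task_ranks(nprocs, iotasks, stride, base):
    """Return the MPI ranks assumed to serve as I/O tasks.

    Assumed rule (not confirmed against the PIO source):
    rank_k = base + k * stride, for k = 0 .. iotasks - 1.
    """
    if nprocs < 1 or iotasks < 1 or stride < 1 or base < 0:
        raise ValueError("nprocs, iotasks, stride must be >= 1 and base >= 0")
    ranks = [base + k * stride for k in range(iotasks)]
    if ranks[-1] >= nprocs:
        raise ValueError("I/O task ranks exceed the number of MPI processes")
    return ranks


# Hypothetical cluster: 64 processes, 16 per node, one I/O task per node.
print(io_task_ranks(nprocs=64, iotasks=4, stride=16, base=0))

# Values listed above: PIO_IOTASKS = 1, PIO_STRIDE = 1, PIO_BASE = 0.
print(io_task_ranks(nprocs=64, iotasks=1, stride=1, base=0))
```

Under this assumed rule, the values listed above select rank 0 as the only I/O task, while setting the stride to the number of processes per node gives the one-I/O-process-per-node layout described earlier.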