In ROMS, we have the option to impose a volume conservation in applications with open boundaries if tidal forcing is not anabled:
Code: Select all
! Set lateral open boundary edge volume conservation switch for
! nonlinear model and adjoint-based algorithms. Usually activated
! with radiation boundary conditions to enforce global mass
! conservation, except if tidal forcing is enabled. [1:Ngrids].
   VolCons(west)  ==  T                            ! western  boundary
   VolCons(east)  ==  T                            ! eastern  boundary
   VolCons(south) ==  T                            ! southern boundary
   VolCons(north) ==  F                            ! northern boundary
ad_VolCons(west)  ==  T                            ! western  boundary
ad_VolCons(east)  ==  T                            ! eastern  boundary
ad_VolCons(south) ==  T                            ! southern boundary
ad_VolCons(north) ==  F                            ! northern boundaryThe computation of bc_flux and bc_area are simple but they are subject to round-off in parallel and serial applications with tile partitions. Both values are usually large numbers, specially in big domains and deep bathymetry. This makes the round-off problem worse.
In parallel applications, we need to have a special code to carry out the global reduction operation (summation):
Code: Select all
      IF (ANY(VolCons(:,ng))) THEN
#ifdef DISTRIBUTE
        NSUB=1                           ! distributed-memory
#else
        IF (DOMAIN(ng)%SouthWest_Corner(tile).and.                      &
     &      DOMAIN(ng)%NorthEast_Corner(tile)) THEN
          NSUB=1                         ! non-tiled application
        ELSE
          NSUB=NtileX(ng)*NtileE(ng)     ! tiled application
        END IF
#endif
!$OMP CRITICAL (OBC_VOLUME)
        IF (tile_count.eq.0) THEN
          bc_flux=0.0_r8
          bc_area=0.0_r8
        END IF
        bc_area=bc_area+my_area
        bc_flux=bc_flux+my_flux
        tile_count=tile_count+1
        IF (tile_count.eq.NSUB) THEN
          tile_count=0
#ifdef DISTRIBUTE
          buffer(1)=bc_area
          buffer(2)=bc_flux
          op_handle(1)='SUM'
          op_handle(2)='SUM'
          CALL mp_reduce (ng, iNLM, 2, buffer, op_handle)
          bc_area=buffer(1)
          bc_flux=buffer(2)
#endif
          ubar_xs=bc_flux/bc_area
        END IF
!$OMP END CRITICAL (OBC_VOLUME)
      END IFNotice that in any computer, (A + B + C + D) will not give the same results as (A + D + C + B). Therefore, when we do floating-point parallel reduce operations the order of the reduction between chunks of numbers matters. This order is not deterministic in routine like mpi_allreduce, which is called by the ROMS routine mp_reduce. As matter of fact, there are three different ways in mp_reduce to compute this reduction operation using either mpi_allreduce, mpi_allgather, and mpi_isend/mpi_irecv. All three methods are subject to round-off problems.
What To Do:
- Currently, there is not much that we can do. Sasha suggested to carry the summation-by-pairs to reduce the round-off error. This summation adds only comparable numbers from the chunk of parallel data.
 - If you are using frequent data from larger scale models, you can suppress volume conservation to see what happens. Specially, you need frequent free-surface (zeta) data. You may get into trouble if running for long (years) periods.
 - Impose tidal forcing at the open boundary since we need to avoid volume conservation.
 -  Notice that we not longer can compare NetCDF files byte by byte to check for parallel bugs when volume conservation is activated:
To compute binary differences, you need to activate CPP options: DEBUGGING, OUT_DOUBLE, and POSITIVE_ZERO.
Code: Select all
% diff r1/ocean_avg.nc r2/ocean_avg.nc Binary files r1/ocean_avg.nc and r2/ocean_avg.nc differ % diff r1/ocean_his.nc r2/ocean_his.nc Binary files r1/ocean_his.nc and r2/ocean_his.nc differ % diff r1/ocean_rst.nc r2/ocean_rst.nc Binary files r1/ocean_rst.nc and r2/ocean_rst.nc differ