Opened 13 years ago

Closed 13 years ago

#486 closed upgrade (Done)

Added new Courant number diagnostics to standard output

Reported by: arango Owned by: arango
Priority: major Milestone: Release ROMS/TOMS 3.4
Component: Nonlinear Version: 3.4
Keywords: Cc:

Description (last modified by arango)

The standard output information has been expanded to include the maximum Courant number values and its location. The Courant number in 3D applications is defined at RHO-points as:

    C = Cu + Cv + Cw

where

    Cu = ABS(u(i,j,k) + u(i+1,j,k)) * dt * pm(i,j)

    Cv = ABS(v(i,j,k) + v(i,j+1,k)) * dt * pn(i,j)

    Cw = ABS(wvel(i,j,k-1) + wvel(i,j,k)) * dt * Hz(i,j,k)

The (i,j,k) location where the maximum of C occurs is also reported. Finally, the maximum value of the horizontal speed, SQRT(u*u + v*v), is also reported. This information may be useful to monitor the model CFL condition. It also indicates the places in the grid where the CFL is violated during model blow-up.

For example, we get in standard output:

   STEP   Day HH:MM:SS  KINETIC_ENRG   POTEN_ENRG    TOTAL_ENRG    NET_VOLUME
          C => (i,j,k)       Cu            Cv            Cw         Max Speed

    960    40 00:00:00  8.657897E-04  1.948261E+04  1.948261E+04  2.182663E+17
          (096,125,01)  9.919103E-03  4.299619E-03  1.211318E-02  1.049095E+00

Notice that this information is written after the global energy and volume diagnostics. The total value of C is not written in 3D applications, but it can be obtained by summing the values of Cu, Cv, and Cw. This information also gives you the clue of which component of C is getting is the largest. In some applications with rotated tensors the value of Cw may get large and the time-step needs to be reduced to avoid the model to blow-up.

The total value of C is reported in 2D application, since we have an information column available (no Cw value is possible):

   STEP   Day HH:MM:SS  KINETIC_ENRG   POTEN_ENRG    TOTAL_ENRG    NET_VOLUME
          C => (i,j,k)       Cu            Cv           C Max       Max Speed

It turns out that computing this simple global diagnostic is very tricky in parallel applications. It required some thinking to do this in shared-memory and serial with partitions. The issue here is that we are computing something equivalent to the intrinsic F90 function MAXLOC. As such, we need to get the (i,j,k) indices of the first occurrence of the maximum value of C and its qualifiers Cu, Cv, Cw, and (i,j,k). It took some serious debugging to get identical values for this new information in serial, serial with partitions, shared-memory, and distributed-memory. Many thanks to Sasha Shchepetkin for his suggestions to compute this.

The final tick to achieve this in shared-memory was to have the following code inside the critical global reduction block in diag.F:

!$OMP CRITICAL (NL_DIAGNOSTICS)
        IF (tile_count.eq.0) THEN
          ...
          max_C =my_max_C
          max_Cu=my_max_Cu
          max_Cv=my_max_Cv
          max_Cw=my_max_Cw
          max_Ci=my_max_Ci
          max_Cj=my_max_Cj
          max_Ck=my_max_Ck
        ELSE
          ...
          IF (my_max_C.eq.max_C) THEN       ! very important
            max_Ci=MIN(max_Ci,my_max_Ci)
            max_Cj=MIN(max_Cj,my_max_Cj)
            max_Ck=MIN(max_Ck,my_max_Ck)
          ELSE IF (my_max_C.gt.max_C) THEN
            max_C =my_max_C
            max_Cu=my_max_Cu
            max_Cv=my_max_Cv
            max_Cw=my_max_Cw
            max_Ci=my_max_Ci
            max_Cj=my_max_Cj
            max_Ck=my_max_Ck
          END IF
        END IF
        ...
!$OMP END CRITICAL (NL_DIAGNOSTICS)

Notice that we need to take the minimum value of the indices when the maximum value between the threads is the same. That is, the maximum value is not unique and occurs in several places. We just need to be sure to take the first occurrence. This required a lot of debugging. Well, now we know how to do this in the future.

It is possible to get zero values for Cu or Cv in land/sea masking applications. This information is also very useful since sometimes the r-factor values next to the land mask maybe high due to the coordinate stretching and/or the bathymetry cutoff value.

A new routine mp_reduce2 was added to distribute.F to compute this type of reduction using either MPI_MINLOC and/or MPI_MAXLOC. This is computed efficiently by reducing and grouping the calls to MPI_ALLREDUCE.

Finally, the CFL (Courant-Friedrichs-Lewy) condition is a limiting constraint on the size of the time-step for a particular application. Recall that in hyperbolic equations C must be less or equal to 1 to decrease oscillations, decrease numerical dispersion, and improve accuracy.

Many thanks to Lyon Lanerolle for suggesting this useful diagnostics.

I also corrected a defect in routine netcdf_inq_var of file mod_netcdf.F when reading vector attributes (integer or floating-point) for a particular variable. Only scalars attribute values are processed (my_Alen=1). This is OK since such vector attributes are not used internally in ROMS. Now, we have:

                    IF ((my_Alen.eq.1).and.                             &
     &                  (my_Atype.eq.NF90_INT)) THEN
                      ...

                    ELSE IF ((my_Alen.eq.1).and.                        &
     &                       ((my_Atype.eq.NF90_FLOAT ).or.             &
     &                        (my_Atype.eq.NF90_DOUBLE))) THEN
                      ...

Some compilers bypassed this problem, but the IBM compiler gives a segmentation violation. Many thanks to Aijun Zhang for reporting this problem.

Change History (1)

comment:1 by arango, 13 years ago

Description: modified (diff)
Resolution: Done
Status: newclosed
Note: See TracTickets for help on using tickets.