﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
486	Added new Courant number diagnostics to standard output	arango	arango	"The standard output information has been expanded to include the maximum Courant number values and its location. The Courant number in 3D applications is defined at RHO-points as:

{{{
    C = Cu + Cv + Cw

where

    Cu = ABS(u(i,j,k) + u(i+1,j,k)) * dt * pm(i,j)

    Cv = ABS(v(i,j,k) + v(i,j+1,k)) * dt * pn(i,j)

    Cw = ABS(wvel(i,j,k-1) + wvel(i,j,k)) * dt * Hz(i,j,k)
}}}

The '''(i,j,k)''' location where the maximum of '''C''' occurs is also reported.  Finally, the maximum value of the horizontal speed, '''SQRT(u*u + v*v)''', is also reported.  This information may be useful to monitor the model CFL condition. It also indicates the places in the grid where the CFL is violated during model blow-up.

For example, we get in standard output:

{{{

   STEP   Day HH:MM:SS  KINETIC_ENRG   POTEN_ENRG    TOTAL_ENRG    NET_VOLUME
          C => (i,j,k)       Cu            Cv            Cw         Max Speed

    960    40 00:00:00  8.657897E-04  1.948261E+04  1.948261E+04  2.182663E+17
          (096,125,01)  9.919103E-03  4.299619E-03  1.211318E-02  1.049095E+00
}}}
Notice that this information is written after the global energy and volume diagnostics. The total value of '''C''' is not written in 3D applications, but it can be obtained by summing the values of '''Cu''', '''Cv''', and '''Cw'''.  This information also gives you the clue of which component of '''C''' is getting is the largest.  In some applications with rotated tensors the value of '''Cw''' may get large and the time-step needs to be reduced to avoid the model to blow-up.

The total value of '''C''' is reported in 2D application, since we have an information column available (no '''Cw''' value is possible):

{{{
   STEP   Day HH:MM:SS  KINETIC_ENRG   POTEN_ENRG    TOTAL_ENRG    NET_VOLUME
          C => (i,j,k)       Cu            Cv           C Max       Max Speed
}}}

It turns out that computing this simple global diagnostic is very tricky in parallel applications. It required some thinking to do this in shared-memory and serial with partitions. The issue here is that we are computing something equivalent to the intrinsic F90 function '''MAXLOC'''.  As such, we need to get the '''(i,j,k)''' indices of the first occurrence of the maximum value of '''C''' and its qualifiers '''Cu''', '''Cv''', '''Cw''', and '''(i,j,k)'''. It took some serious debugging to get identical values for this new information in serial, serial with partitions, shared-memory, and distributed-memory. Many thanks to Sasha Shchepetkin for his suggestions to compute this.

The final tick to achieve this in shared-memory was to have the following code inside the critical global reduction block:

{{{
!$OMP CRITICAL (NL_DIAGNOSTICS)
        IF (tile_count.eq.0) THEN
          ...
          max_C =my_max_C
          max_Cu=my_max_Cu
          max_Cv=my_max_Cv
          max_Cw=my_max_Cw
          max_Ci=my_max_Ci
          max_Cj=my_max_Cj
          max_Ck=my_max_Ck
        ELSE
          ...
          IF (my_max_C.eq.max_C) THEN       ! very important
            max_Ci=MIN(max_Ci,my_max_Ci)
            max_Cj=MIN(max_Cj,my_max_Cj)
            max_Ck=MIN(max_Ck,my_max_Ck)
          ELSE IF (my_max_C.gt.max_C) THEN
            max_C =my_max_C
            max_Cu=my_max_Cu
            max_Cv=my_max_Cv
            max_Cw=my_max_Cw
            max_Ci=my_max_Ci
            max_Cj=my_max_Cj
            max_Ck=my_max_Ck
          END IF
        END IF
        ...
!$OMP END CRITICAL (NL_DIAGNOSTICS)
}}}

Notice that we need to take the '''minimum value''' of the indices when the maximum value between the threads is the same.  That is, the maximum value is not unique and occurs in several places.  We just need to be sure to take the first occurrence.  This required a lot of debugging. Well, now we know how to do this in the future.

It is possible to get zero values for '''Cu''' or '''Cv''' in land/sea masking applications.  This information is also very useful since sometimes the r-factor values next to the land mask maybe high due to the coordinate stretching and/or the bathymetry cutoff value.

A new routine '''mp_reduce2''' was added to '''distribute.F''' to compute this type of reduction using either '''MPI_MINLOC''' and/or '''MPI_MAXLOC'''. This is computed efficiently by reducing and grouping the calls to '''MPI_ALLREDUCE'''.

Finally, the CFL (Courant-Friedrichs-Lewy) condition is a limiting constraint on the size of the time-step for a particular application.  Recall that in hyperbolic equations '''C must be less or equal to 1''' to decrease oscillations, decrease numerical dispersion, and improve accuracy.

Many thanks to Lyon Lanerolle for suggesting this useful diagnostics.

I also corrected a defect in routine '''netcdf_inq_var''' of file '''mod_netcdf.F''' when reading vector attributes (integer or floating-point) for a particular variable.  Only scalars attribute values are processed ('''my_Alen=1''').  This is OK since such vector attributes are not used internally in ROMS. Now, we have:


{{{
                    IF ((my_Alen.eq.1).and.                             &
     &                  (my_Atype.eq.NF90_INT)) THEN
                      ...

                    ELSE IF ((my_Alen.eq.1).and.                        &
     &                       ((my_Atype.eq.NF90_FLOAT ).or.             &
     &                        (my_Atype.eq.NF90_DOUBLE))) THEN
                      ...
}}}

Some compilers bypassed this problem, but the IBM compiler gives a segmentation violation. Many thanks to Aijun Zhang for reporting this problem."	upgrade	new	major	Release ROMS/TOMS 3.4	Nonlinear	3.4			
