Opened 7 years ago
Closed 7 years ago
#769 closed upgrade (Done)
VERY IMPORTANT Update: Everyone needs to read the provided information
Reported by: | arango | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | Release ROMS/TOMS 3.7 |
Component: | Nonlinear | Version: | 3.7 |
Keywords: | | Cc: | |
Description
This update contains critical information, and I recommend that everybody consider updating their version:
- While coding and debugging the tangent linear and adjoint of the nesting algorithms, we discovered a bug in the two-way exchange of the nonlinear model code in telescoping nested applications. Recall that in ROMS, a telescoping grid is a refined grid containing another refined grid inside. For example:
In the above diagram, grid 2 is the only telescoping grid in this configuration. We need to have a two-way transfer of information from 4 to 2 and 2 to 1 for grid 2 to be considered a telescoping type. The coarser grid 1 is not considered a telescoping grid by definition in ROMS.
In a three-grid refinement application for the US east coast, we noticed unnecessary two-way exchanges between telescoping grid 2 and coarser grid 1. See the lines marked with >>>> in the standard output below:
    NL ROMS/TOMS: started time-stepping: (Grid: 01 TimeSteps: 000000000001 - 000000007200)
    NL ROMS/TOMS: started time-stepping: (Grid: 02 TimeSteps: 000000000001 - 000000021600)
    NL ROMS/TOMS: started time-stepping: (Grid: 03 TimeSteps: 000000000001 - 000000043200)

    TIME-STEP YYYY-MM-DD hh:mm:ss.ss KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME Grid C => (i,j,k) Cu Cv Cw Max Speed

    0 2014-01-01 00:00:00.00 2.479705E-02 1.895958E+04 1.895961E+04 2.057379E+15 01 (128,010,40) 6.586887E-02 8.830755E-02 0.000000E+00 2.146392E+00
    0 2014-01-01 00:00:00.00 9.826892E-03 1.355795E+04 1.355796E+04 2.279106E+14 02 (067,001,40) 6.153040E-02 3.379189E-02 0.000000E+00 1.549213E+00
    0 2014-01-01 00:00:00.00 4.571369E-03 1.116663E+04 1.116663E+04 4.813844E+13 03 (170,008,29) 2.109439E-02 2.133386E-02 0.000000E+00 3.891363E-01
    1 2014-01-01 00:01:00.00 4.569719E-03 1.116663E+04 1.116663E+04 4.813848E+13 03 (224,073,01) 1.516512E-02 1.260101E-02 8.206179E-02 3.891982E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    1 2014-01-01 00:02:00.00 9.831911E-03 1.355797E+04 1.355798E+04 2.279109E+14 02 (135,136,01) 5.762565E-03 1.577325E-04 2.528599E-01 1.547482E+00
    2 2014-01-01 00:02:00.00 4.574353E-03 1.116663E+04 1.116664E+04 4.813852E+13 03 (222,074,37) 2.382373E-02 1.525561E-02 1.121914E-01 3.892563E-01
    3 2014-01-01 00:03:00.00 4.576449E-03 1.116665E+04 1.116665E+04 4.813859E+13 03 (233,065,40) 1.694378E-02 6.053802E-03 1.822713E-01 3.886253E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    >>>> FINE2COARSE - exchanging data between grids: dg = 02 and rg = 01 at cr = 02
    2 2014-01-01 00:04:00.00 9.848035E-03 1.355799E+04 1.355800E+04 2.279112E+14 02 (134,137,40) 2.929985E-03 7.659153E-04 6.330313E-01 1.548821E+00
    >>>> FINE2COARSE - exchanging data between grids: dg = 02 and rg = 01 at cr = 02
    4 2014-01-01 00:04:00.00 4.576183E-03 1.116666E+04 1.116667E+04 4.813865E+13 03 (237,066,40) 1.547985E-02 5.251661E-03 2.077931E-01 3.886372E-01
    >>>> FINE2COARSE - exchanging data between grids: dg = 02 and rg = 01 at cr = 02
    5 2014-01-01 00:05:00.00 4.578296E-03 1.116668E+04 1.116669E+04 4.813873E+13 03 (240,066,40) 1.518760E-02 4.659222E-03 1.958341E-01 3.885507E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    1 2014-01-01 00:06:00.00 2.472506E-02 1.895960E+04 1.895962E+04 2.057381E+15 01 (031,036,01) 2.730114E-02 1.131899E-02 2.988898E-01 2.164003E+00
    3 2014-01-01 00:06:00.00 9.864558E-03 1.355800E+04 1.355801E+04 2.279115E+14 02 (136,140,40) 2.016993E-03 3.375310E-03 5.526394E-01 1.550218E+00
    6 2014-01-01 00:06:00.00 4.583192E-03 1.116670E+04 1.116670E+04 4.813881E+13 03 (242,068,40) 1.469385E-02 4.240571E-03 1.770060E-01 3.886908E-01
    7 2014-01-01 00:07:00.00 4.589692E-03 1.116671E+04 1.116672E+04 4.813888E+13 03 (214,082,40) 1.952714E-02 3.661358E-03 1.516865E-01 3.892088E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    4 2014-01-01 00:08:00.00 9.875476E-03 1.355809E+04 1.355810E+04 2.279128E+14 02 (136,140,40) 2.392311E-03 2.585103E-03 5.107434E-01 1.552007E+00
    8 2014-01-01 00:08:00.00 4.596425E-03 1.116673E+04 1.116673E+04 4.813895E+13 03 (216,085,40) 1.761177E-02 3.653321E-03 1.553607E-01 3.887313E-01
    9 2014-01-01 00:09:00.00 4.603149E-03 1.116675E+04 1.116676E+04 4.813905E+13 03 (250,064,40) 1.535948E-02 1.897569E-03 1.583301E-01 3.885450E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    >>>> FINE2COARSE - exchanging data between grids: dg = 02 and rg = 01 at cr = 02
    5 2014-01-01 00:10:00.00 9.884353E-03 1.355818E+04 1.355819E+04 2.279141E+14 02 (136,141,40) 3.595284E-03 2.194098E-03 4.981900E-01 1.553226E+00
    >>>> FINE2COARSE - exchanging data between grids: dg = 02 and rg = 01 at cr = 02
    10 2014-01-01 00:10:00.00 4.605400E-03 1.116677E+04 1.116678E+04 4.813915E+13 03 (253,064,40) 1.487150E-02 1.964936E-03 1.542588E-01 3.883007E-01
    >>>> FINE2COARSE - exchanging data between grids: dg = 02 and rg = 01 at cr = 02
    11 2014-01-01 00:11:00.00 4.605182E-03 1.116681E+04 1.116681E+04 4.813930E+13 03 (255,065,40) 1.409242E-02 1.498900E-03 1.489003E-01 3.873421E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    2 2014-01-01 00:12:00.00 2.470726E-02 1.895960E+04 1.895962E+04 2.057385E+15 01 (026,035,40) 3.741860E-02 1.691325E-03 5.639773E-01 2.167463E+00

Therefore, for each timestep of grid 1 there are three two-way exchanges from grid 2 to grid 1 in the contact region cr = 2. These redundant exchanges are not a problem in the nonlinear model because grid 1 is waiting for grids 2 and 3 to reach its current time, but they are fatal in the adjoint model. The two-way exchanges are also expensive in distributed-memory parallel configurations because they involve MPI communications that slow the solution and hurt performance. However, the actual bug is that the two-way exchange between grids 2 and 1 is missing the last piece from grid 3. It needs to happen after the exchange between grids 3 and 2, so that grid 1 can advance another timestep (see ++++). This bug does not seem to change the solution much, but it is still wrong. We need to have instead:
    NL ROMS/TOMS: started time-stepping: (Grid: 01 TimeSteps: 000000000001 - 000000007200)
    NL ROMS/TOMS: started time-stepping: (Grid: 02 TimeSteps: 000000000001 - 000000021600)
    NL ROMS/TOMS: started time-stepping: (Grid: 03 TimeSteps: 000000000001 - 000000043200)

    TIME-STEP YYYY-MM-DD hh:mm:ss.ss KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME Grid C => (i,j,k) Cu Cv Cw Max Speed

    0 2014-01-01 00:00:00.00 2.479705E-02 1.895958E+04 1.895961E+04 2.057379E+15 01 (128,010,40) 6.586887E-02 8.830755E-02 0.000000E+00 2.146392E+00
    0 2014-01-01 00:00:00.00 9.826892E-03 1.355795E+04 1.355796E+04 2.279106E+14 02 (067,001,40) 6.153040E-02 3.379189E-02 0.000000E+00 1.549213E+00
    0 2014-01-01 00:00:00.00 4.571369E-03 1.116663E+04 1.116663E+04 4.813844E+13 03 (170,008,29) 2.109439E-02 2.133386E-02 0.000000E+00 3.891363E-01
    1 2014-01-01 00:01:00.00 4.569719E-03 1.116663E+04 1.116663E+04 4.813848E+13 03 (224,073,01) 1.516512E-02 1.260101E-02 8.206179E-02 3.891982E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    1 2014-01-01 00:02:00.00 9.831911E-03 1.355797E+04 1.355798E+04 2.279109E+14 02 (135,136,01) 5.762565E-03 1.577325E-04 2.528599E-01 1.547482E+00
    2 2014-01-01 00:02:00.00 4.574353E-03 1.116663E+04 1.116664E+04 4.813852E+13 03 (222,074,37) 2.382373E-02 1.525561E-02 1.121914E-01 3.892563E-01
    3 2014-01-01 00:03:00.00 4.576449E-03 1.116665E+04 1.116665E+04 4.813859E+13 03 (233,065,40) 1.694378E-02 6.053802E-03 1.822713E-01 3.886253E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    2 2014-01-01 00:04:00.00 9.848035E-03 1.355799E+04 1.355800E+04 2.279112E+14 02 (134,137,40) 2.929985E-03 7.659153E-04 6.330313E-01 1.548821E+00
    4 2014-01-01 00:04:00.00 4.576183E-03 1.116666E+04 1.116667E+04 4.813865E+13 03 (237,066,40) 1.547985E-02 5.251661E-03 2.077931E-01 3.886372E-01
    5 2014-01-01 00:05:00.00 4.578296E-03 1.116668E+04 1.116669E+04 4.813873E+13 03 (240,066,40) 1.518760E-02 4.659222E-03 1.958341E-01 3.885507E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    ++++ FINE2COARSE - exchanging data between grids: dg = 02 and rg = 01 at cr = 02
    1 2014-01-01 00:06:00.00 2.472503E-02 1.895960E+04 1.895962E+04 2.057381E+15 01 (031,036,01) 2.730114E-02 1.131899E-02 2.988898E-01 2.164003E+00
    3 2014-01-01 00:06:00.00 9.864558E-03 1.355800E+04 1.355801E+04 2.279115E+14 02 (136,140,40) 2.016993E-03 3.375310E-03 5.526394E-01 1.550218E+00
    6 2014-01-01 00:06:00.00 4.583192E-03 1.116670E+04 1.116670E+04 4.813881E+13 03 (242,068,40) 1.469385E-02 4.240571E-03 1.770060E-01 3.886908E-01
    7 2014-01-01 00:07:00.00 4.589692E-03 1.116671E+04 1.116672E+04 4.813888E+13 03 (214,082,40) 1.952714E-02 3.661358E-03 1.516865E-01 3.892088E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    4 2014-01-01 00:08:00.00 9.875476E-03 1.355809E+04 1.355810E+04 2.279128E+14 02 (136,140,40) 2.392311E-03 2.585103E-03 5.107434E-01 1.552007E+00
    8 2014-01-01 00:08:00.00 4.596425E-03 1.116673E+04 1.116673E+04 4.813895E+13 03 (216,085,40) 1.761177E-02 3.653321E-03 1.553607E-01 3.887313E-01
    9 2014-01-01 00:09:00.00 4.603149E-03 1.116675E+04 1.116676E+04 4.813905E+13 03 (250,064,40) 1.535948E-02 1.897569E-03 1.583301E-01 3.885450E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    5 2014-01-01 00:10:00.00 9.884353E-03 1.355818E+04 1.355819E+04 2.279141E+14 02 (136,141,40) 3.595284E-03 2.194098E-03 4.981900E-01 1.553226E+00
    10 2014-01-01 00:10:00.00 4.605400E-03 1.116677E+04 1.116678E+04 4.813915E+13 03 (253,064,40) 1.487150E-02 1.964936E-03 1.542588E-01 3.883007E-01
    11 2014-01-01 00:11:00.00 4.605182E-03 1.116681E+04 1.116681E+04 4.813930E+13 03 (255,065,40) 1.409242E-02 1.498900E-03 1.489003E-01 3.873421E-01
    FINE2COARSE - exchanging data between grids: dg = 03 and rg = 02 at cr = 04
    ++++ FINE2COARSE - exchanging data between grids: dg = 02 and rg = 01 at cr = 02
    2 2014-01-01 00:12:00.00 2.470724E-02 1.895960E+04 1.895962E+04 2.057385E+15 01
Once this bug is corrected, telescoping applications run faster. The gain in efficiency depends on the application and the number of processors used. We have observed improvements of 6 to 10 percent.
Therefore, this is a critical update for users running nested applications.
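The corrected ordering of the exchanges can be sketched as a recursive sub-cycling loop. The Python below is only an illustration and is not ROMS code: the function name and the event tuples are hypothetical, and the time-refinement ratios 3 and 2 are inferred from the step counts in the standard output above (7200, 21600, and 43200 steps for grids 1, 2, and 3).

```python
def telescoping_schedule(time_ratios, coarse_steps=1):
    """Return the ordered events for a chain of nested grids.

    time_ratios[i] is the assumed number of time steps grid i+2 takes
    per time step of grid i+1 (for example [3, 2] for three grids).
    """
    events = []
    ngrids = len(time_ratios) + 1

    def advance(grid):
        # The coarser grid steps first, then the finer grid sub-cycles
        # until it reaches the coarser grid's new time.
        events.append(("step", grid))
        if grid < ngrids:
            for _ in range(time_ratios[grid - 1]):
                advance(grid + 1)
            # A single fine-to-coarse exchange, issued only after the
            # finer grid (and everything nested inside it) has caught up.
            events.append(("fine2coarse", grid + 1, grid))

    for _ in range(coarse_steps):
        advance(1)
    return events


if __name__ == "__main__":
    for event in telescoping_schedule([3, 2]):
        print(event)
```

With ratios [3, 2], one timestep of grid 1 produces three exchanges from grid 3 to grid 2 and a single exchange from grid 2 to grid 1, which comes last, in line with the ++++ lines in the corrected output.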
- We also found additional performance improvements (around 8 percent) in ROMS executables compiled with ifort when the option -heap-arrays is removed. However, we need to set the stack size to a large value on some computers:
FFLAGS += -Wl,-stack_size,0x64000000
or set the stacksize limit to unlimited in the login script. For example, I have the following command in my .tcshrc:
    limit stacksize unlimited
If I type the limit UNIX command on a Linux cluster, I get:
    % limit
    cputime      unlimited
    filesize     unlimited
    datasize     unlimited
    stacksize    unlimited
    coredumpsize 0 kbytes
    memoryuse    unlimited
    vmemoryuse   unlimited
    descriptors  1024
    memorylocked unlimited
    maxproc      1024
ROMS has lots of automatic arrays, so one has the option to allocate those arrays on the heap or on the stack. Usually, the stack option is faster, but we need to have enough of it. Otherwise, ROMS will blow up because of memory corruption.
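Before launching a run, the stack limit can also be checked programmatically. The following is a minimal sketch using Python's standard resource module on a Unix-like system. The helper names are hypothetical, and the threshold simply reuses the 0x64000000 bytes (1600 MiB) from the linker flag above as an assumed requirement.

```python
import resource

# Value passed to the linker in the FFLAGS line above, in bytes (1600 MiB).
REQUESTED_STACK_BYTES = 0x64000000


def describe_limit(value):
    """Format a limit in the style printed by the csh `limit` builtin."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} kbytes"


def stack_is_sufficient(required_bytes=REQUESTED_STACK_BYTES):
    """True if the soft stack limit is unlimited or at least required_bytes."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return soft == resource.RLIM_INFINITY or soft >= required_bytes


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("stacksize (soft):", describe_limit(soft))
    print("stacksize (hard):", describe_limit(hard))
    if not stack_is_sufficient():
        print("warning: stack limit is below", REQUESTED_STACK_BYTES, "bytes")
```

The limits reported are those of the Python process itself, so the script should be started from the same shell session that will launch ROMS.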
- Corrected the reporting of the longitude and latitude ranges at RHO-points in get_grid.F. Many thanks to John Warner for bringing this to our attention.
- Added MPI broadcasting of 2D and 3D string arrays in distribute.F. Now, we have the following interface for mp_bcasts:
    INTERFACE mp_bcasts
      MODULE PROCEDURE mp_bcasts_0d
      MODULE PROCEDURE mp_bcasts_1d
      MODULE PROCEDURE mp_bcasts_2d
      MODULE PROCEDURE mp_bcasts_3d
    END INTERFACE mp_bcasts
- Added the reading and writing of generic 2D and 3D strings to NetCDF files. The mod_netcdf.F module now has the following updated interfaces for netcdf_get_svar and netcdf_put_svar:
    INTERFACE netcdf_get_svar
      MODULE PROCEDURE netcdf_get_svar_0d
      MODULE PROCEDURE netcdf_get_svar_1d
      MODULE PROCEDURE netcdf_get_svar_2d
      MODULE PROCEDURE netcdf_get_svar_3d
    END INTERFACE netcdf_get_svar

    INTERFACE netcdf_put_svar
      MODULE PROCEDURE netcdf_put_svar_0d
      MODULE PROCEDURE netcdf_put_svar_1d
      MODULE PROCEDURE netcdf_put_svar_2d
      MODULE PROCEDURE netcdf_put_svar_3d
    END INTERFACE netcdf_put_svar
The reasons for this update will be obvious in the future.
- Added a new C-preprocessing option, IMPLICIT_NUDGING, to the momentum radiation boundary conditions in u2dbc_im.F, v2dbc_im.F, u3dbc_im.F, and v3dbc_im.F. The implicit treatment of the nudging term in the radiation equation is more stable, but one needs to be sure that the land/sea masking does not have one-point bays. Many thanks to Alistar and Kate for suggesting this option.
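To illustrate why an implicit nudging term is more stable, here is a sketch for the scalar relaxation equation d(phi)/dt = -(phi - phi_ext)/tau. This is a generic textbook comparison, not the discretization used in the ROMS radiation conditions. The function names and the parameter values are assumptions chosen for illustration.

```python
def nudge_explicit(phi, phi_ext, dt, tau):
    """Forward Euler: the nudging term uses the old value phi^n."""
    return phi - (dt / tau) * (phi - phi_ext)


def nudge_implicit(phi, phi_ext, dt, tau):
    """Backward Euler: the nudging term uses the new value phi^(n+1)."""
    return (phi + (dt / tau) * phi_ext) / (1.0 + dt / tau)


def integrate(step, phi0, phi_ext, dt, tau, nsteps):
    """Apply the given one-step update nsteps times starting from phi0."""
    phi = phi0
    for _ in range(nsteps):
        phi = step(phi, phi_ext, dt, tau)
    return phi


if __name__ == "__main__":
    # Strong nudging: the time step is three times the nudging time scale.
    args = dict(phi0=1.0, phi_ext=0.0, dt=3.0, tau=1.0, nsteps=10)
    print("explicit:", integrate(nudge_explicit, **args))
    print("implicit:", integrate(nudge_implicit, **args))
```

In this toy problem the explicit update multiplies the departure from phi_ext by (1 - dt/tau) each step, so it grows whenever dt/tau exceeds 2, while the implicit update multiplies it by 1/(1 + dt/tau), which is always less than 1 in magnitude.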