ROMS Debug Approach

General scientific issues regarding ROMS

Moderators: arango, robertson

richard.schmalz
Posts: 24
Joined: Thu Oct 04, 2007 4:14 am
Location: NOAA

#1 by richard.schmalz

Hi All,

As a new ROMS user, I was wondering how one goes about debugging a model blow-up. I have recently been running an application for the Delaware Estuary and obtained:

ROMS/TOMS -Blows Up .....................exit_flag: 1
Saving the latest model state into RESTART file

MAIN: Abnormal termination: Blowup.

I have several questions:

1. What triggers the BLOWUP? Are NaNs searched for, and if so, in which fields?
2. Can you point me to the appropriate routines and locations within ROMS where the BLOWUP is detected?
3. Presumably one examines the restart file, but if NaNs are present it may be difficult to debug.
4. How would one change the BLOWUP criteria to be a condition of maximum velocity exceeding 100 m/s as is done in POM?
5. In ROMS, how does one examine and determine the cause of the BLOWUP?

Any suggestions would be much appreciated. Thanks....Dick Schmalz

#2 by richard.schmalz

Reviewing the code in diag.F, I see that the kinetic and potential energy summed over the domain are checked character by character: if either one contains N, n, or * (that is, it is a NaN or Inf, or it exceeds 8 places), exit_flag is set to 1 and the model blows up.
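
To make the string test concrete, here is a minimal sketch of the idea as I understand it. It is not copied from diag.F: the format '(f8.1)', the function name looks_blown_up, and the kind parameter r8 are assumptions chosen only for illustration. The value is written into an 8-character field and the field is scanned for N, n, or *. Fortran fills a field with asterisks when the number does not fit, and NaN and Inf print as text containing N or n.

```fortran
program blowup_string_check
  ! Illustration only, not the diag.F source. The format and all names
  ! here are assumptions chosen to keep the example small.
  use, intrinsic :: ieee_arithmetic, only : ieee_value, ieee_quiet_nan, &
                                            ieee_positive_inf
  implicit none
  integer, parameter :: r8 = selected_real_kind(12, 300)
  real(r8) :: val

  val = 1.5_r8                              ! ordinary finite value
  print *, 'finite  flagged: ', looks_blown_up(val)
  val = ieee_value(val, ieee_quiet_nan)     ! prints as NaN
  print *, 'NaN     flagged: ', looks_blown_up(val)
  val = ieee_value(val, ieee_positive_inf)  ! prints as Inf or Infinity
  print *, 'Inf     flagged: ', looks_blown_up(val)
  val = 1.0e12_r8                           ! too wide for f8.1, field is filled with *
  print *, 'too big flagged: ', looks_blown_up(val)

contains

  logical function looks_blown_up(x)
    ! Render x into an 8-character field, then look for N, n, or *.
    real(r8), intent(in) :: x
    character(len=8) :: buf
    write (buf, '(f8.1)') x
    looks_blown_up = scan(buf, 'Nn*') > 0
  end function looks_blown_up

end program blowup_string_check
```

Only the first value should come back unflagged. The weakness of this kind of test is that it fires only once the diagnostic is already NaN, Inf, or too large to print, by which time the fields themselves are usually unusable.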

I am considering adding a supplemental check that triggers a blow-up condition if either the u or v velocity component exceeds 10 m/s. That way, when the restart file is written, one can see where the velocity limit was exceeded.

kate
Posts: 4088
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

#3 by kate

Good points. I too have found the writing of the restart record with the NaNs in it to be pretty useless. Oh, for the good old days when the Cray compiler would let you dump core during the actual operation that caused the NaN/Inf so you could see what was going on.

You might also want to check for outrageous values of T and S.

#4 by richard.schmalz

In diag.F, I believe one can add the following checks after the comment:

  Compute and report out volume averaged kinetic, potential
  total energy, and volume.

In the 3D inner loop:

          IF ((ABS(u(i,j,k,nstp)).gt.10.0).or.                          &
     &        (ABS(v(i,j,k,nstp)).gt.10.0)) exit_flag=1

In the 2D inner loop:

          IF ((ABS(ubar(i,j,krhs)).gt.10.0).or.                         &
     &        (ABS(vbar(i,j,krhs)).gt.10.0)) exit_flag=1

This way, an exit condition occurs when a velocity component becomes excessive, and the variables needed for debugging are written to the restart file without NaNs.
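
A variation on the same idea also reports where the limit was exceeded. The following is a self-contained sketch, not ROMS code: the small array u stands in for u(:,:,:,nstp), and the names max_speed and loc are hypothetical. It ignores the halo points, tiling, and land masking that the real arrays have.

```fortran
program velocity_limit_check
  ! Stand-alone illustration with a tiny synthetic array; names other
  ! than exit_flag are assumptions, not ROMS variables.
  implicit none
  integer, parameter  :: r8 = selected_real_kind(12, 300)
  real(r8), parameter :: max_speed = 10.0_r8   ! m/s, threshold from the post
  real(r8) :: u(4, 3, 2)                       ! stand-in for u(i,j,k,nstp)
  integer  :: loc(3)
  integer  :: exit_flag

  exit_flag = 0
  u = 0.1_r8
  u(3, 2, 1) = -12.5_r8                        ! plant one excessive value

  if (maxval(abs(u)) > max_speed) then
    ! MAXLOC counts positions from 1 in each dimension, whatever the
    ! declared lower bounds are, so offsets are needed for arrays that
    ! do not start at index 1.
    loc = maxloc(abs(u))
    exit_flag = 1
    print '(a,es10.3,a,3i4)', ' max |u| = ', abs(u(loc(1), loc(2), loc(3))), &
                              ' at i,j,k =', loc
  end if
  print '(a,i2)', ' exit_flag =', exit_flag
end program velocity_limit_check
```

Note that a comparison such as ABS(u) > limit is false for NaN, so a check like this only helps if it runs often enough to catch the values while they are still large but finite.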

#5 by richard.schmalz

I should have included _r8 after the 10.0 constants, i.e. 10.0_r8.

gerardo
Posts: 12
Joined: Wed Sep 27, 2006 7:23 pm
Location: SGI

#6 by gerardo

kate wrote:
[...] Oh, for the good old days when the Cray compiler would let you dump core during the actual operation that caused the NaN/Inf so you could see what was going on.

You can blame such default behavior (continuing to run way after infinities and NaNs have been produced by operations) on the IEEE standard for floating-point arithmetic, on systems that implement it.

Fortunately, there is a way to recover the Cray-like behavior. Under IRIX, one can set a runtime environment variable (TRAP_FPE) to a string that specifies the behavior for divide by zero, overflow, underflow, etc. With Intel compilers under Linux (on x86_64 or ia64 processors), you enable trapping by compiling the main program with the "-fpe0" option.

Other systems or compilers may have other ways to alter the default floating point exception handling.
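
Where the compiler supports it, halting can also be requested from inside the program through the standard IEEE_EXCEPTIONS intrinsic module (Fortran 2003). The program below is a small stand-alone sketch, not something taken from ROMS, and it terminates on purpose. With gfortran, the compile-time option -ffpe-trap=invalid,zero,overflow has a similar effect.

```fortran
program trap_invalid
  ! Stand-alone illustration: ask the runtime to halt on the first
  ! invalid operation instead of quietly producing a NaN.
  use, intrinsic :: ieee_exceptions
  implicit none
  real :: x, y

  if (ieee_support_halting(ieee_invalid)) then
    call ieee_set_halting_mode(ieee_invalid, .true.)
    call ieee_set_halting_mode(ieee_divide_by_zero, .true.)
    call ieee_set_halting_mode(ieee_overflow, .true.)
  else
    print *, 'halting control is not supported on this system'
  end if

  call random_number(y)     ! runtime value, so nothing is folded at compile time
  x = -1.0 - y              ! x is now somewhere in (-2, -1]
  print *, 'about to take the square root of', x
  y = sqrt(x)               ! with halting enabled, the run should stop here
  print *, 'sqrt result = ', y   ! reached only if halting is unavailable
end program trap_invalid
```

Compiling with debugging symbols (for example -g) makes the resulting core dump or traceback point at the offending line, which is the Cray-like behavior kate was asking for.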

Saludos,

Gerardo

#7 by kate

I believe what Richard wants is a way to see the fields after they've gone bad but before they are unplottable NaNs.

Here's one option for getting back the Cray-like behavior of old:
http://www.arsc.edu/support/news/HPCnews/HPCne ... l#article3