Weird experience with ROMS3.72

pengjia
Posts: 10
Joined: Thu Aug 21, 2008 4:24 pm
Location: UMCP

#1 by pengjia

Hi all,

I set up a Linux box with an Intel i7, Fedora 11 (64-bit), ifort 11, and MPICH2. I could compile and run the test case smoothly. Happy! But something weird happens. Whenever I edit anything in the "*.in" file (NHIS, NINFO, nothing important) and rerun the previously successful case, it blows up and NaN values show up within several steps. If I then run it again without changing anything, sometimes it works and sometimes it needs a third try.

We also have a 16-node cluster with 32 AMD CPUs, Red Hat Enterprise 3, and ifort 9. I ran the same case there (same ROMS version, same CPP options...), but got different results. Taking salinity as an example, the discrepancy at some points can be as high as 3 psu.

What do you think causes this? The Linux box configuration? ROMS itself? The different Intel Fortran versions? Or do different platforms and operating systems simply give different results?

Sorry, I'm totally confused, and the above may confuse you too.

Thank you.

Peng

linzhenhua
Posts: 64
Joined: Mon Oct 17, 2005 2:02 am
Location: Institute of Oceanology,Chinese Academy of Sciences

#2 by linzhenhua

Below is part of a post from Martin Schmidt to the MOM4 mailing list; I guess it may help:
The modern Intel architecture allows for several models of how floating-point operations are defined. The default is a "sloppy" mode where speed gain is preferred over reproducibility. Accurate results are obtained with "-O -fp-model strict". Otherwise results may differ even between repeated runs with the same binary.
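
A minimal sketch, in Python rather than Fortran, of why a relaxed floating-point model can change results: floating-point addition is not associative, so when a compiler is free to regroup operations, the same sum can round differently. This only illustrates the principle; it says nothing about which regrouping ifort actually performs in ROMS.

```python
# Floating-point addition is not associative: the grouping changes the rounding.
a, b, c = 0.1, 0.2, 0.3

left_grouped = (a + b) + c    # evaluated strictly left to right
right_grouped = a + (b + c)   # the same sum after regrouping

print(left_grouped == right_grouped)      # False
print(abs(left_grouped - right_grouped))  # a difference of about 1e-16
```

A difference this small is harmless in a single operation, but in a nonlinear model integrated over many time steps such differences can grow, so it is plausible for runs built with different compilers or flags to drift apart.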

pengjia
Posts: 10
Joined: Thu Aug 21, 2008 4:24 pm
Location: UMCP

#3 by pengjia

That makes a lot of sense. But it's still weird to me that in order to get a certain case running, I have to try two or even three times. Just now, I submitted one case three times, and after two blow-ups it is running smoothly :cry: I don't want to do this sort of thing for every case.

You know, I wasted several days trying to figure out what was making my model blow up. I checked the boundary, the forcing, the initial... And in the end, it turns out to have something to do with the machine itself. :oops:

kate
Posts: 4088
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

#4 by kate

Can you get consistent results using the strict compile flag?
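
One way to check this is to compare the same field from two runs value by value. The sketch below assumes the values have already been read from the output files into plain Python lists (reading NetCDF is left out), and `compare_fields` is a hypothetical helper, not part of ROMS.

```python
import math
import struct


def compare_fields(a, b):
    """Compare two equal-length sequences of floats from two model runs.

    Returns (bitwise_identical, max_abs_diff, nan_pairs), where nan_pairs
    counts the positions at which either value is NaN.
    """
    if len(a) != len(b):
        raise ValueError("fields must have the same length")
    identical = True
    max_diff = 0.0
    nan_pairs = 0
    for x, y in zip(a, b):
        # Compare the raw 64-bit patterns, so any rounding difference is caught.
        if struct.pack("<d", x) != struct.pack("<d", y):
            identical = False
        if math.isnan(x) or math.isnan(y):
            nan_pairs += 1
            continue
        max_diff = max(max_diff, abs(x - y))
    return identical, max_diff, nan_pairs


# Made-up salinity values, for illustration only.
run1 = [35.0, 34.5, 33.25]
run2 = [35.0, 31.5, 33.25]
print(compare_fields(run1, run2))
```

If the first element of the result is True for every field, the two runs are reproducible at the bit level; otherwise the maximum difference shows how far they have drifted.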

arango
Site Admin
Posts: 1350
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

#5 by arango

I get identical results with the -fp-model precise flag. This is the flag used in the distributed configuration files.
