Segmentation Faults

Frequently Asked Questions about ROMS usage

Moderators: arango, kate, robertson

kate
Posts: 3702
Joined: Wed Jul 02, 2003 5:29 pm
Location: IMS/UAF, USA

Segmentation Faults

#1 Post by kate » Wed Jun 06, 2018 6:45 pm

Just had a segmentation fault that I can't figure out at all:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
oceanG             00000000035D46E5  Unknown               Unknown  Unknown
oceanG             00000000035D2307  Unknown               Unknown  Unknown
oceanG             000000000357EA64  Unknown               Unknown  Unknown
oceanG             000000000357E876  Unknown               Unknown  Unknown
oceanG             0000000003531296  Unknown               Unknown  Unknown
oceanG             0000000003534E90  Unknown               Unknown  Unknown
libpthread.so.0    00007F62B67E87E0  Unknown               Unknown  Unknown
oceanG             00000000035082A5  nf_fread3d_mod_mp         156  nf_fread3d.f90
oceanG             0000000002EA599B  get_state_                851  get_state.f90
oceanG             0000000000F71736  initial_                  213  initial.f90
oceanG             000000000040C8EF  ocean_control_mod         133  ocean_control.f90
oceanG             000000000040B8B6  MAIN__                     95  master.f90
oceanG             000000000040B68E  Unknown               Unknown  Unknown
libc.so.6          00007F62B53A6D1D  Unknown               Unknown  Unknown
oceanG             000000000040B569  Unknown               Unknown  Unknown

It is failing while reading "u", specifically while reading the floating-point attributes of "u". This is a new initial file that I made the same way as the last one, which ROMS has read many times. The above failure was with ifort; trying again with gfortran doesn't fail at all, so I'm chalking it up to a compiler bug. :?

mathieu
Posts: 74
Joined: Fri Sep 17, 2004 2:22 pm
Location: Institut Rudjer Boskovic

Re: Segmentation Faults

#2 Post by mathieu » Mon Jun 18, 2018 2:34 pm

Hi Kate,
In my experience, it has never happened that the compiler was wrong. A bug that shows up with one compiler but not with the other still means there is a bug in the code.
In order to detect the bug in gfortran you can use compilation option -fcheck=all -fsanitize=address -fsanitize=undefined.
For the Intel Fortran Compiler, the options are -check all -warn interfaces,nouncalled -gen-interfaces.

There are other options for detecting NaN in the computation.

Re: Segmentation Faults

#3 Post by kate » Fri Jun 22, 2018 12:09 am

Thanks - those "check all" flags are scary! Both compilers warn about creating temporary arrays when reading parameter files (read_phypar, read_stapar, etc).

Ifort still fails in nf_fread3d when calling netcdf_get_fatt.

gfortran now fails in wclock_on because it is a nonrecursive procedure being called recursively (from the mp_barrier in there).

Gfortran without all the checking still runs.

Re: Segmentation Faults

#4 Post by mathieu » Fri Jun 22, 2018 3:56 pm

The temporary arrays are created when you pass an array section such as A(1,:) to a subroutine. Since the values are not contiguous in memory, the compiler has to copy them into a new array, which of course slows things down. But it is no problem if it happens only while reading the input parameters.
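A minimal sketch of where such a temporary comes from. The program and routine names here are made up for illustration and are not ROMS code. The row section a(1,:) is strided in Fortran's column-major layout, while the explicit-shape dummy argument expects contiguous storage, so the compiler will typically copy the row before the call:

```fortran
program temp_array_demo
  implicit none
  real :: a(3,4)
  integer :: i, j

  ! Fill a so that a(i,j) = 10*i + j.
  do j = 1, 4
    do i = 1, 3
      a(i,j) = real(10*i + j)
    end do
  end do

  ! a(:,1) is a column: contiguous in memory, no copy is needed.
  call print_vec(size(a,1), a(:,1))

  ! a(1,:) is a row: consecutive elements are 3 reals apart, so the
  ! compiler will typically pack them into a temporary for the call.
  call print_vec(size(a,2), a(1,:))

contains

  ! Hypothetical helper with an explicit-shape dummy argument x(n),
  ! which assumes the actual argument is stored contiguously.
  subroutine print_vec(n, x)
    integer, intent(in) :: n
    real, intent(in) :: x(n)
    print '(*(f6.1))', x
  end subroutine print_vec

end program temp_array_demo
```

Built with gfortran -fcheck=array-temps (part of -fcheck=all) or ifort -check arg_temp_created (part of -check all), the second call should be the one that triggers the run-time warning.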

It is of course a problem if wclock_on is called recursively. The solution is to declare it as a "RECURSIVE SUBROUTINE".
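As a sketch of the pattern, assuming a timer routine that calls a barrier which in turn calls the timer again (timer_on and barrier below are hypothetical stand-ins, not the actual ROMS wclock_on and mp_barrier):

```fortran
module timer_demo
  implicit none
  integer :: calls = 0   ! counts how often timer_on is entered
contains

  ! Without the RECURSIVE prefix, re-entering timer_on through barrier
  ! is what the gfortran run-time recursion check complains about.
  recursive subroutine timer_on(depth)
    integer, intent(in) :: depth
    calls = calls + 1
    if (depth > 0) call barrier(depth)
  end subroutine timer_on

  ! The barrier is part of the same call cycle, so it is marked
  ! RECURSIVE as well.
  recursive subroutine barrier(depth)
    integer, intent(in) :: depth
    call timer_on(depth - 1)
  end subroutine barrier

end module timer_demo

program recursive_demo
  use timer_demo
  implicit none
  call timer_on(1)   ! timer_on -> barrier -> timer_on
  print *, 'timer_on was entered', calls, 'times'
end program recursive_demo
```

The depth argument exists only to stop the demo after one nested call; a real timer would need its own guard against unbounded re-entry.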

The fact that the error occurs in netcdf_get_fatt means that the crash happens inside the netcdf routine. So, two possibilities:

(A) The bug is in the netcdf routine itself (rather unlikely). Then one needs to compile netcdf itself with check all, which is hard work.

(B) The inputs to netcdf_get_fatt are already corrupted. Print the input to the function to check. A long time ago I had random errors occurring because pointers had been erased by a previous call to a function. This pointer erasure can happen before the call to netcdf_get_fatt and create the problem. Since compilers are free to organize memory as they want, this can explain why it works with gfortran but not with ifort.

Re: Segmentation Faults

#5 Post by kate » Fri Jun 22, 2018 4:56 pm

I'm happy to ignore warnings during initialization.

Thanks, would have gotten to adding the recursive modifier, but had to leave yesterday. The gfortran case is now running past that.

The netcdf_get_fatt thing happens in the debugger when stepping into netcdf_get_fatt from nf_fread3d.
I can see the values of all eight arguments to netcdf_get_fatt and they are all fine. netcdf_get_fatt is a ROMS routine, so I should be able to step into it but no, that's when the error occurs for ifort.

I've been around long enough to believe in compiler bugs, no question.

Re: Segmentation Faults

#6 Post by mathieu » Sun Jun 24, 2018 6:00 pm

Hernan, a remark on your point about "wrap-around integer". Actually, in Fortran (and C/C++), signed integer overflow is undefined behavior. See for example https://stackoverflow.com/questions/405 ... r-overflow
So, gfortran is right to stop at that.
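
A minimal sketch of testing for overflow before doing the addition, rather than relying on wrap-around. The function would_overflow is a hypothetical helper written for this example, not something from ROMS:

```fortran
program overflow_demo
  implicit none
  integer :: a, b

  a = huge(a)   ! largest default integer
  b = 1

  if (would_overflow(a, b)) then
    print *, 'a + b does not fit in a default integer'
  else
    print *, a + b
  end if

contains

  ! True if x + y would leave the range [-huge(x), huge(x)].
  ! The test itself never computes x + y, so it cannot overflow.
  ! It uses the symmetric range, so it is slightly conservative on
  ! two's-complement machines (it also flags a sum of -huge(x) - 1).
  logical function would_overflow(x, y)
    integer, intent(in) :: x, y
    if (y >= 0) then
      would_overflow = x > huge(x) - y
    else
      would_overflow = x < -huge(x) - y
    end if
  end function would_overflow

end program overflow_demo
```

Code that really does want wrap-around arithmetic has to get it some other way, since the checking builds mentioned earlier in the thread (for example -fsanitize=undefined) will flag the overflowing addition.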
