forrtl: severe (174): SIGSEGV, segmentation fault occurred

General scientific issues regarding ROMS

Moderators: arango, robertson

agpc
Posts: 63
Joined: Mon Jul 27, 2020 7:44 pm
Location: Applied Geophysics Center (AGPC)

forrtl: severe (174): SIGSEGV, segmentation fault occurred

#1 Unread post by agpc »

Dear everyone,
I'm a beginner with the COAWST model. Currently, I'm running COAWST (WRF+ROMS+SWAN) with the no-nesting option.
In this case, I used WRF with 2 domains, and ROMS and SWAN each with 1 domain.
The error is displayed below:

Code:

[cloud@igplogin COAWST_Nonesting]$ mpirun -np 9 ./coawstM Projects/Sarika/coupling_sarika.in 
 Coupled Input File name = Projects/Sarika/coupling_sarika.in
 Coupled Input File name = Projects/Sarika/coupling_sarika.in

 Model Coupling: 


       Ocean Model MPI nodes: 000 - 003

       Waves Model MPI nodes: 004 - 004

       Atmos Model MPI nodes: 005 - 008

       WAVgrid 01 dt= 180.0 -to- OCNgrid 01 dt=   4.0, CplInt:  1800.0 Steps: 010

       OCNgrid 01 dt=   4.0 -to- WAVgrid 01 dt= 180.0, CplInt:  1800.0 Steps: 450

       ATMgrid 01 dt=  54.0 -to- OCNgrid 01 dt=   4.0, CplInt:  1800.0 Steps: 100

       OCNgrid 01 dt=   4.0 -to- ATMgrid 01 dt=  54.0, CplInt:  1800.0 Steps: ***

       ATMgrid 02 dt=  18.0 -to- OCNgrid 01 dt=   4.0, CplInt:  1800.0 Steps: 100

       OCNgrid 01 dt=   4.0 -to- ATMgrid 02 dt=  18.0, CplInt:  1800.0 Steps: 450

       ATMgrid 01 dt=  54.0 -to- WAVgrid 01 dt= 180.0, CplInt:  1800.0 Steps: 100

       WAVgrid 01 dt= 180.0 -to- ATMgrid 01 dt=  54.0, CplInt:  1800.0 Steps: 030

       ATMgrid 02 dt=  18.0 -to- WAVgrid 01 dt= 180.0, CplInt:  1800.0 Steps: 100

       WAVgrid 01 dt= 180.0 -to- ATMgrid 02 dt=  18.0, CplInt:  1800.0 Steps: 010
--------------------------------------------------------------------------------
 Model Input Parameters:  ROMS/TOMS version 3.8  
                          Saturday - November 28, 2020 -  2:17:46 PM
--------------------------------------------------------------------------------
 module_io_quilt_old.F        2931 F

SWAN grid   1 is preparing computation

 module_io_quilt_old.F        2931 F
 module_io_quilt_old.F        2931 F
 module_io_quilt_old.F        2931 F
Quilting with   1 groups of   0 I/O tasks.
Quilting with   1 groups of   0 I/O tasks.
Quilting with   1 groups of   0 I/O tasks.
Quilting with   1 groups of   0 I/O tasks.
Quilting with   1 groups of   0 I/O tasks.
Quilting with   1 groups of   0 I/O tasks.
Quilting with   1 groups of   0 I/O tasks.
Quilting with   1 groups of   0 I/O tasks.
 Ntasks in X            2 , ntasks in Y            2
 Ntasks in X            2 , ntasks in Y            2
 Ntasks in X            2 , ntasks in Y            2
 Ntasks in X            2 , ntasks in Y            2
 Ntasks in X            2 , ntasks in Y            2
 Ntasks in X            2 , ntasks in Y            2
 Ntasks in X            2 , ntasks in Y            2
 Ntasks in X            2 , ntasks in Y            2
WRF V4.1.5 MODEL
WRF V4.1.5 MODEL
WRF V4.1.5 MODEL
WRF V4.1.5 MODEL
WRF V4.1.5 MODEL
WRF V4.1.5 MODEL
WRF V4.1.5 MODEL
WRF V4.1.5 MODEL
 *************************************
 Parent domain
 ids,ide,jds,jde            1         301           1         221
 ims,ime,jms,jme           -4         157         104         226
 ips,ipe,jps,jpe            1         150         111         221
 *************************************
 Parent domain
 ids,ide,jds,jde            1         301           1         221
 ims,ime,jms,jme           -4         157         104         226
 ips,ipe,jps,jpe            1         150         111         221
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
 *************************************
 Parent domain
 ids,ide,jds,jde            1         301           1         221
 ims,ime,jms,jme           -4         157          -4         117
 ips,ipe,jps,jpe            1         150           1         110
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
 *************************************
 Parent domain
 ids,ide,jds,jde            1         301           1         221
 ims,ime,jms,jme          144         306          -4         117
 ips,ipe,jps,jpe          151         301           1         110
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
 *************************************
 Parent domain
 ids,ide,jds,jde            1         301           1         221
 ims,ime,jms,jme          144         306         104         226
 ips,ipe,jps,jpe          151         301         111         221
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
 *************************************
 Parent domain
 ids,ide,jds,jde            1         301           1         221
 ims,ime,jms,jme           -4         157          -4         117
 ips,ipe,jps,jpe            1         150           1         110
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
 *************************************
 Parent domain
 ids,ide,jds,jde            1         301           1         221
 ims,ime,jms,jme          144         306          -4         117
 ips,ipe,jps,jpe          151         301           1         110
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
 *************************************
 Parent domain
 ids,ide,jds,jde            1         301           1         221
 ims,ime,jms,jme          144         306         104         226
 ips,ipe,jps,jpe          151         301         111         221
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
coawstM            0000000003C4696A  Unknown               Unknown  Unknown
libpthread-2.17.s  00002B09C2A475D0  Unknown               Unknown  Unknown
coawstM            000000000092471A  edit_file_struct_         212  edit_multifile.f90
coawstM            000000000090A307  read_phypar_             2198  read_phypar.f90
coawstM            000000000078E749  inp_par_                   87  inp_par.f90
coawstM            00000000004426D9  ocean_control_mod          90  ocean_control.f90
coawstM            000000000041E93B  MAIN__                    373  master.f90
coawstM            0000000000418AA2  Unknown               Unknown  Unknown
libc-2.17.so       00002B09C317C495  __libc_start_main     Unknown  Unknown
coawstM            00000000004189A9  Unknown               Unknown  Unknown
Please help me to address it.
Many thanks!
Attachments
namelist.input.txt (4.67 KiB)
Bound_spec_command.txt (432 Bytes)
swan_sarika.in (4.1 KiB)
ocean_sarika.in (149.63 KiB)
coupling_sarika.in (8.44 KiB)

jcwarner
Posts: 1179
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#2 Unread post by jcwarner »

I looked at your ocean.in and there are a lot of TABs in there. I don't know if this is exactly the problem, but I remember other people had trouble when there were tabs in ocean.in.
You need to clean up that file; there are editing tools to do this. Then give it another try.
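To strip the tabs from the command line, one option is a minimal sketch like this (`expand` is a standard POSIX utility; the filename is taken from the post):

```shell
# Keep a backup, then rewrite ocean.in with every TAB expanded to spaces
# (-t 4 sets the tab stop width; any width works for ROMS input files).
cp ocean.in ocean.in.bak
expand -t 4 ocean.in.bak > ocean.in
```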
For COAWST-related issues, you can post those on our GitHub site:
https://github.com/jcwarner-usgs/COAWST
-j

agpc
Posts: 63
Joined: Mon Jul 27, 2020 7:44 pm
Location: Applied Geophysics Center (AGPC)

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#3 Unread post by agpc »

Hi John,
Thank you for your suggestion; however, after I deleted all of the TABs, the error still occurred.
Can you kindly give me any further suggestions?
I also posted my issue on your GitHub site.
Many thanks!

kate
Posts: 4088
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#4 Unread post by kate »

When you get a seg fault, you can recompile with USE_DEBUG to get line numbers in the output. You then need to look at the relevant lines in *your* .f90 files.

agpc
Posts: 63
Joined: Mon Jul 27, 2020 7:44 pm
Location: Applied Geophysics Center (AGPC)

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#5 Unread post by agpc »

Hi Kate,
I'm sorry if my question is quite stupid. Do you mean that I should recompile the model with the USE_DEBUG option in the coawst.bash file? I'm a beginner, so I don't know much about this model yet.
Many thanks!
-manh

jcwarner
Posts: 1179
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#6 Unread post by jcwarner »

1. Yes, Kate is suggesting you edit coawst.bash (or roms.bash) and set
USE_DEBUG=on
2. Can you post your cleaned ocean.in?
3. The error was:
coawstM 000000000092471A edit_file_struct_ 212 edit_multifile.f90
coawstM 000000000090A307 read_phypar_ 2198 read_phypar.f90
So if you edit Build/read_phypar.f90, what is happening near line 2198?
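For step 1, the relevant line inside coawst.bash is typically an exported environment variable; a minimal sketch, assuming the standard ROMS-style build-script layout where user options are set near the top of the script:

```shell
# Inside coawst.bash (or roms.bash): enable debug compilation so the
# executable is built with -g and runtime checks, giving tracebacks
# with meaningful line numbers.
export USE_DEBUG=on
```

After editing, do a clean rebuild so every object file is recompiled with the debug flags.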

-j

agpc
Posts: 63
Joined: Mon Jul 27, 2020 7:44 pm
Location: Applied Geophysics Center (AGPC)

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#7 Unread post by agpc »

Hi all,
1. I had a problem when recompiling the COAWST model with the USE_DEBUG option; it failed.
The error is:

Code:

fort: warning #10182: disabling optimization; runtime debug checks enabled
swanpre1.f90(4637): error #7938: Character length argument mismatch.   ['(I6)']
         CHARS(1) = NUMSTR(ISTAT,RNAN,'(I6)')                             40.41
--------------------------------------^
compilation aborted for swanpre1.f90 (code 1)
make: *** [Build/swanpre1.o] Error 1
[cloud@igplogin COAWST9]$ exit
I also attached my build.log file to this post.
2. My cleaned ocean.in file is attached below as well.
Please help me fix it.
Thanks again,
-m
Attachments
coawst.bash (20.85 KiB)
Linux-ifort.mk (14.33 KiB)
build.log (1.81 MiB)
sandy.h (2.22 KiB)

jcwarner
Posts: 1179
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#8 Unread post by jcwarner »

OK. SWAN has some issues with its arrays, and depending on the debug flags, it will stop compiling. SWAN is not the problem, so let's not go down this road.
The problem you were having is with ROMS: it stopped when it tried to close a file.
How about you set up the application to use only ROMS, then compile it in debug mode and run that?
Sorry for all this roundabout, but sometimes this is what it takes to figure it out.
-j

agpc
Posts: 63
Joined: Mon Jul 27, 2020 7:44 pm
Location: Applied Geophysics Center (AGPC)

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#9 Unread post by agpc »

Hi everyone,
Thank you for your help. My problem came from differences in Theta_S, Theta_B, and the minimum depth between the ROMS grid file and the ROMS namelist file. After I recreated all of the input and namelist files, unfortunately it reported:
Found Error: 01 Line: 351 Source: ROMS/Nonlinear/main3d.F
Found Error: 01 Line: 332 Source: ROMS/Drivers/nl_ocean.h
The lines in question in those files are:
IF (FoundError(exit_flag, NoError, __LINE__, &
& __FILE__)) RETURN
Can anyone kindly help me solve this issue? Thanks a lot!
-manh
Attachments
COAWST.out.txt (120.49 KiB)
swan_sarika.in (4.04 KiB)
ocean_sarika.in (147.61 KiB)

jcwarner
Posts: 1179
Joined: Wed Dec 31, 2003 6:16 pm
Location: USGS, USA

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurred

#10 Unread post by jcwarner »

You need to look more closely at the whole output. The coupled simulation ran for 0.5 hours:

.......
622479824 2174-09-15 00:29:52.00 5.370047E-03 2.261209E+04 2.261210E+04 4.180266E+16
(191,053,16) 1.041893E-05 2.883283E-04 1.864959E-01 1.586292E+00
622479825 2174-09-15 00:30:00.00 5.370267E-03 2.261209E+04 2.261209E+04 4.180266E+16
(191,053,16) 1.050728E-05 2.884169E-04 1.925103E-01 1.586371E+00
== SWAN grid 1 sent wave data to ROMS grid 1
** ROMS grid 1 recv data from SWAN grid 1
SWANtoROMS Min/Max DISBOT (Wm-2): 0.000000E+00 8.340169E+24
SWANtoROMS Min/Max DISSURF (Wm-2): 0.000000E+00 8.340169E+24
SWANtoROMS Min/Max DISWCAP (Wm-2): 0.000000E+00 8.340169E+24
SWANtoROMS Min/Max HSIGN (m): 0.000000E+00 8.548673E+27
SWANtoROMS Min/Max RTP (s): 0.000000E+00 8.548673E+27
SWANtoROMS Min/Max TMBOT (s): 0.000000E+00 8.548673E+27
SWANtoROMS Min/Max DIR (rad): 0.000000E+00 6.283185E+00
SWANtoROMS Min/Max DIRP (rad): 0.000000E+00 6.283185E+00
SWANtoROMS Min/Max WLEN (m): 1.000000E+00 5.000000E+02
SWANtoROMS Min/Max WLENP (m): 1.000000E+00 5.000000E+02
** ROMS grid 1 sent data to SWAN grid 1
== SWAN grid 1 recv data from ROMS grid 1
ROMStoSWAN Min/Max DEPTH (m): 0.000000E+00 9.923031E+03
ROMStoSWAN Min/Max WLEV (m): -1.065920E+00 3.119625E+00
ROMStoSWAN Min/Max VELX (ms-1): -1.354910E+00 1.365099E+00
ROMStoSWAN Min/Max VELY (ms-1): -1.467828E+00 1.490027E+00
ROMStoSWAN Min/Max ZO (m): 0.000000E+00 5.000000E-02
622479826 2174-09-15 00:30:08.00 NaN NaN NaN NaN
(000,000,00) 0.000000E+00 0.000000E+00 0.000000E+00 NaN
Found Error: 01 Line: 351 Source: ROMS/Nonlinear/main3d.F
Found Error: 01 Line: 332 Source: ROMS/Drivers/nl_ocean.h

Elapsed CPU time (seconds):

Node # 0 CPU: 291.833
Total: 1168.003


It looks like the data sent from SWAN to ROMS is unphysically large (e.g., HSIGN up to ~8.5E+27 m). Something is going wrong in SWAN. Some next steps would be:
- look at the ROMS his file and see what the fields look like
- rerun and output more frequently (every 5 min or so, just until you figure out what is wrong)
- run just ROMS
- run ROMS + SWAN
- run ROMS + SWAN + WRF
You will need to dig in and see what is going on.
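For the first step, a quick NaN / blow-up scan of the history file fields can help; a minimal numpy sketch, where the variable names are illustrative and reading the NetCDF file (e.g. with the netCDF4 package) is left as a comment:

```python
import numpy as np

def scan_field(name, arr, limit=1.0e20):
    """Print NaN count and finite min/max; flag fields that look blown up."""
    arr = np.asarray(arr, dtype=float)
    nans = int(np.isnan(arr).sum())
    finite = arr[np.isfinite(arr)]
    vmin = finite.min() if finite.size else float("nan")
    vmax = finite.max() if finite.size else float("nan")
    bad = nans > 0 or bool(finite.size and np.abs(finite).max() > limit)
    flag = "  <-- SUSPECT" if bad else ""
    print(f"{name}: NaNs={nans}  min={vmin:.3e}  max={vmax:.3e}{flag}")
    return bad

# Hypothetical usage on a ROMS history file (names are illustrative):
# from netCDF4 import Dataset
# nc = Dataset("ocean_his.nc")
# for name in ("zeta", "temp", "salt", "Hwave"):
#     scan_field(name, nc.variables[name][:])
```

Running this over each output record narrows down which field, and which time step, first goes bad.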
