what's the error?

#1 Unread post by duckweed »

When I run oceanM with mpirun, the out.log is:


running /upwelling/Forward/oceanM on 4 LINUX ch_p4 processors
Created /upwelling/Forward/PI27445
Process Information:

Node # 0 (pid= 27648) is active.
Node # 2 (pid= 27782) is active.
Node # 1 (pid= 27715) is active.
Node # 3 (pid= 27849) is active.

Model Input Parameters: ROMS/TOMS version 3.0
Sunday - July 12, 2009 - 10:48:08 PM
-----------------------------------------------------------------------------

Wind-Driven Upwelling/Downwelling over a Periodic Channel

Operating system : Linux
CPU/hardware : x86_64
Compiler system : pgi
Compiler command : mpif90
Compiler flags : -fastsse -Mipa=fast -tp k8-64 -Mfree

Input Script : ocean_upwelling.in

SVN Root URL : https://www.myroms.org/svn/src/trunk
SVN Revision :

Local Root : /ROMS
Header Dir : /Forward
Header file : upwelling.h
Analytical Dir: /ROMS/ROMS/Functionals

Resolution, Grid 01: 0041x0080x016, Parallel Nodes: 4, Tiling: 001x001

ROMS/TOMS: Wrong choice of domain 01 partition or number of parallel threads.
NtileI * NtileJ must be equal to the number of parallel nodes.
Change -np value to mpirun or
change domain partition in input script.

Tile partition information for Grid 01: 0041x0080x0016 tiling: 001x001

tile Istr Iend Jstr Jend Npts

0 1 41 1 80 52480

Maximum halo size in XI and ETA directions:

HaloSizeI(1) = 156
HaloSizeJ(1) = 267
TileSide(1) = 83
TileSize(1) = 3818


Activated C-preprocessing Options:

UPWELLING Wind-Driven Upwelling/Downwelling over a Periodic Channel
ANA_BSFLUX Analytical kinematic bottom salinity flux.
ANA_BTFLUX Analytical kinematic bottom temperature flux.
ANA_GRID Analytical grid set-up.
ANA_INITIAL Analytical initial conditions.
ANA_SMFLUX Analytical kinematic surface momentum flux.
ANA_SSFLUX Analytical kinematic surface salinity flux.
ANA_STFLUX Analytical kinematic surface temperature flux.
ANA_VMIX Analytical vertical mixing coefficients.
ASSUMED_SHAPE Using assumed-shape arrays.
AVERAGES Writing out time-averaged fields.
DIAGNOSTICS_TS Computing and writing tracer diagnostic terms.
DIAGNOSTICS_UV Computing and writing momentum diagnostic terms.
DJ_GRADPS Parabolic Splines density Jacobian (Shchepetkin, 2002).
DOUBLE_PRECISION Double precision arithmetic.
EW_PERIODIC East-West periodic boundaries.
MIX_S_TS Mixing of tracers along constant S-surfaces.
MIX_S_UV Mixing of momentum along constant S-surfaces.
MPI MPI distributed-memory configuration.
NONLINEAR Nonlinear Model.
!NONLIN_EOS Linear Equation of State for seawater.
POWER_LAW Power-law shape time-averaging barotropic filter.
PROFILE Time profiling activated .
!RST_SINGLE Double precision fields in restart NetCDF file.
SALINITY Using salinity.
SOLVE3D Solving 3D Primitive Equations.
SPLINES Conservative parabolic spline reconstruction.
TS_U3HADVECTION Third-order upstream bias horizontal advection of tracers.
TS_C4VADVECTION Fourth-order centered vertical advection of tracers.
TS_DIF2 Harmonic mixing of tracers.
UV_ADV Advection of momentum.
UV_COR Coriolis term.
UV_U3HADVECTION Third-order upstream bias advection of momentum.
UV_LDRAG Linear bottom stress.
UV_VIS2 Harmonic mixing of momentum.
VAR_RHO_2D Variable density barotropic mode.
0: ALLOCATE: 16721214976 bytes requested; not enough memory
p2_27782: p4_error: interrupt SIGSEGV: 11

ROMS/TOMS - Partition error ......... exit_flag: 6


Elapsed CPU time (seconds):

rm_l_1_27717: (0.410156) net_send: could not write to fd=6, errno = 9
p4_error: latest msg from perror: Bad file descriptor
rm_l_1_27717: p4_error: net_send write: -1
rm_l_2_27784: (0.210938) net_send: could not write to fd=5, errno = 32
p3_27849: p4_error: interrupt SIGx: 13
p3_27849: (6.031250) net_send: could not write to fd=5, errno = 32

m.hadfield
Posts: 521
Joined: Tue Jul 01, 2003 4:12 am
Location: NIWA

Re: what's the error?

#2 Unread post by m.hadfield »

ROMS has already told you:
Wrong choice of domain 01 partition or number of parallel threads. NtileI * NtileJ must be equal to the number of parallel nodes.
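Hint: NtileI and NtileJ are set in your input file (ocean_upwelling.in). Your log shows a tiling of 001x001 but 4 parallel nodes, so either change the tiling so that NtileI * NtileJ = 4 or run with -np 1.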
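
If it helps, the rule ROMS is enforcing can be checked before launching. The following is only a rough sketch, not part of ROMS: the function names read_tiling and partition_matches are made up for illustration, and it assumes the input script uses the usual "KEYWORD == value" lines with "!" starting a comment.

```python
import re


def read_tiling(input_text):
    """Return (NtileI, NtileJ) parsed from 'KEYWORD == value' lines."""
    values = {}
    for line in input_text.splitlines():
        # Drop anything after '!'; fully commented lines become empty.
        code = line.split("!", 1)[0]
        match = re.match(r"\s*(NtileI|NtileJ)\s*=+\s*(\d+)", code)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values["NtileI"], values["NtileJ"]


def partition_matches(input_text, nprocs):
    """True when NtileI * NtileJ equals the number of MPI processes."""
    ntile_i, ntile_j = read_tiling(input_text)
    return ntile_i * ntile_j == nprocs


# Hypothetical fragment of an input script with a 1x1 tiling.
sample = """
! Domain decomposition parameters
      NtileI == 1      ! I-direction partition
      NtileJ == 1      ! J-direction partition
"""

print(partition_matches(sample, 4))  # False: 1x1 tiling with 4 processes
print(partition_matches(sample.replace("NtileJ == 1", "NtileJ == 4"), 4))  # True
```

The first call reproduces the situation in the log above (tiling 001x001, 4 nodes); the second shows one tiling that would satisfy the check for -np 4.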

mariafattorini
Posts: 52
Joined: Tue Mar 03, 2009 2:39 pm
Location: C.N.R. - LaMMA

Re: what's the error?

#3 Unread post by mariafattorini »

Hello,

I have a similar error message but I don't know where the error is.

When the application runs I get:


 ...
         T  Hout(idBott)    Write out bottom property 18: dep_net

 Output/Input Files:

             Output Restart File:  ocean_rst.nc
             Output History File:  ocean_his.nc
            Output Averages File:  ocean_avg.nc

 READ_PHYPAR - could not find input file:  ocean_frc.nc

 Elapsed CPU time (seconds):

 Thread #  0 CPU:       0.062
 Total:                 0.062

 Nonlinear model elapsed time profile:

                                              Total:         0.000    0.0000

 All percentages are with respect to total time =            0.062

 ROMS/TOMS - Output NetCDF summary for Grid 01:

 ROMS/TOMS - I/O error ............... exit_flag:   4


 ERROR: I/O related problem.

My log.out does not give any hint about where the error is.
Can somebody help me find it, please?

Many thanks,
Maria

kate
Posts: 4088
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: what's the error?

#4 Unread post by kate »

READ_PHYPAR - could not find input file: ocean_frc.nc
I think this is what you are looking for.
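
A missing input file like this can also be caught before the model starts. Below is a minimal sketch, assuming the input script names its NetCDF input files with keywords such as GRDNAME, ININAME and FRCNAME (check your own script for the keywords it actually uses). The function missing_input_files is hypothetical, and paths are tested relative to the directory you run it from, which may not be where you launch the model.

```python
import os
import re


def missing_input_files(input_text, keys=("GRDNAME", "ININAME", "FRCNAME"),
                        exists=os.path.exists):
    """Return (keyword, path) pairs whose file is not found."""
    missing = []
    for line in input_text.splitlines():
        # Drop anything after '!'; fully commented lines become empty.
        code = line.split("!", 1)[0]
        match = re.match(r"\s*(\w+)\s*=+\s*(\S+)", code)
        if match and match.group(1) in keys and not exists(match.group(2)):
            missing.append((match.group(1), match.group(2)))
    return missing


# Hypothetical fragment of an input script.
sample = """
     FRCNAME == ocean_frc.nc
     RSTNAME == ocean_rst.nc
!    GRDNAME == ocean_grd.nc
"""

for key, path in missing_input_files(sample):
    print(f"{key}: could not find {path}")
```

Output files such as the restart file are deliberately left out of the default keys, since they do not exist until the model writes them.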

drews
Posts: 35
Joined: Tue Jun 19, 2007 3:32 pm
Location: National Center for Atmospheric Research

Re: what's the error?

#5 Unread post by drews »

This error message:

0: ALLOCATE: 16721214976 bytes requested; not enough memory

probably means that the ROMS executable has run out of memory for the grid size that you are using. The request is nearly 17 GB, which is a lot of space; I found a similar problem on my Linux workstation, when ROMS could not allocate about 138 MB. You can try decreasing the array sizes in the ROMS input file (yourApp.in) to see if that makes the problem go away, as I did here:

! Lm == 2399 ! Number of I-direction INTERIOR RHO-points
! Mm == 2399 ! Number of J-direction INTERIOR RHO-points
Lm == 1199 ! Number of I-direction INTERIOR RHO-points
Mm == 1199 ! Number of J-direction INTERIOR RHO-points
N == 1 ! Number of vertical levels

ROMS will likely produce an error later when the Lm and Mm numbers don't match the NetCDF grid file, but you can at least discover if you have a memory problem or something else. Use 'top' to monitor your memory usage. Changing the stack size to unlimited had no effect here.

If you are really running out of memory, you can install more, increase the size of your process space somehow, move to a bigger machine, or reduce your number of grid points. I'm reading about how to set up nested grids. :D
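
For a feel of the numbers, here is a back-of-the-envelope sketch. It assumes double precision (8 bytes per value) and that a single 3D array at RHO-points holds (Lm+2) x (Mm+2) x N values; it ignores halos, tiling and the many arrays ROMS really allocates, so treat it as a lower bound per array, not a prediction of total memory. The function array_bytes is made up for this example.

```python
def array_bytes(Lm, Mm, N, bytes_per_value=8):
    """Rough size of one 3D array at RHO-points, in bytes."""
    # Interior points plus one boundary point on each side in I and J.
    return (Lm + 2) * (Mm + 2) * N * bytes_per_value


# The allocation request reported in the first post.
requested = 16721214976
print(f"{requested / 1e9:.2f} GB")  # 16.72 GB

# One array for the original and the reduced grid from my input file.
full = array_bytes(2399, 2399, 1)
half = array_bytes(1199, 1199, 1)
print(full, half)  # 46118408 11539208
print(f"ratio: {full / half:.2f}")
```

Halving Lm and Mm cuts each array to roughly a quarter of its size, which is why shrinking the grid is a quick way to test whether memory is really the problem.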
