Ocean Modeling Discussion


All times are UTC

PostPosted: Thu May 24, 2018 8:46 am 
Hi, everyone. I am new to ROMS, and I am trying to run it on a supercomputer. After solving a lot of problems, I got to the last step of running ROMS, but a new error occurred.
The supercomputer uses the module tool to manage software and the Slurm job management system in node-exclusive mode, and every node has 20 cores.
The modules I have loaded are as follows:
module load hdf5/intel18/1.8.20-parallel
module load intel/18.0.2
module load mpi/intel/18.0.2
module load mpi/openmpi/3.0.1-pmi-icc18
module load netcdf/intel18/4.4.1-parallel

I have also modified the build.bash as follows:
export  MY_ROOT_DIR=${HOME}/roms
export  MY_PROJECT_DIR=${MY_ROOT_DIR}/Projects/Upwelling
export  PATH=/public1/soft/openmpi/3.0.1-pmi-icc/bin:$PATH
export  USE_MY_LIBS=on
export  NF_CONFIG=/public1/soft/netcdf/4.4.1-parallel-icc18/bin/nf-config
export  NETCDF_INCDIR=/public1/soft/netcdf/4.4.1-parallel-icc18/include
export  NETCDF_LIBDIR=/public1/soft/netcdf/4.4.1-parallel-icc18/lib
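Before building, it may be worth confirming that these exported paths match what the loaded netCDF module actually provides. The nf-config utility reports the compiler and link flags that netCDF-Fortran was built with (the expected output shown in the comments is an assumption based on the paths above; run this on the cluster itself):

```
nf-config --version   # netCDF-Fortran version, should report 4.4.x
nf-config --fc        # Fortran compiler it was built with, should be ifort
nf-config --flibs     # link flags; should reference /public1/soft/netcdf/4.4.1-parallel-icc18/lib
```

If the compiler reported by nf-config does not match the one used to build ROMS, the link step will usually fail with unresolved symbols.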

In order to avoid some errors, I have also modified the Compilers/Linux-ifort.mk as follows:
LIBS := -L$(NETCDF_LIBDIR) -lnetcdff -lnetcdf
#FFLAGS += -Wl,-stack_size,0x64000000

The last step is to run ROMS, and here comes the trouble. The ROMS wiki tells me to type "mpirun -np 40 oceanM ocean_upwelling.in" to run in parallel (distributed memory) on 40 processors, but on the supercomputer I have to use Slurm. I need to write a script job.sh, and its content is as follows:
#!/bin/bash
#SBATCH -n 40
srun -n 40 oceanM ocean_upwelling.in

Then I need to type "sbatch -p paratera job.sh" to run ROMS; paratera is the queue name (for example, pg2_64_pool). Another point to mention is that I need to modify ocean_upwelling.in so that NtileI*NtileJ equals 40. The job ran for a short time, and in the output log file slurm-xxx.out an error occurred, as follows:


I want to know the cause of the error and how to correct it. Any advice would be appreciated.
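For reference, NtileI and NtileJ are the tile counts in the I and J directions, and their product must equal the MPI task count. The valid pairs for 40 tasks can be listed with a short shell loop (a sketch; which pair runs fastest depends on the grid dimensions Lm and Mm of the case):

```shell
#!/bin/sh
# Print every (NtileI, NtileJ) factor pair whose product is the task count.
n=40
for i in $(seq 1 $n); do
  if [ $((n % i)) -eq 0 ]; then
    echo "NtileI = $i   NtileJ = $((n / i))"
  fi
done
```

A mismatch between NtileI*NtileJ and the number of MPI tasks is itself a common cause of ROMS aborting right after startup.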

PostPosted: Thu May 24, 2018 3:18 pm 

Joined: Wed Jul 02, 2003 5:29 pm
Posts: 3633
Location: IMS/UAF, USA
I don't know what your error is. I assume you have talked to your local supercomputer people? We use Slurm too, and here is a job script:

#SBATCH -t 144:00:00
#SBATCH --ntasks=192
#SBATCH --job-name=ARCTIC4
#SBATCH --tasks-per-node=24
#SBATCH -p t2standard
#SBATCH --account=akwaters
#SBATCH --output=ARCTIC4.%j
#SBATCH --no-requeue

. /usr/share/Modules/init/bash
module purge
module load slurm
module load toolchain/pic-iompi/2016b
module load numlib/imkl/
module load toolchain/pic-intel/2016b
module load compiler/icc/2016.3.210-GCC-5.4.0-2.26
module load compiler/ifort/2016.3.210-GCC-5.4.0-2.26
module load openmpi/intel/1.10.4
module load data/netCDF-Fortran/4.4.4-pic-intel-2016b
module list

#  Prolog
echo " "
echo "++++ Chinook ++++ $PGM_NAME began:    `date`"
echo "++++ Chinook ++++ $PGM_NAME hostname: `hostname`"
echo "++++ Chinook ++++ $PGM_NAME uname -a: `uname -a`"
echo " "
TBEGIN=`echo "print time();" | perl`

srun -l /bin/hostname | sort -n | awk '{print $2}' > ./nodes
mpirun -np $SLURM_NTASKS -machinefile ./nodes --mca mpi_paffinity_alone 1 ./oceanM ocean_arctic4.in

#  Epilog
TEND=`echo "print time();" | perl`
echo " "
echo "++++ Chinook ++++ $PGM_NAME pwd:      `pwd`"
echo "++++ Chinook ++++ $PGM_NAME ended:    `date`"
echo "++++ Chinook ++++ $PGM_NAME walltime: `expr $TEND - $TBEGIN` seconds"
