Parallel simulation of ROMS killed after using too much RAM

rgawde
Posts: 11
Joined: Sat Oct 10, 2015 1:04 am
Location: UMCES Horn Point Lab

Hello,

I am trying to run ROMS for the Choptank River system. The model was compiled with ifort, and I set up a test run for the year 2010. The simulation runs without problems in serial mode, but when I run it in parallel with the command line

mpirun -np 8 ./oceanM choproms.in > myrun.log &

the model runs successfully for about 24,000 time steps. After that point, the run is killed with the following error:

24778 11051.86782 1.087116E-02 5.255789E+01 5.256876E+01 9.304978E+09 0
24779 11051.86794 1.088490E-02 5.255786E+01 5.256874E+01 9.304784E+09 0
24780 11051.86806 1.089875E-02 5.255783E+01 5.256873E+01 9.304592E+09 0
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 25824 on node cbeps2 exited on signal 9 (Killed).

This was accompanied by the message "memory space (RAM) on server exceeded". The server has 128 GB of RAM and 24 processors. The problem seems to be specific to parallel mode, but I'm not sure how to proceed. Could anyone please help with this error? Has anyone else experienced this before?
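
One possible first diagnostic is to sample the resident memory of the oceanM processes while the job runs. This would show whether memory grows steadily with the time step count (which would suggest a leak) or stays flat until a sudden jump. The following is only a minimal sketch: it assumes a Linux server that exposes /proc, and the function names, the 60-second interval, and the sample count are illustrative choices, not part of ROMS or MPI.

```python
import os
import re
import time


def parse_vmrss_kb(status_text):
    """Extract the resident set size (kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def find_pids(name):
    """Return PIDs whose command name equals `name` (assumes Linux /proc)."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited between listing and reading
    return sorted(pids)


def sample_rss_kb(name):
    """Return {pid: rss_kb} for every running process called `name`."""
    rss = {}
    for pid in find_pids(name):
        try:
            with open(f"/proc/{pid}/status") as f:
                value = parse_vmrss_kb(f.read())
        except OSError:
            continue
        if value is not None:
            rss[pid] = value
    return rss


def monitor(name="oceanM", interval_s=60, samples=10):
    """Print per-process and total RSS once every `interval_s` seconds."""
    for _ in range(samples):
        rss = sample_rss_kb(name)
        total_gb = sum(rss.values()) / 1024**2  # kB -> GB
        print(time.strftime("%H:%M:%S"), rss, f"total={total_gb:.2f} GB")
        time.sleep(interval_s)
```

Calling `monitor("oceanM")` from a second terminal while mpirun is active prints one line per sample. If the total climbs toward 128 GB as the step count increases, the growth rate per step gives a rough idea of how much memory is being lost and in which rank.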

Thanks!
