Ocean Modeling Discussion

ROMS/TOMS


All times are UTC

 Post subject: ROMS runtime error
PostPosted: Wed Nov 08, 2017 9:24 am 

Joined: Mon Jan 27, 2014 9:50 pm
Posts: 20
Location: Indian Institute of Science
Hi ROMS users

I have been trying to run a ROMS application in parallel, but after I submit the job to the cluster, it leaves the queue after running for 1 second. I checked the log file, which shows the following message:

=====================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 15
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)

I checked with my cluster admin and there is nothing wrong with the job submit script. Please help.
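
For reference, the "15" in this log is a signal number, not a ROMS error code. A minimal Python sketch that maps it to a name (the helper `describe_exit` is made up for illustration):

```python
import signal

def describe_exit(code: int) -> str:
    """Return the signal name matching a small exit code, if there is one."""
    try:
        return signal.Signals(code).name
    except ValueError:
        return f"no signal numbered {code}"

print(describe_exit(15))  # SIGTERM on Linux
```

SIGTERM is a termination request sent from outside the process, typically by the MPI launcher or the batch system, so this message by itself suggests the model was killed rather than that it crashed on its own.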


 Post subject: Re: ROMS runtime error
PostPosted: Wed Nov 08, 2017 4:25 pm 

Joined: Wed Jul 02, 2003 5:29 pm
Posts: 3251
Location: IMS/UAF, USA
What does the ROMS output look like? Did you get any?


 Post subject: Re: ROMS runtime error
PostPosted: Wed Nov 08, 2017 5:20 pm 

Joined: Mon Jan 27, 2014 9:50 pm
Posts: 20
Location: Indian Institute of Science
I didn't get any output. The log file is supposed to show something like this:
--------------------------------------------------------------------------------
Model Input Parameters: ROMS/TOMS version 3.7
Wednesday - November 8, 2017 - 2:50:07 PM
--------------------------------------------------------------------------------


Instead, all I get is the message
=====================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 15
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)

The cluster admin says the job submission script is submitting the job successfully, so the problem is with running the model.


 Post subject: Re: ROMS runtime error
PostPosted: Wed Nov 08, 2017 6:06 pm 

Joined: Wed Jul 02, 2003 5:29 pm
Posts: 3251
Location: IMS/UAF, USA
ROMS is normally quite verbose about what went wrong with it. It starts writing to stdout very early in the job. You're saying it gets killed even before that, with no ROMS output at all. Very odd.


 Post subject: Re: ROMS runtime error
PostPosted: Thu Nov 09, 2017 4:21 am 

Joined: Mon Jan 27, 2014 9:50 pm
Posts: 20
Location: Indian Institute of Science
Yes Kate. That's exactly what's happening. Please help.


 Post subject: Re: ROMS runtime error
PostPosted: Thu Nov 09, 2017 6:57 pm 

Joined: Wed Jul 02, 2003 5:29 pm
Posts: 3251
Location: IMS/UAF, USA
Did it used to run? Are you sure you have the right syntax on the ROMS execute line in your script? Can we see that?


 Post subject: Re: ROMS runtime error
PostPosted: Fri Nov 10, 2017 4:52 am 

Joined: Mon Jan 27, 2014 9:50 pm
Posts: 20
Location: Indian Institute of Science
Please find my job submit script in the attachment. It used to run, albeit with a slightly different execute line.


Attachments:
submit_new.sh [486 Bytes]
 Post subject: Re: ROMS runtime error
PostPosted: Fri Nov 10, 2017 5:52 am 

Joined: Wed Jul 02, 2003 5:29 pm
Posts: 3251
Location: IMS/UAF, USA
Try taking out the blank line #2. I don't know if you're allowed to have a blank line there. Some queueing systems end the batch commands on the first blank line.
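
A quick way to check a script for this is to list any `#PBS` lines that come after the first blank line. The sketch below is only an illustration: the function name is made up, and the example script is hypothetical, not the attached submit_new.sh (its directives, the `oceanM` executable name, and the input file name are assumptions).

```python
def directives_after_blank(script_text: str) -> list[str]:
    """Return the #PBS lines that appear after the first blank line."""
    seen_blank = False
    late = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if not stripped:
            seen_blank = True
        elif seen_blank and stripped.startswith("#PBS"):
            late.append(stripped)
    return late

# Hypothetical job script with a blank line on line 2 (assumed content).
example = """#!/bin/bash

#PBS -N UPWELLING
#PBS -l nodes=1:ppn=8
cd $PBS_O_WORKDIR
mpiexec -np 8 ./oceanM ocean_upwelling.in
"""

print(directives_after_blank(example))
# ['#PBS -N UPWELLING', '#PBS -l nodes=1:ppn=8']
```

An empty list means every directive sits in one unbroken block at the top of the script. Any line it does report is one that a queueing system of the kind described above might skip.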


 Post subject: Re: ROMS runtime error
PostPosted: Sat Nov 11, 2017 6:43 am 

Joined: Mon Jan 27, 2014 9:50 pm
Posts: 20
Location: Indian Institute of Science
I removed the blank lines, but the log is still the same. I ran tracejob on the job ID; the output is as follows:

[casparga@tyrone-cluster upwelling]$ tracejob 78891
/var/spool/torque/server_priv/accounting/20171111: Permission denied

Job: 78891.tyrone-cluster

11/11/2017 12:29:47 S enqueuing into batch, state 1 hop 1
11/11/2017 12:29:47 S dequeuing from batch, state QUEUED
11/11/2017 12:29:47 S enqueuing into idqueue, state 1 hop 1
11/11/2017 12:29:47 S Job Queued at request of casparga@tyrone-cluster, owner = casparga@tyrone-cluster, job name = UPWELLING, queue = idqueue
11/11/2017 12:29:47 S Job Modified at request of Scheduler@tyrone-cluster
11/11/2017 12:29:47 S Exit_status=2 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:00
11/11/2017 12:29:47 L Job Run
11/11/2017 12:29:47 S Job Run at request of Scheduler@tyrone-cluster
11/11/2017 12:29:47 S Not sending email: User does not want mail of this type.
11/11/2017 12:29:47 S Not sending email: User does not want mail of this type.
11/11/2017 12:29:47 S dequeuing from idqueue, state COMPLETE
11/11/2017 12:29:47 M scan_for_terminated: job 78891.tyrone-cluster task 1 terminated, sid=7807
11/11/2017 12:29:47 M job was terminated
11/11/2017 12:29:47 M obit sent to server
11/11/2017 12:29:47 M removed job script
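
The most informative line in this trace is probably the `Exit_status` record. A small Python sketch for pulling its fields out (the function name is made up; the sample line is copied from the trace above):

```python
import re

def parse_exit_record(line: str) -> dict[str, str]:
    """Extract the key=value fields from a tracejob 'Exit_status' line."""
    return dict(re.findall(r"(\S+?)=(\S+)", line))

record = ("11/11/2017 12:29:47 S Exit_status=2 resources_used.cput=00:00:00 "
          "resources_used.mem=0kb resources_used.vmem=0kb "
          "resources_used.walltime=00:00:00")

fields = parse_exit_record(record)
print(fields["Exit_status"], fields["resources_used.walltime"])
# 2 00:00:00
```

A non-zero exit status with zero CPU time and zero wall time is consistent with the job script failing almost immediately, before the model had a chance to start, though the trace alone does not say which command failed.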


 Post subject: Re: ROMS runtime error
PostPosted: Sat Nov 11, 2017 7:21 am 

Joined: Wed Jul 02, 2003 5:29 pm
Posts: 3251
Location: IMS/UAF, USA
Quote:
/var/spool/torque/server_priv/accounting/20171111: Permission denied

Have you talked to your supercomputer people? I don't think this is anything to do with ROMS. Maybe you should let it send you email, in case that is more verbose.


 Post subject: Re: ROMS runtime error
PostPosted: Tue Nov 14, 2017 9:44 am 

Joined: Mon Jan 27, 2014 9:50 pm
Posts: 20
Location: Indian Institute of Science
Sorry for the late reply. I told my cluster admin that there is no problem with the model. After resolving an issue with passwordless login, I tried to run the model and got the following output in the log file:

[mpiexec@tyrone-node16] HYD_pmcd_pmiserv_send_signal (./pm/pmiserv/pmiserv_cb.c:184): assert (!closed) failed
[mpiexec@tyrone-node16] ui_cmd_cb (./pm/pmiserv/pmiserv_pmci.c:74): unable to send SIGUSR1 downstream
[mpiexec@tyrone-node16] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@tyrone-node16] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec@tyrone-node16] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion


Is there a problem with the compilation of the model?
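
Since the earlier failure involved passwordless login, it may be worth confirming that it now works on every node given to the job before suspecting the build. The sketch below assumes this MPICH installation starts its remote processes over ssh; both function names are made up for illustration.

```python
import subprocess

def ssh_check_command(host: str) -> list[str]:
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]

def hosts_needing_password(hosts, run=subprocess.run):
    """Return the hosts where a non-interactive ssh login did not succeed."""
    failed = []
    for host in sorted(set(hosts)):
        result = run(ssh_check_command(host), capture_output=True)
        if result.returncode != 0:
            failed.append(host)
    return failed
```

Inside a Torque job, the host names could be read from the file named by the `PBS_NODEFILE` environment variable and passed to `hosts_needing_password`. An empty result means every node accepted a login without a prompt.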


 Post subject: Re: ROMS runtime error
PostPosted: Tue Nov 14, 2017 5:50 pm 

Joined: Wed Jul 02, 2003 5:29 pm
Posts: 3251
Location: IMS/UAF, USA
You can search the web for answers to things like this. Here's one match which might be useful.

