Ocean Modeling Discussion



All times are UTC

Posted: Thu Feb 08, 2007 8:27 pm

Joined: Sun Jul 27, 2003 6:49 pm
Posts: 86
Location: UNH, USA
Hi all-

  I have experimented with running ROMS 2.2 on an iMac with a 2 GHz
  Intel Core Duo processor and 667 MHz DDR2 RAM.  I used free compilers
  installed with fink.  As you will see, I suspect that I would do
  better to use the Intel compiler, but I have not done so since for
  large runs I use my Linux cluster.

  As a quick conclusion, I find that MPI works best.  I suspect my
  results differ from Alexander Schepetkin's because he used an Intel
  compiler, which does a _much_ better job with OpenMP.

  My test model is a 349x219x20 grid, running with NDTFAST=20 for 20
  timesteps.  The relative size of the grid, and the size of NDTFAST,
  will affect these results, as they control the patterns of
  communication between threads and processes.

  I used both the g95 and gfortran compilers, and I parallelized the
  code with both OpenMP and MPI.   These results are by no means
  exhaustive, but should get you up and running.

  I installed most of the software with fink, from the unstable/main
  branch for OS X 10.4.  It should not be hard to install your own
  software without fink, or to use the precompiled versions of these
  compilers from the HPC project.

  G95: (version 0.90, compiled with gcc 4.0.3).
        1) install g95 and netcdf with fink.
        2) make g95 the fortran compiler in the ROMS makefile
        3) cp External/Linux-g95.mk to External/Darwin-g95.mk
        4) change netcdf paths in Darwin-g95.mk to your appropriate
        location, likely
              NETCDF_INCDIR ?= /sw/include
              NETCDF_LIBDIR ?= /sw/lib
        5) set FFLAGS += -O3 -ffast-math
        6) compile with "make -j2".  The -j2 launches 2 compile jobs at
        once, one for each core, which is much faster.
        7) it will fail at mod_strings.f90.  You have two choices:
                a) add quotes to the part of the code which should read
                    character (len=80) :: my_os = 'Darwin'
                    character (len=80) :: my_cpu = '1'
                    character (len=80) :: my_fort = 'g95'
                    character (len=80) :: my_fc = 'g95'
                    character (len=160) :: my_fflags =' '
                b) fix the makefile so this does not happen, and so that
                it does not break the Linux compiles with the same
                makefile.  We will praise you with great praise!  I
                actually have not put much effort into it.
         8) run "make -j2" again
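
Steps 3 and 4 can be scripted.  What follows is only a minimal sketch: it assumes the netcdf lines in Linux-g95.mk have the "NAME ?= value" form shown in step 4, and the function name make_darwin_mk is made up for illustration (it is not part of ROMS).

```shell
# make_darwin_mk SRC DST INCDIR LIBDIR
# Copy a ROMS compiler include file from SRC to DST and point its
# NETCDF_INCDIR / NETCDF_LIBDIR lines at INCDIR / LIBDIR.
# Hypothetical helper, not part of ROMS; assumes "NAME ?= value" lines.
make_darwin_mk() {
    src=$1; dst=$2; incdir=$3; libdir=$4
    sed -e "s|^\([[:space:]]*NETCDF_INCDIR[[:space:]]*?=\).*|\1 ${incdir}|" \
        -e "s|^\([[:space:]]*NETCDF_LIBDIR[[:space:]]*?=\).*|\1 ${libdir}|" \
        "$src" > "$dst"
}

# Example, using the fink paths from step 4:
#   make_darwin_mk External/Linux-g95.mk External/Darwin-g95.mk /sw/include /sw/lib
```

All other lines of the file are copied unchanged, so the FFLAGS edit in step 5 still has to be made by hand.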

        It ran in 198.967 seconds.  Playing with the tiling does not help.

 GFORTRAN (part of GCC 4.2)

        1) install with fink.  "sudo fink install gcc4"
        2) change the compiler to gfortran in the makefile
        3) make netcdf for this compiler.
             a) snag the netcdf sources,
             b) untar, and run ./configure in the src directory
             c) change -Df2cFortran in macros.make to -DpgiFortran
             d) make
        4) cp Linux-gfortran.mk to Darwin-gfortran.mk.  Set the netcdf
        variables to point to the libraries you compiled in step 3.
        5) set FFLAGS += -O3 -ftree-vectorize -msse3
        6) do steps (6)-(8) of the G95 description above.
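
The edit in step 3c can be done with sed rather than by hand.  This is a sketch only: the function name use_pgi_fortran_macro is made up for illustration, and it assumes the flag appears in macros.make literally as -Df2cFortran.

```shell
# use_pgi_fortran_macro FILE
# Replace every -Df2cFortran with -DpgiFortran in FILE (normally the
# macros.make written by ./configure).  Hypothetical helper name.
use_pgi_fortran_macro() {
    file=$1
    sed 's/-Df2cFortran/-DpgiFortran/g' "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Example, run in the netcdf src directory after ./configure and before make:
#   use_pgi_fortran_macro macros.make
```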

        It runs in 181 seconds.  This is not as fast as I would
        expect using the SSE3 optimizations.  I suspect this is
        because it fails to vectorize most of the loops.  If you set
        the verbose flag for vectorization, you will find that it has
        trouble with the vast majority of the loops.  I have not yet
        figured out how to fix this.   The Intel compiler does much
        better on my Linux AMD-Opteron codes.

        Playing with the tiling does not help much.

 GFORTRAN with OpenMP

        1) as above, but add -fopenmp to the FFLAGS, turn on OpenMP in
           the top level makefile and change the NtileI and NtileJ in
           the external file to take advantage of the multiple
           threads.  Set OMP_NUM_THREADS in the environment to use as
           many threads as you would like.  I used 2.

        This fails: gfortran can't handle the directive "OMP
        THREADPRIVATE (/process/)" in mod_parallel.F.  If I fix this
        with a hack, it runs slower than without OpenMP.  This is
        contrary to Alexander Schepetkin's results with ifort, so I am
        inclined to blame the compiler.
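
Before launching an OpenMP run it can help to check that the tiling matches the thread count.  The sketch below assumes, as I understand the ROMS input conventions, that in shared-memory runs the product NtileI*NtileJ should be a multiple of OMP_NUM_THREADS; the function name tiles_fit_threads is made up for illustration.

```shell
# tiles_fit_threads NTILEI NTILEJ NTHREADS
# Succeeds (exit status 0) when NtileI*NtileJ is a positive multiple
# of NTHREADS.  Hypothetical helper, not part of ROMS.
tiles_fit_threads() {
    tiles=$(( $1 * $2 ))
    [ "$3" -gt 0 ] && [ "$tiles" -gt 0 ] && [ $(( tiles % $3 )) -eq 0 ]
}

export OMP_NUM_THREADS=2    # two threads, one per core on the Core Duo
if tiles_fit_threads 1 2 "$OMP_NUM_THREADS"; then
    echo "tiling 1x2 is usable with $OMP_NUM_THREADS threads"
fi
```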


 GFORTRAN with MPI

        1) as with GFORTRAN above but:
        2) use fink to install openmpi
        3) change makefile to turn on MPI compiling.
        4) change "FC := gfortran" in Darwin-gfortran.mk to "FC :=
        mpif90" (this is not quite kosher, but since I sometimes run
        MPI with different compilers and libraries on the same
        machine, it is how I do it).
        5) change NtileI and NtileJ so that one is 1 and the other is 2;
        experiment to see which works best.
        6) change the command that runs the model to look something like

                om-mpirun -np 2 ${PWD}/oceanM  ${PWD}/external/ocean_whatever.in

        It runs, on my machine, in 121 seconds.
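
To put the timings side by side, here is a small sketch that computes the speedup of the 2-process MPI run over the serial runs reported above.  The function name ratio is made up for illustration; the numbers are the wall-clock times from this post.

```shell
# ratio A B: print A/B rounded to two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

serial=181    # gfortran, one process (seconds)
mpi=121       # gfortran + openmpi, two processes (seconds)

speedup=$(ratio "$serial" "$mpi")
efficiency=$(ratio "$speedup" 2)    # speedup divided by number of cores
echo "speedup: $speedup  efficiency per core: $efficiency"
```

That is a speedup of about 1.5 on two cores, or roughly 75% parallel efficiency; against the g95 serial time of 198.967 seconds the ratio is about 1.64.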

I hope this helps someone.

Jamie Pringle

Posted: Fri Feb 09, 2007 3:10 pm

Joined: Thu Jul 03, 2003 3:39 pm
Posts: 79
Location: TAMU,USA
If you hate fink as much as I do (I _really_ hate fink), you can also install gfortran very easily from the binary distribution found at


I have used these versions successfully for a few years now.

