ROMS on an Intel Mac with a Core Duo and free compilers

jpringle
Posts: 86
Joined: Sun Jul 27, 2003 6:49 pm
Location: UNH, USA

#1 Post by jpringle » Thu Feb 08, 2007 8:27 pm

Hi all-

  I have experimented with running ROMS 2.2 on an iMac with a 2 GHz Intel
  Core Duo processor and 667 MHz DDR2 RAM.  I used free compilers installed
  with fink.  As you will see, I suspect that I would do better with
  the Intel compiler, but I have not tried it, since for large runs I
  use my Linux cluster. 

  As a quick conclusion, I find that MPI works best.  I suspect my
  results differ from Alexander Schepetkin's because he used an Intel
  compiler, which does a _much_ better job with OpenMP. 

  My test model is a 349x219x20 grid, running with NDTFAST=20 for 20
  timesteps.  The relative size of the grid, and the size of NDTFAST,
  will affect these results, as they control the patterns of
  communication between threads and processes. 

  I used both the g95 and gfortran compilers, and I parallelized the
  code with both OpenMP and MPI.   These results are by no means
  exhaustive, but they should get you up and running. 

  I installed most of the software with fink, from the unstable/main
  branch for OS X 10.4.  It should not be hard to install your own
  software without fink, or to use the precompiled versions of these
  compilers from the HPC project.

  G95: (version 0.90, compiled with gcc 4.0.3). 
  
        1) install g95 and netcdf with fink.
        2) make g95 the fortran compiler in the ROMS makefile
        3) cp External/Linux-g95.mk to External/Darwin-g95.mk
        4) change netcdf paths in Darwin-g95.mk to your appropriate
        location, likely 
              NETCDF_INCDIR ?= /sw/include
              NETCDF_LIBDIR ?= /sw/lib
        5) set FFLAGS += -O3 -ffast-math
        6) compile with "make -j2".  The -j2 launches 2 compile jobs at
        once, one for each core, and it is much faster. 
        7) it will fail at mod_strings.f90.  You have two choices:
                1) add the missing quotes in that file, so that the
                code reads:
                    character (len=80) :: my_os = 'Darwin'
                    character (len=80) :: my_cpu = '1'
                    character (len=80) :: my_fort = 'g95'
                    character (len=80) :: my_fc = 'g95'
                    character (len=160) :: my_fflags =' ' 
                2) fix the makefile so this does not happen, and does
                not break the linux compiles with the same makefile.  We will
                praise you with great praise!  I actually have not put
                much effort into it.
         8) run "make -j2" again

        It ran in 198.967 seconds.  Playing with the tiling does not help.
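
        Steps 3-5 above can be scripted.  What follows is only a sketch:
        the function name is made up, it assumes the netCDF variables
        are set with "?=" as shown in step 4, and it appends the FFLAGS
        line at the end of the file rather than editing any FFLAGS
        setting already in it, which is a shortcut.

```shell
#!/bin/sh
# Sketch only: make a Darwin copy of the Linux g95 rules file and point it
# at fink's netCDF. make_darwin_g95_mk is a hypothetical helper name.
make_darwin_g95_mk() {
    mk_dir="$1"                       # directory holding the *.mk files
    netcdf_prefix="${2:-/sw}"         # fink's default install prefix
    src="$mk_dir/Linux-g95.mk"
    dst="$mk_dir/Darwin-g95.mk"

    cp "$src" "$dst" || return 1
    # Rewrite the two netCDF paths; write to a temp file so this works
    # with both BSD and GNU sed.
    sed -e "s|^\([[:space:]]*NETCDF_INCDIR[[:space:]]*?=\).*|\1 $netcdf_prefix/include|" \
        -e "s|^\([[:space:]]*NETCDF_LIBDIR[[:space:]]*?=\).*|\1 $netcdf_prefix/lib|" \
        "$dst" > "$dst.tmp" && mv "$dst.tmp" "$dst" || return 1
    # Append the optimization flags from step 5 (capital letter O, not zero).
    printf 'FFLAGS += -O3 -ffast-math\n' >> "$dst"
}
```

        From the top of the ROMS tree this would be called as
        "make_darwin_g95_mk External".  The Linux file is left
        untouched, so the Linux builds keep working.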

 GFORTRAN (part of GCC 4.2)

        1) install with fink.  "sudo fink install gcc4"
        2) change the compiler to gfortran in the makefile
        3) make netcdf for this compiler.
             a) snag the netcdf sources,
             b) untar, and run ./configure in the src directory
             c) change -Df2cFortran in macros.make to -DpgiFortran
             d) make 
        4) cp Linux-gfortran.mk to Darwin-gfortran.mk.  Set the netcdf
        variables to point to the libraries you compiled in step 3.
        5) set FFLAGS += -O3 -ftree-vectorize -msse3 
        6) do steps (6)-(8) of the G95 description above. 

        It runs in 181 seconds.  This is not as fast as I would
        expect using the SSE3 optimizations.  I suspect this is
        because it fails to vectorize most of the loops.  If you set
        the verbose flag for vectorization, you will find that it has
        trouble with the vast majority of the loops.  I have not yet
        figured out how to fix this.   The Intel compiler does much
        better on my Linux AMD-Opteron codes.   

        Playing with the tiling does not help much. 
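
        Step 3c above is a one-line substitution in the macros.make
        that ./configure writes.  A sketch, with a made-up function
        name, that keeps a copy of the original file:

```shell
#!/bin/sh
# Sketch only: swap the Fortran name-mangling macro in netCDF's generated
# macros.make (step 3c). Run it after ./configure and before make.
# patch_netcdf_macros is a hypothetical helper name.
patch_netcdf_macros() {
    src_dir="$1"                      # netCDF src directory where ./configure ran
    macros="$src_dir/macros.make"
    [ -f "$macros" ] || { echo "no macros.make in $src_dir" >&2; return 1; }
    cp "$macros" "$macros.orig"       # keep the configure output for comparison
    sed 's/-Df2cFortran/-DpgiFortran/g' "$macros.orig" > "$macros"
}
```

        Run it as "patch_netcdf_macros ." from the netCDF src
        directory, then run make as in step 3d.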

GFORTRAN with openmp.

        1) as above, but add -fopenmp to the FFLAGS, turn on OpenMP in
           the top level makefile and change the NtileI and NtileJ in
           the external file to take advantage of the multiple
           threads.  Set OMP_NUM_THREADS in the environment to use as
           many threads as you would like.  I used 2.

        This fails: the compiler can't handle the directive "OMP
        THREADPRIVATE (/process/)" in mod_parallel.F.  If I fix this
        with a hack, it runs slower than without OpenMP.  This is
        contrary to Alexander Schepetkin's results with ifort, so I am
        inclined to blame the compiler. 
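
        A quick check of the tiling against the thread count can save
        a failed run.  The sketch below assumes the shared-memory rule
        stated in the comments of the ROMS input file, that
        NtileI*NtileJ must be a multiple of the number of threads.
        The function name is made up.

```shell
#!/bin/sh
# Sketch only: check a tiling against the OpenMP thread count before a run.
# Assumed rule: NtileI*NtileJ must be a multiple of the number of threads.
# check_omp_tiling is a hypothetical helper name.
check_omp_tiling() {
    ntile_i="$1"; ntile_j="$2"; threads="$3"
    tiles=$((ntile_i * ntile_j))
    if [ $((tiles % threads)) -ne 0 ]; then
        echo "NtileI*NtileJ=$tiles is not a multiple of $threads threads" >&2
        return 1
    fi
    echo "ok: $tiles tiles on $threads threads"
}

OMP_NUM_THREADS=2; export OMP_NUM_THREADS   # two threads, one per core
check_omp_tiling 1 2 "$OMP_NUM_THREADS"     # prints "ok: 2 tiles on 2 threads"
```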

GFORTRAN with MPI

        1) as with GFORTRAN above but:
        2) use fink to install openmpi
        3) change makefile to turn on MPI compiling.
        4) change "FC := gfortran" in Darwin-gfortran.mk to "FC :=
        mpif90" (this is not quite kosher, but since I sometimes run
        MPI with different compilers and libraries on the same
        machine, it is how I do it). 
        5) change NtileI and NtileJ so that one is 1 and the other is 2;
        experiment to see which works best.
        6) change your command to run the model to look something
        like:

                om-mpirun -np 2 ${PWD}/oceanM  ${PWD}/external/ocean_whatever.in

        It runs, on my machine, in 121 seconds. 
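
        To put the timings side by side: speedup is the serial time
        divided by the parallel time, and parallel efficiency is the
        speedup divided by the number of processes.  A small sketch
        using the gfortran numbers above (181 s serial, 121 s on 2 MPI
        processes); the function names are made up.

```shell
#!/bin/sh
# Sketch only: speedup and parallel efficiency from the wall-clock times above.
speedup() {
    # $1 = serial seconds, $2 = parallel seconds
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.2f\n", s / p }'
}
efficiency() {
    # $1 = serial seconds, $2 = parallel seconds, $3 = number of processes
    awk -v s="$1" -v p="$2" -v n="$3" 'BEGIN { printf "%.2f\n", s / (p * n) }'
}

speedup 181 121        # prints 1.50
efficiency 181 121 2   # prints 0.75
```

        So the second core buys about a 1.5x speedup over serial
        gfortran, or about 75% parallel efficiency.  Against the g95
        run (198.967 s) the same calculation gives a speedup of 1.64.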

I hope this helps someone. 

Jamie Pringle
 

hetland
Posts: 79
Joined: Thu Jul 03, 2003 3:39 pm
Location: TAMU,USA

#2 Post by hetland » Fri Feb 09, 2007 3:10 pm

If you hate fink as much as I do (I _really_ hate fink), you can also install gfortran very easily from the binary distribution found at

http://hpc.sourceforge.net

I have used these versions successfully for a few years now.

-r
