﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
552	IMPORTANT: OpenMP shared-memory directives Revisited	arango	arango	"This update includes a full revision of the ROMS shared-memory pragma directives using the '''OpenMP''' standard.  This is a very important and delicate update that requires expertise.  Luckily, I doubt that it will affect your customized code.

All the parallel loops of ROMS are modified to simpler directives.  For example, the old strategy:
{{{
!$OMP PARALLEL DO PRIVATE(thread,subs,tile) SHARED(numthreads)
            DO thread=0,numthreads-1
              subs=NtileX(ng)*NtileE(ng)/numthreads
              DO tile=subs*thread,subs*(thread+1)-1,+1
                ...
              END DO
            END DO
!$OMP END PARALLEL DO
}}}
is replaced with:
{{{
            DO tile=first_tile(ng),last_tile(ng),+1
              ...
            END DO
!$OMP BARRIER
}}}
In shared-memory, the parallel threads are spawned in higher-level calling routines.  For example, we now have:
{{{
!$OMP PARALLEL
      CALL main3d (RunInterval)
!$OMP END PARALLEL
}}}
This directive is less restrictive and allows '''MASTER''', '''BARRIER''', and other useful '''OpenMP''' pragmas inside the parallel region.  If you are interested, please see the following discussion in the [https://www.myroms.org/forum/viewtopic.php?f=19&t=2584 Forum].
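
As a minimal stand-alone sketch of this pattern (not ROMS code), the program below opens one parallel region in the caller and uses orphaned '''BARRIER''' and '''MASTER''' directives inside the called routine.  The names '''region_demo''', '''step''', '''work''', and '''Ntiles''' are hypothetical, and the '''MIN''' clamp on the last tile is an assumption added for safety, not part of the update:
{{{
      PROGRAM region_demo
!  Stand-alone sketch, not ROMS code: all names are hypothetical.
      implicit none
      integer, parameter :: Ntiles = 8
      integer :: work(0:Ntiles-1)
      work=0
!$OMP PARALLEL
      CALL step (work, Ntiles)
!$OMP END PARALLEL
      END PROGRAM region_demo

      SUBROUTINE step (work, Ntiles)
!$    USE omp_lib
      implicit none
      integer, intent(in) :: Ntiles
      integer, intent(inout) :: work(0:Ntiles-1)
      integer :: tile, nthr, me, chunk, first, last
!  Serial defaults; overridden only when compiled with OpenMP.
      nthr=1
      me=0
!$    nthr=omp_get_num_threads()
!$    me=omp_get_thread_num()
!  Contiguous tile range for this thread (MIN clamp is an assumption).
      chunk=(Ntiles+nthr-1)/nthr
      first=me*chunk
      last=MIN(first+chunk-1, Ntiles-1)
      DO tile=first,last
        work(tile)=tile+1
      END DO
!  Orphaned directives: legal because step is called from a
!  parallel region.
!$OMP BARRIER
!$OMP MASTER
      PRINT *, 'sum over tiles =', SUM(work)
!$OMP END MASTER
      END SUBROUTINE step
}}}
The barrier ensures every thread has finished its tiles before the master thread reads the whole array, so the printed sum should be 36 for any number of threads.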

This change cleans up the code and facilitates the parallelization of tricky algorithms for nesting, '''MPDATA''', random number generation, point-sources, etc. using the shared-memory paradigm.

'''WARNINGS:'''

  * The values of '''NtileX(ng)''' and '''NtileE(ng)''' are no longer equal to '''one''' in distributed-memory ('''MPI''').  They now have the same values as those specified in standard input for '''NtileI(ng)''' and '''NtileJ(ng)'''. Notice that in the critical regions for global reduction operations we now use the following code instead:
{{{
#ifdef DISTRIBUTE
      NSUB=1                             ! distributed-memory
#else
      IF (DOMAIN(ng)%SouthWest_Corner(tile).and.                        &
     &    DOMAIN(ng)%NorthEast_Corner(tile)) THEN
        NSUB=1                           ! non-tiled application
      ELSE
        NSUB=NtileX(ng)*NtileE(ng)       ! tiled application
      END IF
#endif
}}}
  That is, we make a special exception for distributed-memory.  This change is necessary in your customized versions of '''ana_grid.h''' and '''ana_psource.h'''.

  * Notice that a few important variables of ROMS in '''mod_scalars.F''' and '''mod_stepping.F''' use the '''THREADPRIVATE''' directive in shared-memory so that each parallel thread has a private copy of such variables, avoiding parallel collisions.

  * Two new variables ('''first_tile(ng)''' and '''last_tile(ng)''') are introduced to specify the tile range in each parallel region:
{{{
      integer, allocatable :: first_tile(:)
      integer, allocatable :: last_tile(:)

!$OMP THREADPRIVATE (first_tile, last_tile)
}}}
  These variables are set during the initialization of the ROMS kernel using:
{{{
!$OMP PARALLEL
#if defined _OPENMP
      MyThread=my_threadnum()
#elif defined DISTRIBUTE
      MyThread=MyRank
#else
      MyThread=0
#endif
      DO ng=1,Ngrids
        chunk_size=(NtileX(ng)*NtileE(ng)+numthreads-1)/numthreads
        first_tile(ng)=MyThread*chunk_size
        last_tile (ng)=first_tile(ng)+chunk_size-1
      END DO
!$OMP END PARALLEL
}}}
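
To make the arithmetic above concrete, here is a small serial sketch (not ROMS code) that evaluates the same formulas for each thread number in turn.  The tile counts and thread count are illustrative assumptions, and the scalar '''first_tile''' and '''last_tile''' stand in for the per-grid arrays:
{{{
      PROGRAM tile_ranges
!  Stand-alone sketch: the values below are illustrative only.
      implicit none
      integer, parameter :: NtileX = 2, NtileE = 4
      integer, parameter :: numthreads = 4
      integer :: MyThread, chunk_size, first_tile, last_tile
!  Ceiling division: tiles per thread, rounded up.
      chunk_size=(NtileX*NtileE+numthreads-1)/numthreads
      DO MyThread=0,numthreads-1
        first_tile=MyThread*chunk_size
        last_tile=first_tile+chunk_size-1
!  Prints: thread number, first tile, last tile.
        PRINT *, MyThread, first_tile, last_tile
      END DO
      END PROGRAM tile_ranges
}}}
With 8 tiles and 4 threads the chunk size is 2, so threads 0 through 3 get tiles 0-1, 2-3, 4-5, and 6-7.  By this arithmetic alone, a tile count that is not a multiple of the thread count would give the last thread a range extending past the final tile.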

Many thanks to Sasha Shchepetkin for [https://www.myroms.org/forum/viewtopic.php?f=19&t=2584 suggesting] this strategy. Also many thanks to Mark Hadfield for his persistence and testing.
 "	upgrade	new	major	Release ROMS/TOMS 3.6	Nonlinear	3.6			
