﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
552	IMPORTANT: OpenMP shared-memory directives Revisited	arango	arango	"This update is a full revision of the ROMS shared-memory pragma directives using the '''OpenMP''' standard.  It is an important and delicate update that requires expertise.  Luckily, I doubt that it will affect your customized code.

All the parallel loops in ROMS now use simpler directives.  For example, the old strategy:
{{{
!$OMP PARALLEL DO PRIVATE(thread,subs,tile) SHARED(numthreads)
            DO thread=0,numthreads-1
              subs=NtileX(ng)*NtileE(ng)/numthreads
              DO tile=subs*thread,subs*(thread+1)-1,+1
                ...
              END DO
            END DO
!$OMP END PARALLEL DO
}}}
is replaced with:
{{{
            DO tile=first_tile(ng),last_tile(ng),+1
              ...
            END DO
!$OMP BARRIER
}}}
In shared-memory applications, the parallel threads are now spawned in the higher-level calling routines (drivers).  For example, we now have:
{{{
!$OMP PARALLEL
      CALL main3d (RunInterval)
!$OMP END PARALLEL
}}}
This directive is less restrictive and allows '''MASTER''', '''BARRIER''', and other useful '''OpenMP''' pragmas inside the parallel regions.  If you are interested, please see the following discussion in the [https://www.myroms.org/forum/viewtopic.php?f=19&t=2584 Forum].

This change cleans up the code and facilitates the parallelization of tricky algorithms for nesting, '''MPDATA''', random number generation, point-sources, etc., using the shared-memory paradigm.
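
As a rough illustration of the pattern described above, the following self-contained sketch opens the parallel region in the driver and lets the called routine loop over its own range of tiles, followed by '''BARRIER''' and '''MASTER''' directives.  It is not ROMS code: the program and routine names, the tile count, and the '''work''' array are hypothetical, and the upper loop bound is clamped with '''MIN''' so that the sketch runs with any number of threads.
{{{
      PROGRAM orphan_demo
      implicit none
      integer, parameter :: Ntiles = 8     ! hypothetical tile count
      integer :: work(0:Ntiles-1)          ! visits per tile (shared)
      work = 0
!  Threads are spawned once, in the driver.
!$OMP PARALLEL
      CALL step (work, Ntiles)
!$OMP END PARALLEL
      IF (ANY(work.ne.1)) STOP 'tile coverage error'
      PRINT *, 'every tile was processed exactly once'
      END PROGRAM orphan_demo

      SUBROUTINE step (work, Ntiles)
!$    USE omp_lib
      implicit none
      integer, intent(in) :: Ntiles
      integer, intent(inout) :: work(0:Ntiles-1)
      integer :: tile, thread, nthreads, chunk, first, last
!  Serial defaults; overridden only when compiled with OpenMP.
      thread = 0
      nthreads = 1
!$    thread = omp_get_thread_num()
!$    nthreads = omp_get_num_threads()
!  Local variables are private to each thread inside the region.
      chunk = (Ntiles+nthreads-1)/nthreads
      first = thread*chunk
      last = MIN(first+chunk-1, Ntiles-1)
      DO tile=first,last,+1
        work(tile) = work(tile)+1
      END DO
!  Wait until every thread has finished its tiles.
!$OMP BARRIER
!  Only the master thread reports the total.
!$OMP MASTER
      PRINT *, 'tiles processed:', SUM(work)
!$OMP END MASTER
      END SUBROUTINE step
}}}
The sketch also compiles without '''OpenMP''', in which case the single thread processes all the tiles.  The '''BARRIER''' and '''MASTER''' directives inside '''step''' are only legal because the enclosing region is a plain '''PARALLEL''' region rather than a '''PARALLEL DO'''.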

'''WARNINGS:'''

  * The values of '''NtileX(ng)''' and '''NtileE(ng)''' are no longer equal to '''one''' in distributed-memory ('''MPI''').  They have the same values as the ones specified in standard input: '''NtileI(ng)''' and '''NtileJ(ng)'''. Notice that in the critical regions for global reduction operations, we now use the following code instead:
{{{
#ifdef DISTRIBUTE
      NSUB=1                             ! distributed-memory
#else
      IF (DOMAIN(ng)%SouthWest_Corner(tile).and.                        &
     &    DOMAIN(ng)%NorthEast_Corner(tile)) THEN
        NSUB=1                           ! non-tiled application
      ELSE
        NSUB=NtileX(ng)*NtileE(ng)       ! tiled application
      END IF
#endif
}}}
  That is, we make a special exception for distributed-memory cases.  This change is necessary in your customized versions of '''ana_grid.h''' and '''ana_psource.h'''.

  * Notice that we no longer use '''TILE''' (uppercase) as the argument to the kernel routines.  We use '''tile''' (lowercase) instead. The uppercase form mattered in previous versions of the distributed-memory code, where '''TILE''' was replaced with '''!MyRank''' during C-preprocessing.  Be careful with this one...

  * Notice that a few important variables in '''mod_scalars.F''' and '''mod_stepping.F''' use the '''THREADPRIVATE''' directive in shared-memory applications, so that every parallel thread has a private copy of these variables and parallel collisions are avoided.

  * Two new variables ('''first_tile(ng)''' and '''last_tile(ng)''') are introduced to specify the tile range in each parallel region:
{{{
      integer, allocatable :: first_tile(:)
      integer, allocatable :: last_tile(:)

!$OMP THREADPRIVATE (first_tile, last_tile)
}}}
  These variables are set during the initialization of the ROMS kernel using:
{{{
!$OMP PARALLEL
#if defined _OPENMP
      MyThread=my_threadnum()
#elif defined DISTRIBUTE
      MyThread=MyRank
#else
      MyThread=0
#endif
      DO ng=1,Ngrids
        chunk_size=(NtileX(ng)*NtileE(ng)+numthreads-1)/numthreads
        first_tile(ng)=MyThread*chunk_size
        last_tile (ng)=first_tile(ng)+chunk_size-1
      END DO
!$OMP END PARALLEL
}}}
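
To make the arithmetic above concrete, here is a small stand-alone sketch that evaluates the same '''chunk_size''', '''first_tile''', and '''last_tile''' formulas serially for every thread number.  The 4x2 tiling and the four threads are hypothetical values, chosen so that the tile count is a multiple of the thread count; with, say, 6 tiles and 4 threads the same formulas would assign tiles 6 to 7 to the last thread, which lie outside the valid range 0 to 5.
{{{
      PROGRAM tile_ranges
      implicit none
      integer, parameter :: NtileX = 4       ! hypothetical value
      integer, parameter :: NtileE = 2       ! hypothetical value
      integer, parameter :: numthreads = 4   ! hypothetical value
      integer :: thread, chunk_size, first, last, tile
      integer :: visits(0:NtileX*NtileE-1)
      visits = 0
!  Same rounded-up integer division as in the initialization code.
      chunk_size = (NtileX*NtileE+numthreads-1)/numthreads
      DO thread=0,numthreads-1
        first = thread*chunk_size
        last = first+chunk_size-1
        PRINT *, 'thread', thread, ' tiles', first, ' to', last
        DO tile=first,last,+1
          visits(tile) = visits(tile)+1
        END DO
      END DO
!  Each tile must belong to exactly one thread.
      IF (ANY(visits.ne.1)) STOP 'tile coverage error'
      END PROGRAM tile_ranges
}}}
With these values '''chunk_size''' is 2, so the four threads get tiles 0 to 1, 2 to 3, 4 to 5, and 6 to 7.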

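The '''NSUB''' value from the first warning can be read as the number of tiles that must report before a global reduction is complete.  The sketch below shows one way such a counter can be used inside a named critical region.  It is an assumption-laden illustration rather than the ROMS implementation: '''tile_count''', '''total''', '''my_sum''', and the critical-region name are hypothetical, and for brevity the tiles are distributed with a plain '''PARALLEL DO''' instead of the '''first_tile''' and '''last_tile''' ranges.
{{{
      PROGRAM nsub_reduction
      implicit none
      integer, parameter :: Ntiles = 8   ! stands in for NtileX*NtileE
      integer :: tile, NSUB
      integer :: tile_count = 0          ! hypothetical shared counter
      real :: total = 0.0                ! hypothetical global sum
      real :: my_sum
      NSUB = Ntiles                      ! tiled, shared-memory case
!$OMP PARALLEL DO PRIVATE(tile,my_sum) SHARED(total,tile_count,NSUB)
      DO tile=0,Ntiles-1
        my_sum = REAL(tile+1)            ! per-tile partial result
!  One thread at a time updates the shared sum and counter.
!$OMP CRITICAL (TOTAL_SUM)
        total = total+my_sum
        tile_count = tile_count+1
        IF (tile_count.eq.NSUB) THEN
!  The last tile to arrive sees the completed reduction.
          PRINT *, 'global sum =', total
          tile_count = 0
        END IF
!$OMP END CRITICAL (TOTAL_SUM)
      END DO
!$OMP END PARALLEL DO
      IF (ABS(total-36.0).gt.1.0E-6) STOP 'reduction error'
      END PROGRAM nsub_reduction
}}}
In a distributed-memory run each process would own a single tile, which is consistent with '''NSUB''' being set to 1 under '''DISTRIBUTE''' even though '''NtileX(ng)*NtileE(ng)''' is now larger than one.
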
Many thanks to Sasha Shchepetkin for [https://www.myroms.org/forum/viewtopic.php?f=19&t=2584 suggesting] this strategy. Many thanks also to Mark Hadfield for his persistence and testing.
 "	upgrade	closed	major	Release ROMS/TOMS 3.6	Nonlinear	3.6	Done		
