Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#552 closed upgrade (Done)

IMPORTANT: OpenMP shared-memory directives Revisited

Reported by: arango Owned by: arango
Priority: major Milestone: Release ROMS/TOMS 3.6
Component: Nonlinear Version: 3.6
Keywords: Cc:

Description (last modified by arango)

This update includes a full revision of the ROMS shared-memory directives using the OpenMP standard. This is a very important and delicate update that requires expertise. Luckily, I doubt that it will affect your customized code.

All the parallel loops in ROMS now use simpler directives. For example, the old strategy:

!$OMP PARALLEL DO PRIVATE(thread,subs,tile) SHARED(numthreads)
            DO thread=0,numthreads-1
              DO tile=subs*thread,subs*(thread+1)-1,+1
              END DO
            END DO

is replaced with:

            DO tile=first_tile(ng),last_tile(ng),+1
            END DO

In shared-memory applications, the parallel threads are now spawned in the higher-level calling routines (drivers). For example, we now have:

!$OMP PARALLEL
      CALL main3d (RunInterval)
!$OMP END PARALLEL

This directive is less restrictive and allows MASTER, BARRIER, and other useful OpenMP pragmas inside the parallel regions. If you are interested, please see the following discussion in the Forum.
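As a minimal sketch of this pattern (not ROMS code), the free-form Fortran program below opens the parallel region in a driver and uses MASTER and BARRIER inside a routine called from it. The names driver_sketch, step_sketch, and step_count are hypothetical and chosen only for illustration.

```fortran
program driver_sketch
  implicit none
  integer :: step_count          ! hypothetical shared counter

  step_count = 0

  ! The driver opens the parallel region; every thread calls the routine.
!$omp parallel shared(step_count)
  call step_sketch ()
!$omp end parallel

  print *, 'step_count =', step_count

contains

  subroutine step_sketch ()
    ! Orphaned directives: they bind to the parallel region opened by
    ! the caller, so no PARALLEL construct is needed in this routine.
!$omp master
    step_count = step_count + 1  ! executed by the master thread only
!$omp end master
!$omp barrier
    ! After the barrier, all threads see the updated counter.
  end subroutine step_sketch

end program driver_sketch
```

Compiled without OpenMP support, the directives are treated as comments and the program runs serially, which is one reason to keep the directives in the driver rather than around each loop.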

This change cleans up the code and facilitates the parallelization of tricky algorithms for nesting, MPDATA, random number generation, point sources, etc., using the shared-memory paradigm.


  • The values of NtileX(ng) and NtileE(ng) are no longer equal to one in distributed-memory (MPI). They have the same values as the ones specified in standard input: NtileI(ng) and NtileJ(ng). Notice that in the critical regions for global reduction operations, we now use the following code instead:
    #ifdef DISTRIBUTE
          NSUB=1                             ! distributed-memory
    #else
          IF (DOMAIN(ng)%SouthWest_Corner(tile).and.                        &
         &    DOMAIN(ng)%NorthEast_Corner(tile)) THEN
            NSUB=1                           ! non-tiled application
          ELSE
            NSUB=NtileX(ng)*NtileE(ng)       ! tiled application
          END IF
    #endif
    That is, we make a special exception for distributed-memory cases. This change is necessary in your customized versions of ana_grid.h and ana_psource.h.
  • Notice that we no longer use TILE (uppercase) as the argument to the kernel routines. We use tile (lowercase) instead. The uppercase form was important in previous versions of the distributed-memory code, where TILE was replaced with MyRank during C-preprocessing. Be careful with this one...
  • Notice that a few important variables in mod_scalars.F and mod_stepping.F use the THREADPRIVATE directive in shared-memory applications, so that each parallel thread has its own private copy of these variables and collisions between threads are avoided.
  • Two new variables (first_tile(ng) and last_tile(ng)) are introduced to specify the tile range in each parallel region:
          integer, allocatable :: first_tile(:)
          integer, allocatable :: last_tile(:)
    !$OMP THREADPRIVATE (first_tile, last_tile)
    These variables are set during the initialization of the ROMS kernel using:
    #if defined _OPENMP
    #elif defined DISTRIBUTE
          DO ng=1,Ngrids
            last_tile (ng)=first_tile(ng)+chunk_size-1
          END DO
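To make the tile-range idea concrete, here is a minimal free-form Fortran sketch, not the ROMS source. It assumes that the tiles of each grid are split into contiguous chunks of equal size (rounded up), one chunk per thread. The module name tile_range_mod, the local names MyThread and numthreads, the chunk_size formula, and the clamping with min are assumptions made for this illustration; the tile counts are made-up values.

```fortran
module tile_range_mod
  implicit none
  integer, allocatable :: first_tile(:)
  integer, allocatable :: last_tile(:)
!$omp threadprivate (first_tile, last_tile)
end module tile_range_mod

program tile_range_sketch
!$ use omp_lib
  use tile_range_mod
  implicit none
  integer, parameter :: Ngrids = 1
  integer :: NtileX(Ngrids), NtileE(Ngrids)
  integer :: ng, tile, MyThread, numthreads, chunk_size, ntiles

  NtileX(1) = 2                  ! made-up tile partition: 2 x 4 tiles
  NtileE(1) = 4

!$omp parallel private(ng, tile, MyThread, numthreads, chunk_size, ntiles)
  MyThread = 0                   ! serial defaults, used without OpenMP
  numthreads = 1
!$ MyThread = omp_get_thread_num()
!$ numthreads = omp_get_num_threads()

  ! Each thread allocates and fills its own THREADPRIVATE copy.
  allocate (first_tile(Ngrids), last_tile(Ngrids))
  do ng = 1, Ngrids
    ntiles = NtileX(ng)*NtileE(ng)
    ! Assumed split: equal chunks, rounded up, clamped to the tile count.
    chunk_size = (ntiles + numthreads - 1)/numthreads
    first_tile(ng) = MyThread*chunk_size
    last_tile(ng) = min(first_tile(ng) + chunk_size, ntiles) - 1
  end do

  do ng = 1, Ngrids
    do tile = first_tile(ng), last_tile(ng)
      ! A tiled kernel routine would be called here with (ng, tile).
    end do
  end do

  print '(a,i0,a,i0,a,i0)', 'thread ', MyThread, ': tiles ',           &
        first_tile(1), ' to ', last_tile(1)

  deallocate (first_tile, last_tile)
!$omp end parallel
end program tile_range_sketch
```

With this split, a thread whose chunk starts beyond the last tile gets an empty range, so the tile loop simply does not execute for it. The order of the printed lines varies from run to run because the threads print independently.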

Many thanks to Sasha Shchepetkin for suggesting this strategy. Also, many thanks to Mark Hadfield for his persistence and testing.

Change History (4)

comment:1 Changed 9 years ago by arango

  • Resolution set to Done
  • Status changed from new to closed

comment:2 Changed 9 years ago by arango

  • Description modified (diff)

comment:3 Changed 9 years ago by arango

  • Description modified (diff)

comment:4 Changed 9 years ago by arango

  • Description modified (diff)