Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#552 closed upgrade (Done)

IMPORTANT: OpenMP shared-memory directives Revisited

Reported by: arango
Owned by: arango
Priority: major
Milestone: Release ROMS/TOMS 3.6
Component: Nonlinear
Version: 3.6
Keywords:
Cc:

Description (last modified by arango)

This update includes a full revision of the ROMS shared-memory pragma directives using the OpenMP standard. This is a very important and delicate update that requires expertise. Luckily, I doubt that it will affect your customized code.

All the parallel loops in ROMS now use simpler directives. For example, the old strategy:

!$OMP PARALLEL DO PRIVATE(thread,subs,tile) SHARED(numthreads)
            DO thread=0,numthreads-1
              subs=NtileX(ng)*NtileE(ng)/numthreads
              DO tile=subs*thread,subs*(thread+1)-1,+1
                ...
              END DO
            END DO
!$OMP END PARALLEL DO

is replaced with:

            DO tile=first_tile(ng),last_tile(ng),+1
              ...
            END DO
!$OMP BARRIER
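
As a rough illustration of the index arithmetic in the old strategy (this is not ROMS code), the Python sketch below lists the tiles each thread visited. The function name `old_tile_ranges` and the tile and thread counts are hypothetical; only the formula `subs=NtileX(ng)*NtileE(ng)/numthreads` and the inner loop bounds are taken from the ticket.

```python
def old_tile_ranges(ntile_x, ntile_e, numthreads):
    """Tiles visited by each thread under the old nested-loop strategy."""
    # Fortran integer division truncates, so use // here.
    subs = (ntile_x * ntile_e) // numthreads
    # DO tile=subs*thread,subs*(thread+1)-1 is inclusive at both ends,
    # which matches Python's half-open range(subs*thread, subs*(thread+1)).
    return [list(range(subs * thread, subs * (thread + 1)))
            for thread in range(numthreads)]


# Hypothetical 2x2 tiling shared by 2 threads.
print(old_tile_ranges(2, 2, 2))  # [[0, 1], [2, 3]]
```

When the tile count is a multiple of the thread count, these are the same bounds that the new loop reads from first_tile(ng) and last_tile(ng), so the per-thread work is unchanged; only the place where the threads are created and the bounds are computed has moved.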

In shared-memory, the parallel threads are spawned in the higher-level calling routines (drivers). For example, we now have:

!$OMP PARALLEL
      CALL main3d (RunInterval)
!$OMP END PARALLEL

This directive is less restrictive and allows MASTER, BARRIER, and other useful OpenMP pragmas inside the parallel regions. If you are interested, please see the following discussion in the Forum.

This change cleans up the code and facilitates the parallelization of tricky algorithms for nesting, MPDATA, random number generation, point-sources, etc., using the shared-memory paradigm.

WARNINGS:

  • The values of NtileX(ng) and NtileE(ng) are no longer equal to one in distributed-memory (MPI). They have the same values as the ones specified in standard input: NtileI(ng) and NtileJ(ng). Notice that in the critical regions for global reduction operations, we now use the following code instead:
    #ifdef DISTRIBUTE
          NSUB=1                             ! distributed-memory
    #else
          IF (DOMAIN(ng)%SouthWest_Corner(tile).and.                        &
         &    DOMAIN(ng)%NorthEast_Corner(tile)) THEN
            NSUB=1                           ! non-tiled application
          ELSE
            NSUB=NtileX(ng)*NtileE(ng)       ! tiled application
          END IF
    #endif
    
    That is, we make a special exception for distributed-memory cases. This change is necessary in your customized versions of ana_grid.h and ana_psource.h.
  • Notice that we no longer use TILE (uppercase) as an argument to the kernel routines. We use tile (lowercase) instead. The uppercase form was important in previous versions of the distributed-memory code, where TILE was replaced with MyRank during C-preprocessing. Be careful with this one...
  • Notice that a few important variables in mod_scalars.F and mod_stepping.F use the THREADPRIVATE directive in shared-memory applications, so that each parallel thread has a private copy of these variables and parallel collisions are avoided.
  • Two new variables (first_tile(ng) and last_tile(ng)) are introduced to specify the tile range in each parallel region:
          integer, allocatable :: first_tile(:)
          integer, allocatable :: last_tile(:)
    
    !$OMP THREADPRIVATE (first_tile, last_tile)
    
    These variables are set during the initialization of the ROMS kernel using:
    !$OMP PARALLEL
    #if defined _OPENMP
          MyThread=my_threadnum()
    #elif defined DISTRIBUTE
          MyThread=MyRank
    #else
          MyThread=0
    #endif
          DO ng=1,Ngrids
            chunk_size=(NtileX(ng)*NtileE(ng)+numthreads-1)/numthreads
            first_tile(ng)=MyThread*chunk_size
            last_tile (ng)=first_tile(ng)+chunk_size-1
          END DO
    !$OMP END PARALLEL
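
The tile-range initialization above can be checked by hand with a small Python sketch. This is an illustration of the arithmetic only, not ROMS code: the function name `tile_range` and the sample tile and thread counts are hypothetical, while the three formulas mirror the `chunk_size`, `first_tile(ng)`, and `last_tile(ng)` assignments in the ticket.

```python
def tile_range(my_thread, ntile_x, ntile_e, numthreads):
    """Return (first_tile, last_tile) for one thread, as in the ticket."""
    # Ceiling division written with integers, as in the Fortran code.
    chunk_size = (ntile_x * ntile_e + numthreads - 1) // numthreads
    first_tile = my_thread * chunk_size
    last_tile = first_tile + chunk_size - 1
    return first_tile, last_tile


# Hypothetical 2x4 tiling (8 tiles) shared by 4 threads.
for thread in range(4):
    print(thread, tile_range(thread, 2, 4, 4))
# 0 (0, 1)
# 1 (2, 3)
# 2 (4, 5)
# 3 (6, 7)
```

In this example the tile count is a multiple of the thread count, so the ranges cover tiles 0 to 7 exactly once. If it were not a multiple (say 6 tiles on 4 threads), the same arithmetic would give the last thread the range (6, 7), which lies past the final tile 5, so the sketch suggests that the partition should be chosen to divide evenly.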
    

Many thanks to Sasha Shchepetkin for suggesting this strategy. Many thanks also to Mark Hadfield for his persistence and testing.

Change History (4)

comment:1 by arango, 12 years ago

Resolution: Done
Status: new → closed

comment:2 by arango, 12 years ago

Description: modified (diff)

comment:3 by arango, 12 years ago

Description: modified (diff)

comment:4 by arango, 12 years ago

Description: modified (diff)