ROMS code modernization


#1 · **gnayar** » Fri Jun 19, 2015 5:30 pm

Hi,
We have been working on several optimizations to the ROMS code base, primarily for performance. Some of the changes, which adopt newer MPI-3 interfaces such as neighborhood collectives, have also made the code much cleaner.

The changes, in no particular order, are:

Alignment and padding: to make efficient use of the vector units. These changes are mostly in the routines that were hotspots in the benchmark application: step2d, lmd_skpp, step3d_uv, rhs3d and pre_step3d.
Loop transformations: to improve cache performance.
Cartesian topology: a Cartesian communicator is used for neighbor exchanges in mp_exchanges.
Derived data types: MPI derived data types avoid explicit packing/unpacking in the mp_exchange code, making it simpler and more efficient.
Neighborhood collectives: MPI-3 neighborhood collectives replace the four sends and receives in mp_exchange with a single all-to-all call, making the code simpler and more efficient.
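
To illustrate the alignment and padding item only, here is a minimal sketch in C. It is not taken from the modified ROMS code (ROMS itself is Fortran), and the names padded_len, alloc_field, scale_field and the 64-byte VEC_BYTES width are assumptions made for this example. The idea is to round each row up to a whole number of vectors so that every row starts on an aligned boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed vector width in bytes (hypothetical choice; 64 matches AVX-512
 * registers and a common cache-line size). */
#define VEC_BYTES 64

/* Round a row length (in doubles) up to a whole number of vectors. */
static size_t padded_len(size_t n)
{
    const size_t per_vec = VEC_BYTES / sizeof(double);
    return (n + per_vec - 1) / per_vec * per_vec;
}

/* Allocate a rows x cols field whose rows each start on a VEC_BYTES boundary.
 * The padded row stride (in doubles) is returned through *ld.
 * Returns NULL on allocation failure; release the field with free(). */
static double *alloc_field(size_t rows, size_t cols, size_t *ld)
{
    assert(rows > 0 && cols > 0);
    *ld = padded_len(cols);
    /* rows * ld * sizeof(double) is a multiple of VEC_BYTES,
     * which is what C11 aligned_alloc expects. */
    return aligned_alloc(VEC_BYTES, rows * *ld * sizeof(double));
}

/* Scale the logical columns of each row; padding elements are never touched. */
static void scale_field(double *f, size_t rows, size_t cols, size_t ld, double a)
{
    for (size_t j = 0; j < rows; ++j) {
        double *row = f + j * ld;   /* aligned start of row j */
        for (size_t i = 0; i < cols; ++i)
            row[i] *= a;
    }
}
```

With these assumptions, a row of 100 doubles is padded to a stride of 104, so the inner loop always begins on an aligned address and the compiler has a better chance of generating aligned vector loads and stores.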

I would like to know the best way to make these changes widely available to the community.

-gopal

#2 · **jivica** » Mon Jun 22, 2015 4:38 am

Should be easy if you are using git.
What speedup did you get compared to the "standard" version?
I am interested in benchmarking my application on my cluster (using Intel compilers + Open MPI 1.8.6).

Cheers,
Ivica
