If you really understand how parallel ROMS works, you can explain this situation:
I’ve got two serial runs, one with a single tile and one with 4×1 tiles. I run ncdiff on the output files, and after one timestep the ice thickness looks like:
and the free surface looks like:
Next, I run the exact same code but compiled for MPI, and run 1×4 tiles vs. 4×1 tiles. The differences now look like:
and
I understand the first of these problems and await a formal fix. The second is in my court, but first I have to find out how to get TotalView to pass the command-line argument through to ROMS. Added fun: the 1×4 case runs fine with -O2 but blows up during the first timestep with -g.
Edit: OK, fixed the MPI bug, the usual stupid nonsense: I had the mp_exchange call, but not the #include “cppdefs.h”, so DISTRIBUTE was never #defined and the whole exchange was silently preprocessed away.
Next problems: There’s something wacky in our new bulk_flux option. This is a diff between runs with 1×1 vs. 4×1, both with the MPI executable:
I’ll have to investigate tomorrow; there’s also an MPI issue with the LMD_BKPP option to chase down.