ROMS/TOMS Developers

Algorithms Update Web Log

kate - June 17, 2009 @ 17:14
Binary trees- Comments (0)

We have a project with some novel challenges, and I decided that one thing it needs is a balanced binary tree, something I learned about in a data structures class. It meant delving into some new-to-me corners of the Fortran standard, but Fortran 90 supports pointers, which should be able to do the job. Because I’m not that familiar with Fortran pointers, I started building my tree with a short test program, just to make sure I’m on the right track before folding this stuff into the full ROMS code.

I started by creating a treenode object with pointers to the left and right children, plus the parent. I used them to insert objects into a binary tree, no problem. Then I started adding the code so that the tree would be balanced after each insertion, using the standard red-black algorithm found in any algorithms textbook. I repeat, this is a meat-and-potatoes, homework-assignment-level algorithm, not something exotic at all. Here’s my f90 version:

      MODULE mod_tree
!
!================================================== Kate Hedstrom ======
!  Copyright (c) 2002-2009 The ROMS/TOMS Group                         !
!    Licensed under a MIT/X style license                              !
!    See License_ROMS.txt                                              !
!=======================================================================
!  Set up tree structure and functions.                                !
!=======================================================================
!
        implicit none

        type treenode
          type(treenode), pointer :: left => null()
          type(treenode), pointer :: right => null()
          type(treenode), pointer :: parent => null()
          logical :: red = .FALSE.
          integer :: eggs = 0
          real*8  :: dist
        end type treenode

        type(treenode), pointer :: tree

        PRIVATE
        PUBLIC :: init, insert

        CONTAINS

        SUBROUTINE init
          ALLOCATE(tree)
        END SUBROUTINE init

        SUBROUTINE insert(eggs, dist)
          integer, intent(in) :: eggs
          real*8, intent(in) :: dist
          type(treenode), pointer :: cur, p, x, y

          ALLOCATE(cur)
          cur % eggs = eggs
          cur % dist = dist

! Empty tree, deposit eggs at the top
          IF (.not. ASSOCIATED(tree % left)) THEN
            tree % left => cur
            cur % parent => tree
            RETURN
          END IF

! Otherwise find somewhere to put these eggs
! New nodes end up at the bottom until a rebalance
          p => tree % left

          DO
            IF (dist <= p % dist) THEN   
              IF (ASSOCIATED(p % left)) THEN
                p => p % left
                CYCLE
              ELSE
                p % left => cur
                cur % parent => p
                EXIT
              END IF
            ELSE
              IF (ASSOCIATED(p % right)) THEN
                p => p % right
                CYCLE
              ELSE
                p % right => cur
                cur % parent => p
                EXIT
              END IF
            END IF
          END DO
! Balance the thing... red-black for now, until I get smarter about
! balancing eggs.
          cur % red = .true.
          x => cur
          DO WHILE (x % parent % red)
            IF (ASSOCIATED(x % parent % parent % left, x % parent)) THEN
              IF (ASSOCIATED(x % parent % parent % right)) THEN
                y => x % parent % parent % right   ! uncle
                IF (y % red) THEN
                  x % parent % red = .false.
                  y % red = .false.
                  x % parent % parent % red = .true.
                  x => x % parent % parent
                END IF
              ELSE
                IF (ASSOCIATED(x, x % parent % right)) THEN
                  CALL rotate_left(x % parent)
                END IF
                x % parent % red = .false.
                x % parent % parent % red = .true.
                CALL rotate_right(x % parent % parent)
              END IF
            ELSE
! Must be right grandchild to get here
              IF (ASSOCIATED(x % parent % parent % left)) THEN
                y => x % parent % parent % left    ! aunt
                IF (y % red) THEN
                  x % parent % red = .false.
                  y % red = .false.
                  x % parent % parent % red = .true.
                  x => x % parent % parent
                END IF
              ELSE
                IF (ASSOCIATED(x, x % parent % left)) THEN
                  CALL rotate_right(x % parent)
                END IF
                x % parent % red = .false.
                x % parent % parent % red = .true.
                CALL rotate_left(x % parent % parent)
              END IF
            END IF
          END DO
          tree % left % red = .false.
        END SUBROUTINE insert
!
! For the rotating, I'm working from C code I found online for red-black trees,
! with reference to Introduction to Algorithms by Cormen, Leiserson, and
! Rivest (Chapter 14). rotate_left makes the right child of x into the parent of x.
!
        SUBROUTINE rotate_left(x)
          type(treenode), pointer :: x, y

          y => x % right

          IF (ASSOCIATED(y % left)) THEN
            x % right => y % left
            x % right % parent => x
          ELSE
            NULLIFY(x % right)
          END IF

          print *, "rotate_left before ", x % dist, x % eggs
          y % parent => x % parent
          print *, "rotate_left after ", x % dist, x % eggs

          IF (ASSOCIATED(x, x % parent % left)) THEN
            x % parent % left => y
          ELSE
            x % parent % right => y
          END IF
          y % left => x
          x % parent => y
        END SUBROUTINE rotate_left

        SUBROUTINE rotate_right(x)
          type(treenode), pointer :: x, y

          y => x % left

          IF (ASSOCIATED(y % right)) THEN
            x % left => y % right
            x % left % parent => x
          ELSE
            NULLIFY(x % left)
          END IF

          print *, "rotate_right before ", x % dist, x % eggs
          y % parent => x % parent
          print *, "rotate_right after ", x % dist, x % eggs

          IF (ASSOCIATED(x, x % parent % left)) THEN
            x % parent % left => y
          ELSE
            x % parent % right => y
          END IF
          y % right => x
          x % parent => y
        END SUBROUTINE rotate_right

      END MODULE mod_tree

      program main

        use mod_tree
        implicit none

        integer, parameter :: num = 3
        integer :: caviar(num)
        real*8  :: dist(num)
        integer :: i

        dist = (/ 21, 56, 78 /)
        caviar = 100*dist

        call init

        do i=1,num
          call insert(caviar(i), dist(i))
        end do

      end program main

I defy you to find a compiler which does the right thing for this code (or to find a bug in this code to explain this). I’ve tried gfortran, ifort, PGI, and Pathscale. A typical response:

rotate_left before 21.0000000000000 2100
rotate_left after 0.000000000000000E+000 0
forrtl: severe (174): SIGSEGV, segmentation fault occurred


kate - May 20, 2009 @ 18:00
Introduction to Git- Comments (1)

This is a little something I cooked up for the ARSC HPC newsletter which I thought might be of interest to the adventuresome folks out in ROMS land:

A colleague and I used to argue the merits of RCS vs. SCCS for
maintaining source code. We then migrated to CVS, then to SVN, and
currently have an SVN site at Rutgers. Each transition was a step for
the better. But what about the future? Does SVN have any shortcomings
that have been bugging you?

I’ll tell you about one that’s been bugging me. My colleague and I
both have write permission on our SVN server, him on the trunk, me
on a branch. However, most of the people downloading our code see it
as a read-only site. This is great for software (such as svn) in which
the average user is not expected to change anything or contribute
patches. However, we are dealing with a complex ocean model in which
the average user will at least be changing a few files for setting cpp
choices. The above average user might be changing quite a few files,
adding new capabilities, etc. If these people want to save their own
revision history, they can set up a private svn repository, but any
given sandbox directory can only have one parent repository – either
their own or the Rutgers one. If they aren’t saving their own
changes, they are subject to all kinds of unsafe surprises whenever
they do “svn update” from us. Even I keep two directories going, one
pointing to the trunk and the other pointing to my branch.

So how to solve this problem? Linus Torvalds got so fed up with
his options for developing the Linux kernel that he wrote a new
distributed version control system, called Git. It is available from
http://git-scm.com/, including a documentation page pointing to all
sorts of available resources. There’s even a YouTube video of Linus
himself ranting about these issues. The various tutorials listed are
perhaps more useful, plus I am enjoying “Pragmatic Version Control Using
Git” by Travis Swicegood. For the O’Reilly fans out there, a new bat
book is in the works, called “Version Control with Git” by Jon Loeliger.

How does Git solve your problems, you ask? Both CVS and SVN have
centralized repositories from which one or many people can check out
the code. Git has a new-to-me concept of distributed repositories,
something I’m sure I still don’t entirely comprehend. The repository
itself is a binary database, so the files are compressed. Each time
you do a checkout (git clone) from someone, you obtain a copy of the
entire repository, giving you the entire history right there at your
fingertips. Git was designed to be fast since the Linux kernel is quite
large. The Linux kernel also has many, many people working on it in a
cooperative manner; Git was designed to help them rather than hinder
them. SVN advertises that it makes branching easy – Git advertises that
it makes both branching and merging easy.

For those of us with one foot still in the past, Git provides
interface tools for both SVN and CVS repositories. In other words,
my local Git sandbox can be pointed to an SVN server and do uploads
and downloads using the SVN protocol. It can then be pointed to a
different SVN server – from the same sandbox! Magic!

How can two people use Git to work on the same project? Each person
would have their own Git sandbox with the entire repository in it.
If person A makes a change, person B could do a “git fetch” or “git pull”
to get the changes. Person B then adjusts those changes and makes some new
ones. Person A then does a “git pull” from B’s repository. These could
be simply files on the same machine in which they both have accounts,
or there are ways to set up public read-only servers via http or the
git protocol. Changes that don’t work out can be kept in a branch,
but never “pushed”, or they can be made to disappear completely as
long as you haven’t yet shared them with anyone.

Finally, git frees us from the “revision number” concept. It instead
labels each state of the repository with a cryptographically sound
“SHA1” hash, computed from the contents, time, owner, etc. of that
commit. An example SHA1 is 152aee0b44a104b07d40f4401e5f1ea1ea2fe1b0,
effectively guaranteed to be unique among all git repositories.

I’d like to thank Brian Powell for showing me the way of the Git.


kate - March 24, 2009 @ 18:39
Fun with Tides- Comments (1)

Three years ago we did a Northeast Pacific simulation without tides. Ever since, we’ve been talking about adding tides to improve the vertical mixing over the shallow Bering Sea shelf. Here are a few of the things we’ve learned along the way.

First, there were steps we knew were necessary:

  • Obtain the best bathymetry available (thanks to Seth Danielson).
  • Change the minimum depth from 30 m to 10 m.
  • Apply tidal boundary conditions from the global OTPS results.
  • Incorporate Mike Foreman’s long-period tidal corrections.
  • Add Paul Budgell’s tidal potential terms.
  • Fix some MPI bugs in the various bits (see previous post).
  • Decrease the timestep for stability.

I was having stability issues when I went to a meeting and heard that tides and the quadratic bottom drag don’t always play nicely. Switching to a linear bottom drag gave me a stable solution – but one in which the tides were too large, especially in Bristol Bay. In fact, the tides were so large in Bristol Bay that the water depth would go negative, leading to trouble. What sort of trouble? If the model wants to grow ice, the amount of ice growth has a term with Hz in it. Negative Hz leads to all sorts of wacky behavior.

I went down the path of trying the WET_DRY option for a bit, uncovering interesting compiler bugs with it. However, the tidal amplitudes were off and we needed to do something about it. Enter spatially variable bottom drag, as suggested by the NOAA fellows. My current implementation uses RDRG_GRID and expects to read a 2-D field from the grid file – a more complete version would also have an ANA_RDRG option and ana_rdrg to back it up. Through some trial and error we came up with a function of depth, increasing as depths shallow from 1000 m, uniform for deeper waters. The largest values are a factor of ten larger than the background value, so I was pleasantly surprised when it ran. The resulting tidal amplitudes are “good enough” except in upper Cook Inlet.
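
For the curious, here is a minimal sketch of the sort of depth-dependent drag field described above, assuming a hypothetical background value of 3.0d-4 m/s and a simple linear ramp; the field we actually use was tuned by trial and error and is read from the grid file via RDRG_GRID, so the function and constants below are illustrative only.

      real*8 FUNCTION depth_rdrg (h)
!  Illustrative depth-dependent linear bottom drag: uniform below
!  1000 m, ramping up to ten times the background value as the
!  water shallows toward the coast.
        implicit none
        real*8, intent(in) :: h                ! water depth (m)
        real*8, parameter  :: rdrg0 = 3.0d-4   ! assumed background drag (m/s)
        IF (h >= 1000.0d0) THEN
          depth_rdrg = rdrg0
        ELSE
          depth_rdrg = rdrg0*(1.0d0 + 9.0d0*(1000.0d0 - h)/1000.0d0)
        END IF
      END FUNCTION depth_rdrg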

How about the rest of the simulation? It turns out we were getting vastly too much vertical mixing over the Bering Sea shelf. This is with LMD_MIXING with both top and bottom boundary layers turned on. That combination isn’t bad without tides, but with tides the bottom boundary layer grows to encompass the entire water column, even in deeper water than it should. Turning off the bottom boundary layer or trying one of the GLS schemes caused the model to blow up, back to the old tidal bottom drag instability mode.

After a conversation with Sasha Shchepetkin, I modified the bottom drag so that it is reduced in shallow water, so as to obey a stability limit of:

dt * rdrg / Hz <= 0.5 or 0.6,  i.e.  rdrg <= 0.6 Hz / dt

Success! I did have one episode where the tides once again got too big in Bristol Bay, with the old negative-depth consequences, but managed to get through it by going to half the timestep for some months. Note that the maximum allowed drag goes up as the timestep gets shorter. I believe the quadratic version of this would be:

rdrg <= 0.6 Hz / (dt * sqrt(u*u + v*v))

though this again gave me tides that were too large in Bristol Bay. That's what we get for being greedy and asking for N=60 in 10 m of water: the Hz's get small.
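
Turning that inequality into code is straightforward; here is a rough sketch with hypothetical array names and loop bounds (rdrg, Hz, Istr, etc. are placeholders, not the actual ROMS variables) showing how the linear drag could be clipped so the limit holds everywhere.

      SUBROUTINE limit_rdrg (Istr, Iend, Jstr, Jend, dt, Hz, rdrg)
!  Illustrative only: clip a spatially variable linear bottom drag so
!  that dt*rdrg/Hz never exceeds 0.6, using the bottom-layer thickness.
        implicit none
        integer, intent(in)   :: Istr, Iend, Jstr, Jend
        real*8, intent(in)    :: dt                           ! timestep (s)
        real*8, intent(in)    :: Hz(Istr:Iend,Jstr:Jend)      ! bottom layer thickness (m)
        real*8, intent(inout) :: rdrg(Istr:Iend,Jstr:Jend)    ! linear drag (m/s)
        integer :: i, j
        DO j=Jstr,Jend
          DO i=Istr,Iend
            rdrg(i,j) = MIN(rdrg(i,j), 0.6d0*Hz(i,j)/dt)
          END DO
        END DO
      END SUBROUTINE limit_rdrg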


kate - August 1, 2008 @ 17:22
Quiz for you- Comments (0)

If you really understand how parallel ROMS works, you can explain this situation:

I’ve got two serial runs, one with 1 tile, one with 4×1 tiles. I run ncdiff on the output files and after one timestep, the ice thickness looks like:
[Figure: ice thickness of tiled simulation]
and the free surface looks like:
[Figure: free surface tile differences]

Next, I run the exact same code but compiled for MPI, and run 1×4 tiles vs. 4×1 tiles. The differences now look like:
[Figure: MPI ice thickness differences]
and
[Figure: free surface MPI differences]

I understand the first of these problems and await a formal fix. The second of these is in my court, but first I have to find out how to get totalview to pass the command line argument through to ROMS. Added fun: the 1×4 case runs with -O2 but blows up during the first timestep with -g.

Edit: OK, fixed the MPI bug – the usual stupid nonsense – I had the mp_exchange call but not the #include “cppdefs.h”, so DISTRIBUTE never got #defined.
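
The failure mode is easy to reproduce in isolation. Here is a tiny stand-alone sketch (made-up names, not the actual ROMS routines) that can be built with preprocessing enabled, with and without the macro defined (say via -DDISTRIBUTE), to watch the guarded exchange silently disappear, which is exactly what happens when the include that defines the macro is left out.

      program guard_demo
!  Stand-alone illustration of a cpp-guarded halo exchange.  If the
!  macro is never defined, the code still compiles and runs; it just
!  quietly skips the exchange.
        implicit none
        real*8 :: hi(8)
        hi = 1.0d0
        hi(2) = 2.0d0
        hi(7) = 3.0d0
#ifdef DISTRIBUTE
        hi(1) = hi(7)           ! stand-in for the real mp_exchange call
        hi(8) = hi(2)
#endif
        print *, 'boundary values: ', hi(1), hi(8)
      end program guard_demo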

Next problems: There’s something wacky in our new bulk_flux option. This is a diff between runs with 1×1 vs. 4×1, both with the MPI executable:
[Figure: problem with parallel snow thickness]
I’ll have to investigate tomorrow, along with an MPI issue in the LMD_BKPP option.


kate - January 25, 2008 @ 0:15
Change is hard- Comments (0)

We completed a Northeast Pacific simulation almost two years ago now. There are things we like about it and things we’d like to improve on for a new simulation. There have been changes to ROMS, to our bathymetry, to the boundary conditions, and to the options chosen. So it comes time to give it all a whirl and see if the code still runs.

I have some initial troubles, but find that with a sufficiently short timestep, the code runs. It’s a bit annoying that I ran a 3 km simulation with tides at a timestep of 240 s, yet I’m needing a timestep of 200 s for this 10 km grid. At least it runs, though it’s going to cost more than we had expected.

Then I hear from Ken Coyle about some wacky temperatures causing the biology to blow up. He told me it was happening at an i,j location at which I just happened to have a station. I find that he is indeed right; the temperature there temporarily gets quite bizarre:
[Figure: temperature timeseries showing bad value]
I plot up the whole vertical column of temperatures and find that they all get bad, as do the salinities:
[Figure: vertical column of temperatures]
After plotting every imaginable field, I find that I’m growing a negative amount of ice at that spot. The negative value is coming from a negative factor of Hz at that time and place. So, plotting total column water depth I see:
[Figure: water depths at those stations]
Very interesting – the water depth goes negative for several timesteps, but the model keeps on going. I didn’t know that was possible.

It’s time to try the shiny new WET_DRY option. My first attempts at it caused the model to blow up much earlier in the run (less than a day, as opposed to running ten days). The time and place of that trouble were elsewhere in the domain, where the total water depth wasn’t even close to zero. Further investigation showed that to be a compiler bug (goody, goody) to be avoided with less optimization. I can’t even report it to the vendor without a great deal more work pinning it down. Pathscale with -Ofast is trouble; using all the suboptions of -Ofast except -ipa is OK. Sunstudio with -O3 is also OK. The temperatures are now quite well-behaved too:
[Figure: fixed temperatures]

I tell Ken to update and to turn on WET_DRY, figuring that would solve all his problems. No such luck! His biology code is reporting wacky temperatures just prior to blowing up:

 Temperature:  37.5827072746595690 152 388 42 1 8.34259259259124519
6787409 36896.59491  1.577662E-03  2.255593E+04  2.255593E+04  5.268561E+16   0
 Temperature:  51.0785453469517208 152 388 39 1 8.34490740740875481
 Temperature:  39.2654092639337264 152 389 39 1 8.34490740740875481
6787410 36896.59722  1.584459E-03  2.255596E+04  2.255596E+04  5.268571E+16   0
 Temperature:  61.5939893767976798 152 388 6 1 8.34722222221898846
 Temperature:  73.0509220012720846 152 388 42 1 8.34722222221898846
 Temperature:  57.1291962882013564 152 389 42 1 8.34722222221898846
6787411 36896.59954  1.591257E-03  2.255598E+04  2.255599E+04  5.268581E+16   0
 Temperature:  79.0037572721587935 152 388 7 1 8.34953703703649808
 Temperature:  36.0570528159368351 152 388 8 1 8.34953703703649808
 Temperature:  132.177242769374715 152 388 39 1 8.34953703703649808
 Temperature:  99.5619134891402524 152 389 39 1 8.34953703703649808
6787412 36896.60185  1.598029E-03  2.255601E+04  2.255601E+04  5.268592E+16   0
 Temperature:  145.794858427701342 152 388 11 1 8.35185185184673173
 Temperature:  75.7987454267087344 152 388 12 1 8.35185185184673173
 Temperature:  234.923241638649216 152 388 31 1 8.35185185184673173
 Temperature:  245.081721186121740 152 388 42 1 8.35185185184673173
 Temperature:  208.911218725014379 152 389 42 1 8.35185185184673173
6787413 36896.60417  1.604735E-03  2.255604E+04  2.255604E+04  5.268602E+16   0
 Temperature:  104.828633142693036 152 388 25 1 8.35416666666424135
 Temperature:  70.9285478559007743 152 388 26 1 8.35416666666424135
 Temperature:  195.296389285105079 152 388 39 1 8.35416666666424135
 Temperature:  689.578533632940207 152 389 39 1 8.35416666666424135
6787414 36896.60648  1.611350E-03  2.255607E+04  2.255607E+04  5.268613E+16   0
 Temperature:  40.8787945360774572 152 388 9 1 8.35648148148175096
 Temperature:  45.6441714731643415 152 388 42 1 8.35648148148175096
 Temperature:  261.585913331404129 152 389 41 1 8.35648148148175096
 Temperature:  4469.44170242107884 152 389 42 1 8.35648148148175096
6787415 36896.60880  1.617861E-03  2.255610E+04  2.255610E+04  5.268624E+16   0

I asked Ken to turn on the stations for this run too, and his station temperatures never get outside the range of 3 to 7 degrees (this was a warmer initial condition file than I’d been using, but that isn’t relevant):
[Figure: Ken's temperatures]
The depth, too, stays completely in bounds, though it gets shallow at the time of the trouble:
[Figure: Ken's depths]

So what is going on here? The temperatures are multiplied by Hz during the pre_step3d phase. The step2d code updates Hz with the new estimate based on the new water depth. The biology code extracts a temperature by dividing the tracer t by Hz. If the new Hz is sufficiently different from the old Hz, the resulting value is not a sensible temperature. The biology needs a temperature for computing growth rates, and values of 200 C were not considered to be inside the valid range.

What to do? One option would be to check the wet/dry mask and not compute biology when things are dry. However, it seems to me that the temperatures get a bit off several steps before we get to the minimum depth. Do we make the minimum depth bigger? Or do we save the old Hz and use that to compute a temperature for the biological terms?

Update:

I think we know what’s going on. The t(nnew) gets updated in pre_step3d to have the contribution from the vertical diffusion added on. In shallow water, this term can dominate the t*Hz term, meaning that t(nnew)/Hz is not a valid temperature. The same can be true for any of the other tracers as well. Looking in the debugger at a time/place where the water is shallow, we see that t(nrhs) for the upper points is:

(152,388,37,1,1)           4.590317...
(152,388,38,1,1)           4.59349
(152,388,39,1,1)           4.59674
(152,388,40,1,1)           4.59805
(152,388,41,1,1)           4.58889
(152,388,42,1,1)           4.57199

The water is within the range 4.4 to 4.6 from top to bottom, cooling at the surface due to atmospheric heat loss. The values of Hz for these points range from 0.0234 to 0.0237. The values for t(nnew) are:

(152,388,37,2,1)           0.1221898...
(152,388,38,2,1)           0.121764
(152,388,39,2,1)           0.121402
(152,388,40,2,1)          -0.247068
(152,388,41,2,1)           0.100149
(152,388,42,2,1)           0.4911977

You can do the math to see that t(nnew)/Hz will not result in values close to the target 4.58; at the top level, for instance, 0.4911977/0.0234 is roughly 21 degrees. The deep values, where the vertical diffusion is small, are pretty close to t*Hz_old, where Hz_old has values between 0.0257 and 0.0260. Hz is changing, but not enough to explain all that’s going on near the surface. We need to be using t(nrhs) to get valid temperatures and other tracer values.
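
To see the arithmetic concretely, here is a tiny stand-alone sketch using numbers rounded from the debugger output quoted above; the variable names are made up and this is not ROMS code.

      program bogus_temp
!  Illustration of why dividing the partially updated tracer content by
!  the new layer thickness gives an unphysical temperature.
        implicit none
        real*8 :: t_true, Hz_old, Hz_new, diffusion, content
        t_true    = 4.59d0       ! actual temperature (deg C)
        Hz_old    = 0.0257d0     ! layer thickness before the update (m)
        Hz_new    = 0.0234d0     ! layer thickness after the update (m)
        diffusion = 0.37d0       ! vertical diffusion contribution in pre_step3d
        content   = t_true*Hz_old + diffusion
        print *, 'recovered temperature: ', content/Hz_new   ! about 21, not 4.59
      end program bogus_temp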