ROMS/TOMS Developers

Algorithms Update Web Log
« Previous PageNext Page »

kate - July 19, 2010 @ 20:01
More source code control- Comments (3)

I’ve been using git and I’m quite happy with it. Meanwhile, there’s a competing project with many, many of the same attributes of git, called Mercurial. They pride themselves on a cleaner user interface, so you might like it better.

We’ve now got a longer commute and my husband found some podcasts to listen to at twit.tv (This Week in Tech). We like the TWIT shows, but they tend to focus on things like smart phones and iPads. But they’ve got quite a selection of other shows, including FLOSS, which is about free software. There was a FLOSS show about Mercurial, explaining that it and git were started on the same day to solve the same problem. Mercurial not only has a way to access svn archives, but one to access git archives. Mercurial is written in Python and is being used by the Python community (which explains why Rob Hetland thought Mercurial was getting more traction than git). The result is that both are probably here for the long haul.


kate - June 25, 2010 @ 0:05
Still More About Git- Comments (0)

Note: this is a copy of something submitted to the ARSC HPC newsletter.

I’ve written about git here twice before, where I described needing
to deal with legacy svn repositories. Perhaps it’s time to get a
little more specific about how I’m using it so you can all be
suitably horrified.

First of all, I started playing with git before I found a system
with a working git-svn command. I created a repository with a main
trunk which mimics the svn code, plus a branch with exploratory code
we’re not ready to share via svn just yet. I have these codes in a bare
“origin” repository in my home directory on my desktop system. A
bare repository has the database, but no working files – it is
unsafe to push to a non-bare repository.

I can do a “git clone” of the “origin” out to any of the supercomputers
here at ARSC. The “origin” repository doesn’t know about any of the
clones, but each clone can pull updates from the “origin” and push
changes back into it. Good practice means making sure the “origin”
is always up to date with any changes made anywhere.

It turns out the system with the working git-svn is a laptop, though
security prevents the laptop from hosting the “origin”. The laptop
also has a clone of “origin” in addition to the repository generated
by “git svn clone” and a third repository from “git svn clone” on
a colleague’s trunk version. These three repositories know something
about each other. How does that work? Say we create them with:

     git clone  git_code
     git svn clone  svn_code
     git svn clone  trunk_code

We’ll assume that everything is on the master branch below, but it
could be on some other active branch.

Remotes

I can go into git_code and type:

     git remote add svn_code ../svn_code
     git remote add trunk_code ../trunk_code

Likewise, I go into svn_code and type:

     git remote add git_code ../git_code
     git remote add trunk_code ../trunk_code

Because my view of trunk_code is read-only, I have no need for it to
know about the others.

Note that the “git clone” operation automatically generates a remote
called “origin”.

Remote Tracking Branches

I’m not quite set up yet to use the remotes. They need to be turned
into remote tracking branches, branches in the local repository
which point to the remotes. To do this, go into each directory with
remotes and type:

    git remote update

This will create a new tracking branch for each of the remote sites.
These branches show up with “git branch -r” or “git branch -a” but
not with simply “git branch”.

Incorporating Svn Updates

Suppose you decide to check for updates from the main trunk, after a
period of weeks (or years) and you want to bring those changes to
your code. Using svn, an “svn update” would bring a copy of that trunk
directory up to date all in one fell swoop. Ditto with a merge from the
old to the current, even if some dozens of updates have been checked in.

Using git, you invoke the following from the trunk_code directory:

     git svn rebase

This will bring over the changes one at a time, with the svn
check-in message attached to each change.

Now, go to the svn_code directory and type:

    git remote update

Now you can apply the changes via “git cherry-pick”, one change at a
time. Conflicts get resolved as they are encountered, commit by commit.
This could get tedious if you let it go too long…

Once that’s done, it’s time to commit these changes to your svn
repository:

    git svn dcommit

In the git_code directory, repeat the “git remote update” and the
“git cherry-pick” commands, then:

    git commit
    git push origin master

Now your workstation’s “origin” is up to date as well.

Incorporating Git Updates

Sometimes the changes come from you, over on one of the
supercomputers. On that supercomputer, type:

    git push origin master

Now the changes are on the workstation, but not the laptop. On the
laptop in the git_code directory, type:

    git pull origin master

Now go to the svn_code direcotory and:

    git remote update
    git cherry-pick 
    git svn dcommit

This brings your changes over to your svn branch.

Last Thoughts

If I were to start fresh, I would start with the “git svn clone” operation
and create a bare clone of that on the workstation. I could still do
that if I knew how to pull the development branch code from the git_code
to the svn_code repository (any hints from you all?). I’m still enjoying
using git, though it makes me feel like an old dog faced with many
new tricks!

Also, I talked about git at a ROMS meeting and that talk is available
here.


kate - April 17, 2010 @ 16:17
Hacker’s Dream (Python for ROMS, part 0)- Comments (0)

A friend told me he can always count on me to do something outside the norm. All my colleagues use Matlab? Well, I can dream of a world in which we can do all the ROMS pre- and post-processing in an Open Source manner. I’ve played a bit with NCL and I really like some aspects of it. However, it doesn’t allow you to build tools that have a gui (graphical user interface). Grid generation and land-mask editing are two examples where a gui is really, really nice.

Rob Hetland and others have been working with Python to build up a toolbox for working with ROMS NetCDF files. Before we get to the tools, let’s talk about what they require:

  • Python itself. You will need a version between 2.4 and 2.6, inclusive. There is a newer 3.x series, but it is incompatible and therefore not to be trusted just yet.
  • numpy. Get a version that’s reasonably new, don’t count on the version 1.0.1 that you found already on the system. How do you tell the version? At the interactive prompt from python, type “import numpy” then “numpy.__version__”.
  • scipy.
  • matplotlib. This is a plotting package to reproduce the Matlab plotting, complete with everything you might not like about Matlab plotting. This also contains the gui tools, but it depends on an underlying gui package, be that tk, wx, X11, or qt. If it can’t find any of them, you are out of luck.
  • basemap. These are the map tools for Python, complete with etopo2 in the examples directory (user beware).
  • netCDF4. This sits on top of the hdf5 and netcdf4 libraries.
  • ipython. Fred Castruccio recommends using “ipython -pylab” for interactive fun. It preloads both numpy and the matplotlib pylab package.

So how does one go about acquiring all this stuff? There are choices, from a prebuilt package with everything from Enthought, to fetching each as a package for your OS, to building everything from scratch. I’ve honestly had a bit of a challenge for various reasons, but here are a few things I’ve learned:

  • Python 2.6.5 does not compile on the Mac with either gcc-4.0.1 or gcc-4.4.0, but requires gcc-4.5.0 to get around an internal compiler error when building the datetime module. Matplotlib will not load without datetime.
  • Python itself uses the “standard” method of “configure; make; make install”, but after that, all packages play by the Python rules.
  • The python rules are:
    1. In the directory of the package you want to install, type “python setup.py build”. This puts files into a local build directory.
    2. Type “python setup.py install” with an optional “–prefix={dir}” option.
    3. Type “python setup.py config” to see stuff like what guis matplotlib has found.
    4. If you use the optional non-standard directory, tell Python about it with the PYTHONPATH environment variable.
    5. If you want to link to packages that are themselves in a nonstandard place, use the environment variables CPPFLAGS=-I{dir} and LDFLAGS=-L{dir}. This adds to the list of places it will look – I don’t know how to take away from the list.
  • The Python package for Mac comes as a “universal binary”, working on both ppc and i386 architectures. That means that any other libraries that are involved, such as hdf5 and netcdf4, also have to be compiled this way or you will get an error.
  • If you get rough packages from Fred, make sure to delete the pre-existing build directory and any other binary object files you might find.

See? It really is a hacker’s dream, with fifty ways to mess up. 😉


kate - January 5, 2010 @ 2:04
Binary tree follow-up- Comments (0)

Back in June I wrote about a balanced binary tree problem I was having. We have since come up with a C++ solution and it’s working really well, though we have to link to some extra C++ stuff.

Today, out of the blue, I got an email from David Car who not only read the original blog post, but sent me a version which actually runs! This is what he had to say:

I came across your Fortran implementation of the Red Black Tree while reading a post by Mike Page on LinkedIn. I’ve attached an implementation that works built on what you did. At the moment I’m scratching my head as to why I had to do what I did in the attached code. Basically, I tracked down that in rotate_left (and rotate_right), the pointer x was reassigned to x%parent after the line

y%parent => x%parent

i.e. between your print statements. I don’t know why that happens. I discovered this by stepping through that section with gdb and also ran valgrind on it. You probably noticed that too. What I did was to simply treat the dummy argument to rotate_left(…) as simply a treenode with the target attribute rather than a pointer. I then use a local pointer x to point to it and it works fine. I often do this because it allows me to pass in a pointer or a an actual type as a dummy argument, but in this case it fixed the problem. I’m trying to track down why this is and will let you know what I find. I did some rearranging of the code and created a RedBlackTree type and a few other things.

BTW, I say your recent blog post on git. I think git is the best version control out there. I hope you find the same.

He later sent me this link which explains what happened.

Who is this person, you ask? This is what he’s up to:

BTW, I have written a templating preprocessor for Fortran 95/2003. I’m not sure how familiar you are with generic programming, but since you’re using C++, you are most likely knowledgeable in the Standard Template Library. I know of two other projects that try to provide this kind of capability in Fortran: Forpedo and Parametric Fortran. My project tried to achieve a more native look and feel to the language. The pre-processor is written in Python and I’m working on the Wiki. I have an ACM Fortran Forum article coming up in April on it. The main site is:

blockit

You’ll want to look at the Wiki for more documentation (link is in the upper left). The project comes with PyF95++ which is the templating preprocessor front end. It also included a pretty good start to a standard template library in Fortran. It has different types of linked lists, hashtable, pairs, etc. that are all generic containers, i.e. templated. It also has a unit testing framework for Fortran. You may have colleagues that could use such functionality in Fortran. It’s all under the MIT license. All the best.

Edit: A few days, a few iterations later, the code now looks like:

MODULE mod_tree
!
!================================================== Kate Hedstrom ======
!    Fixes and improvements by David Car (david.car7@gmail.com)        !
!=======================================================================
!  Copyright (c) 2002-2010 The ROMS/TOMS Group                         !
!    Licensed under a MIT/X style license                              !
!    See License_ROMS.txt                                              !
!=======================================================================
!  Set up tree structure and functions.                                !
!=======================================================================
!
  implicit none  !................................................................................
  type treenode
     type(treenode), pointer :: left => null()
     type(treenode), pointer :: right => null()
     type(treenode), pointer :: parent => null()
     logical :: red = .FALSE.
     integer :: eggs = 0 
     real(kind=8)  :: dist = 0.d0
  end type treenode

  type RedBlackTree
     type(treenode), pointer :: root => null()
     integer :: nitems = 0
  end type RedBlackTree
  !................................................................................

  !................................................................................
  ! Global nil node for leaves and parent of root.
  !................................................................................
  type(treenode), target, save :: nil

  PUBLIC :: insert, RedBlackTree
  PRIVATE :: nil

CONTAINS

  SUBROUTINE init(this)
    type (RedBlackTree) :: this
    this%root => nil
  END SUBROUTINE init

  !................................................................................  SUBROUTINE insert(this, eggs, dist)
    !..............................................................................    
    ! Insert a node into the tree
    !..............................................................................
    type (RedBlackTree), target :: this
  
    integer, intent(in) :: eggs
    real(kind=8), intent(in) :: dist
    type(treenode), pointer :: cur, p, x, y

    ! Empty tree, deposit eggs at the top
    if (.not. associated(this % root)) return

    ALLOCATE(cur)
    cur % eggs = eggs
    cur % dist = dist
    cur % left => nil
    cur % right => nil
    this % nitems = this % nitems + 1

    IF (ASSOCIATED(this % root, nil)) THEN
       this % root => cur
       this % root % parent => nil
       RETURN
    ENDIF

    ! Otherwise find somewhere to put these eggs
    ! New nodes end up at the bottom until a rebalance
    p => this%root

    DO
       IF (dist <= p % dist) THEN
          IF (.not. isLeaf(p % left)) THEN
             p => p % left
             CYCLE
          ELSE 
             p % left => cur
             cur % parent => p
             EXIT
          END IF
       ELSE
          IF (.not. isLeaf(p % right)) THEN
             p => p % right
             CYCLE 
          ELSE
             p % right => cur
             cur % parent => p
             EXIT
          END IF
       END IF
    END DO
    ! Balance the thing... red-black for now, until I get smarter about
    ! balancing eggs.
    cur % red = .true.
    x => cur
    p => null()

    DO WHILE (x % parent % red)
       IF (ASSOCIATED(x % parent % parent % left, x % parent)) THEN
          y => x % parent % parent % right   ! uncle
          IF (y % red) THEN
             x % parent % red = .false.
             y % red = .false.
             x % parent % parent % red = .true.
             x => x % parent % parent
          ELSE
             IF (ASSOCIATED(x, x % parent % right)) THEN
                x => x % parent
                CALL rotate_left(this, x)
             END IF
             x % parent % red = .false.
             x % parent % parent % red = .true.
             CALL rotate_right(this, x % parent % parent)
          END IF
       ELSE
          ! Must be right grandchild to get here
          y => x % parent % parent % left    ! aunt
          IF (y % red) THEN
             x % parent % red = .false.
             y % red = .false.
             x % parent % parent % red = .true.
             x => x % parent % parent
          ELSE 
             IF (ASSOCIATED(x, x % parent % left)) THEN
                x => x % parent
                CALL rotate_right(this, x)
             END IF
             x % parent % red = .false.
             x % parent % parent % red = .true.
             CALL rotate_left(this, x % parent % parent)
          END IF
       END IF
    END DO

    this % root % red = .false.

  END SUBROUTINE insert

  !
  ! For the rotating, I’m working from C code I found online for red-black trees,
  ! with reference to Introduction to Algorithms by Cormen, Leiserson,
  ! Rivest (Chapter 14). It makes right child of x into the parent of x.

  !................................................................................
  SUBROUTINE rotate_left(this, x_)
    !..............................................................................
    ! Rotate node x_ to the left in tree `this`
    !..............................................................................
    type(RedBlackTree), intent(inout) :: this
    type(treenode), target, intent(inout) :: x_
    type(treenode), pointer :: x => null()
    type(treenode), pointer :: y => null()
    integer :: mine, theirs

    x => x_
    y => x % right
    x % right => y % left

    IF (.not. isLeaf(y % left)) THEN
       y % left % parent => x
    END IF

    y % parent => x % parent

    IF (ASSOCIATED(x % parent, nil)) THEN
       this % root => y
    ELSE
       IF (ASSOCIATED(x, x % parent % left)) THEN
          x % parent % left => y
       ELSE
          x % parent % right => y
       ENDIF
    END IF
    y % left => x
    x % parent => y
  END SUBROUTINE rotate_left
    
  !................................................................................
  SUBROUTINE rotate_right(this, x_)
    !.............................................................................. 
    ! Rotate node x_ to the right in tree `this`
    !..............................................................................
    type(RedBlackTree), intent(inout) :: this
    type(treenode), target, intent(inout) :: x_
    type(treenode), pointer :: x => null()
    type(treenode), pointer :: y => null()

    x => x_
    y => x % left
    x % left => y % right
    
    IF (.not. isLeaf(y % right)) THEN
       y % right % parent => x
    END IF

    y % parent => x % parent

    IF (ASSOCIATED(x % parent, nil)) THEN
       this % root => y
    ELSE
       IF (ASSOCIATED(x, x % parent % right)) THEN
          x % parent % right => y
       ELSE
          x % parent % left => y
       ENDIF
    END IF
    y % right => x
    x % parent => y
  END SUBROUTINE rotate_right

  !................................................................................
  ! None of these are used at the moment
  !................................................................................

  !................................................................................
  FUNCTION isLeft(x, y) result(b)
    !..............................................................................
    ! Check if node y is left child of x
    !..............................................................................
    type (treenode), pointer :: x, y
    logical :: b
    
    b = ASSOCIATED(x % left, y)
  END FUNCTION isLeft

  !................................................................................
  FUNCTION isRight(x, y) result(b)
    !..............................................................................
    ! Check if node y is right child of x
    !..............................................................................
    type (treenode), pointer :: x, y
    logical :: b

    b = ASSOCIATED(x % right, y)
  END FUNCTION isRight

  !................................................................................
  FUNCTION isLeaf(x) result(b)
    !..............................................................................
    ! Check if node x is a leaf
    !..............................................................................
    type (treenode), pointer :: x
    logical :: b
    b = ASSOCIATED(x, nil)
  END FUNCTION isLeaf
END MODULE mod_tree


!................
! Test code
!................
program main
  use mod_tree
  implicit none
  integer, parameter :: num = 8
  type(RedBlackTree) :: tree
  integer :: caviar(num)
  real(kind=8)  :: dist(num)
  integer :: i

  dist = (/ 21, 2, 3, 56, 78, 5, 7, 4 /)
  caviar = 100*dist

  call init(tree)

  do i=1,num
     call insert(tree, caviar(i), dist(i))
  end do

  print *, 'Done'

end program main

kate - January 3, 2010 @ 23:01
More on git- Comments (0)

Back in a prior post, I wrote an introduction to git, a new open source version control package. I’ve since been learning more about it, especially reading much of the O’Reilly book “Version Control with Git” by Jon Loeliger over the holiday break. There are a few things which I’ve found just a little confusing or surprising coming from a cvs/svn background, in addition to the concept of a distributed repository system. I’d like to talk about one just a little, but first, I’d like to apologize for this incorrect statement from last time:

It can then be pointed to a different SVN server – from the same sandbox! Magic!

It turns out that’s not true – a git sandbox can point to any number of remote git repositories, but only to at most one remote svn server. All is not lost, however, since you can have two git sandboxes, each pointing to one svn server, but as git remotes of each other. Thanks to Brian Powell for sending me
this link.

I actually think you can point to two branches in the same svn site from one git directory, but you have to tell it how the branches are layed out in the “git svn clone” operation. I haven’t tried it yet, though.

The Index
The concept of the index is new to me, and something I didn’t come to appreciate until working with git for a bit. It is a staging area for building up your next commit, between the working directory and the repository. You can have changes going on which should logically be checked in separately – you add the first set to the index with “git add file1”, then commit with “git commit”, leaving changes to file2 to be checked in later, assuming the changes aren’t all in the same file.

Also, “git diff” is a diff between the working directory and the index, not a diff between the working directory and the latest commit (HEAD), as is the case with cvs and svn. To get the diff from the last commit, use instead “git diff HEAD”. Likewise, “git diff –cached” is the diff between the HEAD and the index, showing what would be checked in with a “git commit”.

Also, when doing a “git pull” or a “git merge”, only the conflicts show up with “git diff”. I’ve been caught by this one, thinking I knew what all had arrived in a “git pull” from a colleague! I think the advice of sticking to “git fetch” rather than “git pull” is one I’ll try. By the way, “git pull” is a “git fetch” followed by a “git merge”.

I’m still committed to getting more comfortable with git and using it more intelligently. We got a proposal funded which will involve me having to work with svn at two remote sites in two different states, one of which is now woefully out of date. Call it a New Year’s resolution to get that fixed with git!