Question on subroutine netcdf_get_fvar from mod_netcdf.F

Report or discuss software problems and other woes

Moderators: arango, robertson

Post Reply
Message
Author
stlaur
Posts: 30
Joined: Sun Jun 27, 2010 8:45 pm
Location: Old Dominion University

Question on subroutine netcdf_get_fvar from mod_netcdf.F

#1 Unread post by stlaur »

Hello,
I have a small question regarding subroutine netcdf_get_fvar. This routine is called within get_ngfld.F around line 400 (SVN rev.491M, very recent) to obtain a one-dimensional array called 'A(:)', yet the 'start' and 'total' arguments each have two elements (as if A(:) was two-dimensional). It is this combination of 1-D and 2-D that confuses me, but I may be missing something trivial. Anyway, since 'A(:)' is 1-D, this call is transfered to subroutine netcdf_get_fvar_1d from module mod_netcdf.F. This particular routine tries to store the array 'total(:)' into a one-element array called 'Asize(1)' around line 1960, and this results in runtime error since size(total)=2 > size(Asize)=1.

I'm very curious about this; if someone understands I'd be very happy to know.
Best regards,
Pierre

User avatar
arango
Site Admin
Posts: 1349
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University
Contact:

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#2 Unread post by arango »

Yes, this routine is inside of a module and has an interface:

Code: Select all

      INTERFACE netcdf_get_fvar
        MODULE PROCEDURE netcdf_get_fvar_0d
        MODULE PROCEDURE netcdf_get_fvar_1d
        MODULE PROCEDURE netcdf_get_fvar_2d
        MODULE PROCEDURE netcdf_get_fvar_3d
        MODULE PROCEDURE netcdf_get_fvar_4d
      END INTERFACE netcdf_get_fvar
F90 will pick the correct routine according the argument array or scalar A. This is fancy programming and is always in Fortran; just check the intrinsic functions.

User avatar
kate
Posts: 4088
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#3 Unread post by kate »

Which field is it trying to read?

stlaur
Posts: 30
Joined: Sun Jun 27, 2010 8:45 pm
Location: Old Dominion University

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#4 Unread post by stlaur »

Thanks for your help!
ROMS is trying to get zeta_north(:,:) from my ocean_bry.nc file. My zeta_north(:,:) array has only one record in time, and the other dimension is space (size Lm+2). What happens then is that net_cdf_get_fvar is called over line 400 of get_ngfld:

Code: Select all

CALL netcdf_get_fvar (ng, mode, ncfile, Vname(1,ifield), A, ncid = ncid, start = (/1,Trec/), total = (/Iend-Istr+1,1/))
Here Trec=1 and Iend-Istr+1=Lm+2, which correctly corresponds to the size of my zeta_north(:,:) array.

The part I find confusing is that arguments start(:) and total(:) have two elements (since zeta_north is two-dimensional), while array A(:) is one-dimensional. As Arango mentioned, that means that netcdf_get_fvar_1d is picked by Fortran, and this particular routine tries to store the two-elements array 'total(:)' into a one-element array called 'Asize(1)' around line 1960, which results in runtime error since size(total)=2 > size(Asize)=1.

Many thanks for your help,
Pierre

stlaur
Posts: 30
Joined: Sun Jun 27, 2010 8:45 pm
Location: Old Dominion University

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#5 Unread post by stlaur »

Perhaps a minimal testcase would make things clearer:

Code: Select all

program testcase
  implicit none
  integer, parameter :: UBi = 403, LBi = -2, UBj = 1 ! Dummy values.
  integer            :: ng, model, ncid, Trec, Iend, Istr
  character(400)     :: ncfile, Vname
  real,    dimension((UBi-LBi+1)*UBj) :: A
! Dummy values for testcase.
  ng  =   1; model =  1; ncid = 0; Trec = 1; Istr = 0; Iend = 401
  ncfile = 'ocean_bry.nc'; Vname = 'zeta_north'; A = 0.
! Line 400 of ROMS/Utility/get_ngfld.F (SVN rev.491M).
  CALL netcdf_get_fvar_1d (ng, model, ncfile, Vname, &
                           A,                        &
                           ncid  = ncid,             &
                           start = (/1,Trec/),       &
                           total = (/Iend-Istr+1,1/))
contains
! From module ROMS/Modules/mod_netcdf.F (SVN rev.491M)
subroutine netcdf_get_fvar_1d (ng, model, ncname, myVarName, &
                               A,                            &
                               ncid,                         &
                               start,                        &
                               total)
  implicit none
  integer,      intent(in   )           :: ng, model
  integer,      intent(in   ), optional :: ncid, start(:), total(:)
  character(*), intent(in   )           :: ncname, myVarName
  real,         intent(  out)           :: A(:)
  integer,      dimension(1)            :: Asize
  IF (PRESENT(start).and.PRESENT(total)) THEN
    Asize=total ! <= line causing runtime error.
  ELSE
    Asize(1)=UBOUND(A, DIM=1)
  END IF
end subroutine netcdf_get_fvar_1d
end program testcase
I think it faithfully reproduces what happens. With Sun f95 I get:

Code: Select all

f95 -g -C testcase.f95
./a.out
 ******  FORTRAN RUN-TIME SYSTEM  ******
 An array expression does not conform : dim 1 is 2 , not 1
 Location:  line 30 column 11 of 'testcase.f95'
Abort
I get a similar message with gfortran -fbounds-check.
Best regards,
Pierre

User avatar
arango
Site Admin
Posts: 1349
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University
Contact:

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#6 Unread post by arango »

I suspect that the problem here is that the variable that you want to read from the NetCDF is only one-dimensional, as shown in the example above for zeta_north. This is incorrect. Any open boundary value for ROMS must be at least two-dimensional. If you check the metadata in Data/ROMS/CDL/bry_limit.cdl

Code: Select all

dimensions:
        xi_rho = ? ;                    ! we need to have values for these dimensions
        zeta_time = ? ;

        float zeta_north(zeta_time, xi_rho) ;
                zeta_north:long_name = "free-surface northern boundary condition" ;
                zeta_north:units = "meter" ;
                zeta_north:time = "zeta_time" ;
You always need to have the time dimension even if it is one. Any time-dependent input NetCDF variable to ROMS needs the associated time variable and dimension. The statement in get_ngfld.F is correct. I use this logic all the time. Numerous users of ROMS use boundary files and this is the first time that this is reported. Please check your NetCDF file and let us know.

stlaur
Posts: 30
Joined: Sun Jun 27, 2010 8:45 pm
Location: Old Dominion University

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#7 Unread post by stlaur »

Thanks for this reply! Unfortunately, I checked with ncdump -h, and by opening the file in Matlab, and zeta_north correctly has two dimensions (zeta_time and xi_rho), with one record in time.
Numerous users of ROMS use boundary files and this is the first time that this is reported.
I agree that this is strange. One reason might be that you need to check for out of bounds operation (-g -C with sun f95, or -fbounds-check with gfortran) to get the runtime error. In other words, one has to set USE_DEBUG to see something going wrong.

I'd be happy to test for other things if you have ideas. In the meantime, I must admit I am still puzzled about line 400 of get_ngfld.F. I mean, the testcase does generate an error, and I think the testcase is a close replicate of what happens over line 400 of get_ngfld.F. Am I missing something?

Thanks,

Pierre

User avatar
arango
Site Admin
Posts: 1349
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University
Contact:

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#8 Unread post by arango »

OK, then I need to be able to reproduce the problem. Please try other compilers to see if you get the problem. I am starting to suspect a compiler bug in gfortran. Which version of gfortran are you using?

stlaur
Posts: 30
Joined: Sun Jun 27, 2010 8:45 pm
Location: Old Dominion University

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#9 Unread post by stlaur »

I get the runtime error with:

gfortran (gcc version 4.3.2, from Linux Debian Lenny).

I also get the error on a different machine that uses:

f95: Sun Fortran 95 8.3 SunOS_i386 Patch 127002-04 2008/04/16

and it says

****** FORTRAN RUN-TIME SYSTEM ******
An array expression does not conform : dim 1 is 2 , not 1
Location: line 1315 column 15 of 'mod_netcdf.f90'
Abort

The line mentioned by the compiler is the line I highlighted in the testcase (Asize=total).
Note that I must set USE_DEBUG to "on" to get the error message.

Many thanks for this help.

Pierre

stlaur
Posts: 30
Joined: Sun Jun 27, 2010 8:45 pm
Location: Old Dominion University

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#10 Unread post by stlaur »

I would be very grateful if anybody that does not encounter this issue could describe what is happening in his roms code during the reading of boundary input data such as zeta_north(:,:). It can be simply a matter of inserting a couple of write(*,*) in your code to see what roms does. Just (please) make sure that you:

- start with a fresh make clean;
- activate USE_DEBUG in roms' makefile;
- add all relevant out-of-bounds checks for your compiler in file Compilers/my_OS-my_compiler.mk under the USE_DEBUG define;
- remove all parallelization (no MPI nor OpenMP);
- make sure that your roms application really has to open a netcdf boundary file.

In my case, this is how it goes. First, ROMS/Nonlinear/get_data.F is called by roms, and it looks for the zeta boundary conditions around line 591:

Code: Select all

!
!=======================================================================
!  Read in open boundary conditions from BOUNDARY NetCDF file.
!=======================================================================
!
# ifndef ANA_FSOBC
#  ifdef NORTH_FSOBC
      CALL get_ngfld (ng, iNLM, idZbry(inorth), ncBRYid(ng), 1,         &
     &                BRYname(ng), update(1),                           &
     &                ILB, IUB, 1, 2, 0, Lm(ng)+1, 1,                   &
     &                BOUNDARY(ng) % zetaG_north(ILB,1))
#  endif
# endif
So the routine ROMS/Utility/get_ngfld.F is called by roms. Then, at around line 400 of get_ngfld.F, the routine netcdf_get_fvar is requested:

Code: Select all

              CALL netcdf_get_fvar (ng, model, ncfile, Vname(1,ifield), &
     &                              A,                                  &
     &                              ncid = ncid,                        &
     &                              start = (/1,Trec/),                 &
     &                              total = (/Iend-Istr+1,1/))
In my version of roms, argument A is a one-dimensional array, which means that the call is answered by subroutine netcdf_get_fvar_1d from ROMS/Modules/mod_netcdf.F. Right at the beginning of netcdf_get_fvar_1d, I get a out-of-bound operation when it tries to execute Asize=total. This occurs because total(:) has two elements, while Asize is of size 1.

I'm particularly interested in knowing if your roms code follows the same pathway (i.e. if it calls the same routines), and what are the dimensions of the arguments in the call to these subroutines. In other words, how does your roms code avoid the out-of-bound operation that I described over this thread? I would highly appreciate if someone could help me understand this.

Many thanks,

Pierre

User avatar
arango
Site Admin
Posts: 1349
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University
Contact:

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#11 Unread post by arango »

I finally have the time to check what you were reporting in the debugger. I tried several compilers and I was not able to reproduce the problem. However, I can see what you are talking and indeed this is a bug. It is amazing the things that modern compilers accept nowadays. Very scary... The logic here is kind of delicate because netcdf_get_fvar_1d can be used to read multi-dimensional data compacted into a 1D temporary array. This is the case when reading the open boundary data.

Many thanks for reporting this bug and your persistence with it :!:

Please update. I think that I fixed your problem. Here is the :arrow: ticket.

stlaur
Posts: 30
Joined: Sun Jun 27, 2010 8:45 pm
Location: Old Dominion University

Re: Question on subroutine netcdf_get_fvar from mod_netcdf.F

#12 Unread post by stlaur »

I updated my code and it works great!
Thank you very much for providing this fix.

Best regards,

Pierre

Post Reply