ROMS not running on 1024 cores for 17532 iterations

prakrati
Posts: 24
Joined: Thu Oct 21, 2010 9:35 pm
Location: CRL

#1 Post by prakrati » Fri Nov 12, 2010 11:41 am

Hello,

I ran ocean_benchmark4.in on 1024 cores for 17532 iterations, but it gives a NetCDF output error: the run stops when it tries to create the history file.
Kindly help.
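
One thing that may be worth ruling out: the log below shows NETCDF_CREATE failing on ./output/ocean_his_0001.nc, and that path is relative, so it depends on the working directory of the job and on ./output existing and being writable from there. The following is a minimal Python sketch of that check, not a confirmed diagnosis. The function name check_output_dir is hypothetical and is not part of ROMS; only the ./output path comes from the log.

```python
import os
import tempfile


def check_output_dir(path):
    """Return a list of problems found with an output directory (empty if none)."""
    problems = []
    abs_path = os.path.abspath(path)

    # The directory must already exist before a file can be created inside it.
    if not os.path.isdir(abs_path):
        problems.append(f"{abs_path} does not exist or is not a directory")
        return problems

    # Write and search permission are both needed to create a file in it.
    if not os.access(abs_path, os.W_OK | os.X_OK):
        problems.append(f"{abs_path} is not writable by the current user")
        return problems

    # Actually create a temporary file; this also catches cases such as a
    # full or read-only filesystem that the permission check cannot see.
    try:
        with tempfile.NamedTemporaryFile(dir=abs_path):
            pass
    except OSError as exc:
        problems.append(f"cannot create a file in {abs_path}: {exc}")
    return problems


if __name__ == "__main__":
    # "./output" is the directory from the history-file prefix in the log.
    for line in check_output_dir("./output") or ["./output looks usable"]:
        print(line)
```

Running it from the same working directory as the job (assumed here to be the directory the batch system starts the run in) would show whether the relative path resolves to a usable directory.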

Resource usage summary:

CPU time : 1.56 sec.
Max Memory : 5 MB
Max Swap : 36 MB


The output (if any) follows:

Code:

Process Information:

 Model Input Parameters:  ROMS/TOMS version 3.2  
                          Friday - November 12, 2010 - 11:36:24 AM
 -----------------------------------------------------------------------------

 Benchmark Test, Idealized Southern Ocean, Large Grid

 Operating system : Linux
 CPU/hardware     : x86_64
 Compiler system  : ifort
 Compiler command : /app/intel/impi/3.1/bin64/mpiifort
 Compiler flags   : -ip -O3 -no-prec-div -xT -ftz -fno-alias -funroll-loops  -ip -O3 -xW -free

 Input Script  : /homenc/ipaes84/optimisedROMS/ROMS/ROMS/roms-3.1/ROMS/ocean_benchmark4.in

 SVN Root URL  : https://www.myroms.org/svn/src/trunk
 SVN Revision  : 

 Local Root    : /homenc/ipaes84/optimisedROMS/ROMS/ROMS/roms-3.1
 Header Dir    : /homenc/ipaes84/optimisedROMS/ROMS/ROMS/roms-3.1/ROMS/Include
 Header file   : benchmark.h
 Analytical Dir: /homenc/ipaes84/optimisedROMS/ROMS/ROMS/roms-3.1/ROMS/Functionals

 Resolution, Grid 01: 4192x0272x030,  Parallel Nodes: ***,  Tiling: 128x008


 Physical Parameters, Grid: 01
 =============================

      17532  ntimes          Number of timesteps for 3-D equations.
    150.000  dt              Timestep size (s) for 3-D equations.
         20  ndtfast         Number of timesteps for 2-D equations between
                               each 3D timestep.
          1  ERstr           Starting ensemble/perturbation run number.
          1  ERend           Ending ensemble/perturbation run number.
          0  nrrec           Number of restart records to read from disk.
          T  LcycleRST       Switch to recycle time-records in restart file.
      17532  nRST            Number of timesteps between the writing of data
                               into restart fields.
          1  ninfo           Number of timesteps between print of information
                               to standard output.
          T  ldefout         Switch to create a new output NetCDF file(s).
      17532  nHIS            Number of timesteps between the writing fields
                               into history file.
      17532  ndefHIS         Number of timesteps between creation of new
                               history files.
 5.0000E+02  tnu2(01)        Horizontal, harmonic mixing coefficient (m2/s)
                               for tracer 01: temp
 5.0000E+02  tnu2(02)        Horizontal, harmonic mixing coefficient (m2/s)
                               for tracer 02: salt
 5.0000E+03  visc2           Horizontal, harmonic mixing coefficient (m2/s)
                               for momentum.
 1.0000E-05  Akt_bak(01)     Background vertical mixing coefficient (m2/s)
                               for tracer 01: temp
 1.0000E-05  Akt_bak(02)     Background vertical mixing coefficient (m2/s)
                               for tracer 02: salt
 1.0000E-04  Akv_bak         Background vertical mixing coefficient (m2/s)
                               for momentum.
 3.0000E-04  rdrg            Linear bottom drag coefficient (m/s).
 3.0000E-03  rdrg2           Quadratic bottom drag coefficient.
 2.0000E-02  Zob             Bottom roughness (m).
 1.0000E+01  blk_ZQ          Height (m) of surface air humidity measurement.
 1.0000E+01  blk_ZT          Height (m) of surface air temperature measurement.
 1.0000E+01  blk_ZW          Height (m) of surface winds measurement.
          1  lmd_Jwt         Jerlov water type.
          1  Vtransform      S-coordinate transformation equation.
          1  Vstretching     S-coordinate stretching function.
 5.0000E+00  theta_s         S-coordinate surface control parameter.
 4.0000E-01  theta_b         S-coordinate bottom  control parameter.
    200.000  Tcline          S-coordinate surface/bottom layer width (m) used
                               in vertical coordinate stretching.
   1025.000  rho0            Mean density (kg/m3) for Boussinesq approximation.
      0.000  dstart          Time-stamp assigned to model initialization (days).
       0.00  time_ref        Reference time for units attribute (yyyymmdd.dd)
 0.0000E+00  Tnudg(01)       Nudging/relaxation time scale (days)
                               for tracer 01: temp
 0.0000E+00  Tnudg(02)       Nudging/relaxation time scale (days)
                               for tracer 02: salt
 0.0000E+00  Znudg           Nudging/relaxation time scale (days)
                               for free-surface.
 0.0000E+00  M2nudg          Nudging/relaxation time scale (days)
                               for 2D momentum.
 0.0000E+00  M3nudg          Nudging/relaxation time scale (days)
                               for 3D momentum.
 0.0000E+00  obcfac          Factor between passive and active
                               open boundary conditions.
     10.000  T0              Background potential temperature (C) constant.
     35.000  S0              Background salinity (PSU) constant.
      1.000  gamma2          Slipperiness variable: free-slip (1.0) or 
                                                    no-slip (-1.0).
          T  Hout(idFsur)    Write out free-surface.
          T  Hout(idUbar)    Write out 2D U-momentum component.
          T  Hout(idVbar)    Write out 2D V-momentum component.
          T  Hout(idUvel)    Write out 3D U-momentum component.
          T  Hout(idVvel)    Write out 3D V-momentum component.
          T  Hout(idWvel)    Write out W-momentum component.
          T  Hout(idOvel)    Write out omega vertical velocity.
          T  Hout(idTvar)    Write out tracer 01: temp
          T  Hout(idTvar)    Write out tracer 02: salt
          T  Hout(idTsur)    Write out surface net heat flux.
          T  Hout(idTsur)    Write out surface net salt flux.
          T  Hout(idSrad)    Write out shortwave radiation flux.
          T  Hout(idLrad)    Write out longwave radiation flux.
          T  Hout(idLhea)    Write out latent heat flux.
          T  Hout(idShea)    Write out sensible heat flux.
          T  Hout(idDano)    Write out density anomaly.
          T  Hout(idHsbl)    Write out depth of surface boundary layer.

 Output/Input Files:

             Output Restart File:  ./output/ocean_rst.nc
        Prefix for History Files:  ./output/ocean_his

 Tile partition information for Grid 01:  4192x0272x0030  tiling: 128x008

 
        
       

 Maximum halo size in XI and ETA directions:

               HaloSizeI(1) =     135
               HaloSizeJ(1) =     138
                TileSide(1) =      40
                TileSize(1) =    1560


 Activated C-preprocessing Options:

  BENCHMARK          Benchmark Test, Idealized Southern Ocean, Large Grid
  ALBEDO             Shortwave radiation from albedo equation.
  ANA_BSFLUX         Analytical kinematic bottom salinity flux.
  ANA_BTFLUX         Analytical kinematic bottom temperature flux.
  ANA_CLOUD          Analytical cloud fraction.
  ANA_GRID           Analytical grid set-up.
  ANA_HUMIDITY       Analytical surface air humidity.
  ANA_INITIAL        Analytical initial conditions.
  ANA_PAIR           Analytical surface air pressure.
  ANA_RAIN           Analytical rain fall rate.
  ANA_SRFLUX         Analytical kinematic shortwave radiation flux.
  ANA_SSFLUX         Analytical kinematic surface salinity flux.
  ANA_WINDS          Analytical surface wind components.
  ASSUMED_SHAPE      Using assumed-shape arrays.
  BULK_FLUXES        Surface bulk fluxes parametererization.
  CURVGRID           Orthogonal curvilinear grid.
  DJ_GRADPS          Parabolic Splines density Jacobian (Shchepetkin, 2002).
  DOUBLE_PRECISION   Double precision arithmetic.
  EW_PERIODIC        East-West periodic boundaries.
  LMD_CONVEC         LMD convective mixing due to shear instability.
  LMD_MIXING         Large/McWilliams/Doney interior mixing.
  LMD_NONLOCAL       LMD convective nonlocal transport.
  LMD_RIMIX          LMD diffusivity due to shear instability.
  LMD_SKPP           KPP surface boundary layer mixing.
  LONGWAVE           Compute net longwave radiation internally.
  MIX_GEO_TS         Mixing of tracers along geopotential surfaces.
  MIX_S_UV           Mixing of momentum along constant S-surfaces.
  MPI                MPI distributed-memory configuration.
  NONLINEAR          Nonlinear Model.
  NONLIN_EOS         Nonlinear Equation of State for seawater.
  NORTHERN_WALL      Wall boundary at Northern edge.
  POWER_LAW          Power-law shape time-averaging barotropic filter.
  PROFILE            Time profiling activated .
  !RST_SINGLE        Double precision fields in restart NetCDF file.
  SALINITY           Using salinity.
  SOLAR_SOURCE       Solar Radiation Source Term.
  SOLVE3D            Solving 3D Primitive Equations.
  SOUTHERN_WALL      Wall boundary at Southern edge.
  SPLINES            Conservative parabolic spline reconstruction.
  SPHERICAL          Spherical grid configuration.
  TS_U3HADVECTION    Third-order upstream horizontal advection of tracers.
  TS_C4VADVECTION    Fourth-order centered vertical advection of tracers.
  TS_DIF2            Harmonic mixing of tracers.
  UV_ADV             Advection of momentum.
  UV_COR             Coriolis term.
  UV_U3HADVECTION    Third-order upstream horizontal advection of 3D momentum.
  UV_C4VADVECTION    Fourth-order centered vertical advection of momentum.
  UV_QDRAG           Quadratic bottom stress.
  UV_VIS2            Harmonic mixing of momentum.
  VAR_RHO_2D         Variable density barotropic mode.

 INITIAL: Configuring and initializing forward nonlinear model ...


 Vertical S-coordinate System: 

 level   S-coord     Cs-curve          at_hmin  over_slope     at_hmax

    30   0.0000000   0.0000000           0.000       0.000       0.000
    29  -0.0333333  -0.0024174          -7.392     -11.622     -15.853
    28  -0.0666667  -0.0052838         -14.918     -24.165     -33.412
    27  -0.1000000  -0.0087922         -22.638     -38.024     -53.410
    26  -0.1333333  -0.0131904         -30.624     -53.707     -76.790
    25  -0.1666667  -0.0187972         -38.973     -71.868    -104.763
    24  -0.2000000  -0.0260168         -47.805     -93.334    -138.864
    23  -0.2333333  -0.0353508         -57.272    -119.136    -181.000
    22  -0.2666667  -0.0473981         -67.553    -150.499    -233.446
    21  -0.3000000  -0.0628318         -78.850    -188.805    -298.761
    20  -0.3333333  -0.0823381         -91.368    -235.460    -379.551
    19  -0.3666667  -0.1065031        -105.284    -291.665    -478.045
    18  -0.4000000  -0.1356491        -120.695    -358.081    -595.466
    17  -0.4333333  -0.1696534        -137.563    -434.456    -731.349
    16  -0.4666667  -0.2078237        -155.680    -519.372    -883.063
    15  -0.5000000  -0.2489214        -174.676    -610.289   -1045.901
    14  -0.5333333  -0.2913811        -194.081    -703.998   -1213.915
    13  -0.5666667  -0.3336756        -213.436    -797.368   -1381.301
    12  -0.6000000  -0.3746809        -232.404    -888.096   -1543.788
    11  -0.6333333  -0.4138998        -250.837    -975.161   -1699.486
    10  -0.6666667  -0.4514899        -268.780   -1058.888   -1848.995
     9  -0.7000000  -0.4881475        -286.444   -1140.702   -1994.960
     8  -0.7333333  -0.5249360        -304.147   -1222.785   -2141.423
     7  -0.7666667  -0.5631362        -322.274   -1307.762   -2293.251
     6  -0.8000000  -0.6041494        -341.245   -1398.506   -2455.768
     5  -0.8333333  -0.6494565        -361.504   -1498.053   -2634.601
     4  -0.8666667  -0.7006196        -383.519   -1609.604   -2835.688
     3  -0.9000000  -0.7593114        -407.793   -1736.588   -3065.383
     2  -0.9333333  -0.8273620        -434.875   -1882.759   -3330.642
     1  -0.9666667  -0.9068164        -465.378   -2052.307   -3639.236
     0  -1.0000000  -1.0000000        -500.000   -2250.000   -4000.000

 Time Splitting Weights: ndtfast =  20    nfast =  29

    Primary            Secondary            Accumulated to Current Step

  1-0.0009651193358779 0.0500000000000000-0.0009651193358779 0.0500000000000000
  2-0.0013488780126037 0.0500482559667939-0.0023139973484816 0.1000482559667939
  3-0.0011514592651644 0.0501156998674241-0.0034654566136460 0.1501639558342180
  4-0.0003735756740661 0.0501732728306823-0.0038390322877121 0.2003372286649003
  5 0.0009829200513762 0.0501919516143856-0.0028561122363360 0.2505291802792859
  6 0.0029141799764308 0.0501428056118168 0.0000580677400949 0.3006719858911027
  7 0.0054132615310267 0.0499970966129953 0.0054713292711216 0.3506690825040980
  8 0.0084687837865133 0.0497264335364439 0.0139401130576348 0.4003955160405419
  9 0.0120633394191050 0.0493029943471183 0.0260034524767398 0.4496985103876602
 10 0.0161716623600090 0.0486998273761630 0.0421751148367488 0.4983983377638232
 11 0.0207585511322367 0.0478912442581626 0.0629336659689856 0.5462895820219857
 12 0.0257765478740990 0.0468533167015507 0.0887102138430846 0.5931428987235364
 13 0.0311633730493854 0.0455644893078458 0.1198735868924700 0.6387073880313822
 14 0.0368391158442262 0.0440063206553765 0.1567127027366962 0.6827137086867586
 15 0.0427031802506397 0.0421643648631652 0.1994158829873359 0.7248780735499238
 16 0.0486309868367617 0.0400292058506332 0.2480468698240976 0.7649072794005570
 17 0.0544704302037592 0.0375976565087951 0.3025173000278568 0.8025049359093521
 18 0.0600380921294286 0.0348741349986072 0.3625553921572854 0.8373790709079593
 19 0.0651152103984763 0.0318722303921357 0.4276706025557617 0.8692513013000950
 20 0.0694434033194839 0.0286164698722119 0.4971140058752456 0.8978677711723069
 21 0.0727201499285570 0.0251442997062377 0.5698341558038026 0.9230120708785446
 22 0.0745940258796570 0.0215082922098099 0.6444281816834596 0.9445203630883545
 23 0.0746596950216180 0.0177785909158270 0.7190878767050776 0.9622989540041815
 24 0.0724526566618460 0.0140456061647461 0.7915405333669235 0.9763445601689277
 25 0.0674437485167025 0.0104229733316538 0.8589842818836260 0.9867675335005816
 26 0.0590334053485719 0.0070507859058187 0.9180176872321979 0.9938183194064003
 27 0.0465456732896124 0.0040991156383901 0.9645633605218102 0.9979174350447905
 28 0.0292219798521903 0.0017718319739095 0.9937853403740006 0.9996892670187000
 29 0.0062146596259995 0.0003107329813000 1.0000000000000000 1.0000000000000000

 ndtfast, nfast =   20  29   nfast/ndtfast = 1.45000

 Centers of gravity and integrals (values must be 1, 1, approx 1/2, 1, 1):

    1.000000000000 1.060707743385 0.530353871693 1.000000000000 1.000000000000

 Power filter parameters, Fgamma, gamma =  0.28400   0.14200

 Minimum X-grid spacing, DXmin =  3.26041548E+00 km
 Maximum X-grid spacing, DXmax =  6.14309257E+00 km
 Minimum Y-grid spacing, DYmin =  8.17650180E+00 km
 Maximum Y-grid spacing, DYmax =  8.17650180E+00 km
 Minimum Z-grid spacing, DZmin =  7.39187981E+00 m
 Maximum Z-grid spacing, DZmax =  3.60764219E+02 m

 Minimum barotropic Courant Number =  1.62217863E-01
 Maximum barotropic Courant Number =  4.49102838E-01
 Maximum Coriolis   Courant Number =  2.05618380E-02


 Maximum grid stiffness ratios:  rx0 =   3.966042E-01 (Beckmann and Haidvogel)
                                 rx1 =   1.150589E+01 (Haney)


 Initial basin volumes: TotVolume =  1.65993987026488E+17 m3
                        MinVolume =  1.97754634567622E+08 m3
                        MaxVolume =  1.80931355725002E+10 m3
                          Max/Min =  9.14928523018421E+01

NL ROMS/TOMS: started time-stepping: (Grid: 01 TimeSteps: 00000001 - 00017532)

   STEP   Day HH:MM:SS  KINETIC_ENRG   POTEN_ENRG    TOTAL_ENRG    NET_VOLUME

      0     0 00:00:00  0.000000E+00  1.963001E+04  1.963001E+04  1.659940E+17
      DEF_HIS   - creating history file: ./output/ocean_his_0001.nc

 NETCDF_CREATE - unable to create output NetCDF file:
                 ./output/ocean_his_0001.nc
                 call from:  def_his.F

 DEF_HIS - unable to create history NetCDF file: ./output/ocean_his_0001.nc

 Elapsed CPU time (seconds):

 Node   #  1 CPU:      31.808
 Node   #  2 CPU:      32.160
 Node   #  4 CPU:      32.197
 Node   #  5 CPU:      31.822
 Node   # 11 CPU:      31.789
 Node   # 12 CPU:      31.882
 Node   # 25 CPU:      33.336
 Node   #  6 CPU:      32.078
 Node   # 13 CPU:      31.872
 Node   # 14 CPU:      31.793
 Node   #  0 CPU:      31.172
 Node   # 23 CPU:      31.846
 Node   # 26 CPU:      33.336
 Node   # 29 CPU:      33.332
 Node   # 30 CPU:      33.331
 Node   # 27 CPU:      33.336
 Node   # 28 CPU:      33.333
 Node   # 54 CPU:      31.978
 Node   # 49 CPU:      31.975
 Node   # 60 CPU:      32.063
 Node   # 57 CPU:      32.026
 Node   # 55 CPU:      31.941
 Node   # 53 CPU:      31.979
 Node   #  3 CPU:      31.868
 Node   #  7 CPU:      31.855
 Node   # 15 CPU:      31.693
 Node   # 65 CPU:      31.800
 Node   # 31 CPU:      33.331
 Node   #  9 CPU:      31.755
 Node   # 10 CPU:      31.775
 Node   # 22 CPU:      31.938
 Node   # 20 CPU:      31.907
 Node   # 21 CPU:      31.957
 Node   # 19 CPU:      31.783
 Node   # 44 CPU:      31.846
 Node   # 46 CPU:      31.883
 Node   #129 CPU:      31.494
 Node   # 42 CPU:      31.783
 Node   # 41 CPU:      31.873
 Node   # 39 CPU:      31.805
 Node   # 33 CPU:      31.911
 Node   # 63 CPU:      32.075
 Node   # 43 CPU:      31.792
 Node   # 89 CPU:      32.033
 Node   # 45 CPU:      31.881
 Node   # 91 CPU:      32.070
 Node   #190 CPU:      32.082
 Node   # 47 CPU:      31.885
 Node   # 62 CPU:      32.001
 Node   # 59 CPU:      32.043
 Node   #121 CPU:      32.009
 Node   # 58 CPU:      32.149
 Node   # 51 CPU:      31.827
 Node   #110 CPU:      33.213
 Node   #107 CPU:      33.211
 Node   #109 CPU:      33.211
 Node   #222 CPU:      31.403
 Node   #193 CPU:      31.809
 Node   # 61 CPU:      32.011
 Node   #126 CPU:      32.028
 Node   #122 CPU:      31.995
 Node   #207 CPU:      31.959
 Node   #215 CPU:      31.736
 Node   #219 CPU:      31.738
 Node   #220 CPU:      31.809
 Node   #431 CPU:      31.703
 Node   #113 CPU:      31.989
 Node   #111 CPU:      33.210
 Node   #238 CPU:      31.553
 Node   #123 CPU:      32.027
 Node   #254 CPU:      31.913
 Node   #241 CPU:      31.688
 Node   #246 CPU:      31.609
 Node   #245 CPU:      31.638
 Node   #223 CPU:      31.461
 Node   #116 CPU:      31.771
 Node   #118 CPU:      31.925
 Node   #478 CPU:      31.256
 Node   #237 CPU:      31.629
 Node   #225 CPU:      31.638
 Node   #447 CPU:      31.764
 Node   # 52 CPU:      31.945
 Node   #103 CPU:      31.642
 Node   #415 CPU:      31.900
 Node   #221 CPU:      31.424
 Node   #441 CPU:      31.693
 Node   #442 CPU:      31.756
 Node   #439 CPU:      31.947
 Node   #863 CPU:      31.617
 Node   # 95 CPU:      31.989
 Node   #387 CPU:      31.503
 Node   #191 CPU:      31.961
 Node   #388 CPU:      31.443
 Node   #383 CPU:      31.667
 Node   #385 CPU:      31.403
 Node   #772 CPU:      31.824
 Node   #497 CPU:      31.855
 Node   #247 CPU:      31.674
 Node   #495 CPU:      31.912
 Node   #125 CPU:      32.028
 Node   #510 CPU:      31.882
 Node   #119 CPU:      31.858
 Node   #494 CPU:      31.801
 Node   #493 CPU:      31.887
 Node   #492 CPU:      31.807
 Node   #491 CPU:      31.840
 Node   #985 CPU:      31.758
 Node   #115 CPU:      31.797
 Node   #117 CPU:      31.788
 Node   #105 CPU:      33.209
 Node   #211 CPU:      31.822
 Node   #423 CPU:      31.420
 Node   #849 CPU:      31.791
 Node   #106 CPU:      33.209
 Node   #214 CPU:      31.739
 Node   #213 CPU:      31.608
 Node   #429 CPU:      31.683
 Node   #430 CPU:      31.621
 Node   #427 CPU:      31.729
 Node   #428 CPU:      31.583
 Node   #857 CPU:      32.005
 Node   #831 CPU:      31.872
 Node   #847 CPU:      31.935
 Node   #860 CPU:      31.894
 Node   #855 CPU:      31.864
 Node   #862 CPU:      31.750
 Node   #859 CPU:      31.991
 Node   #445 CPU:      31.691
 Node   #446 CPU:      31.766
 Node   #444 CPU:      31.714
 Node   #443 CPU:      31.773
 Node   #889 CPU:      31.966
 Node   #886 CPU:      31.740
 Node   #108 CPU:      33.211
 Node   #865 CPU:      31.460
 Node   # 50 CPU:      32.055
 Node   #102 CPU:      31.737
 Node   #101 CPU:      31.639
 Node   #206 CPU:      31.813
 Node   #414 CPU:      31.937
 Node   #413 CPU:      31.752
 Node   #205 CPU:      31.951
 Node   #204 CPU:      31.911
 Node   #409 CPU:      31.748
 Node   #410 CPU:      31.722
 Node   #203 CPU:      32.002
 Node   #407 CPU:      31.679
 Node   #412 CPU:      31.792
 Node   #411 CPU:      31.719
 Node   #825 CPU:      31.728
 Node   #828 CPU:      31.774
 Node   #817 CPU:      31.863
 Node   #815 CPU:      31.593
 Node   #823 CPU:      31.758
 Node   #826 CPU:      31.904
 Node   # 97 CPU:      31.568
 Node   #775 CPU:      31.652
 Node   #777 CPU:      31.845
 Node   #386 CPU:      31.490
 Node   #774 CPU:      31.851
 Node   #773 CPU:      31.609
 Node   #769 CPU:      31.536
 Node   #767 CPU:      31.691
 Node   #498 CPU:      31.905
 Node   #991 CPU:      31.421
 Node   #251 CPU:      31.927
 Node   #503 CPU:      31.827
 Node   #*** CPU:      31.937
 Node   #239 CPU:      31.617
 Node   #479 CPU:      31.289
 Node   #484 CPU:      31.805
 Node   #483 CPU:      31.721
 Node   #969 CPU:      31.740
 Node   #970 CPU:      31.914
 Node   #967 CPU:      31.636
 Node   #243 CPU:      31.633
 Node   #990 CPU:      31.391
 Node   #988 CPU:      31.806
 Node   #987 CPU:      31.812
 Node   #986 CPU:      31.492
 Node   #983 CPU:      31.719
 Node   #897 CPU:      31.553
 Node   #895 CPU:      31.938
 Node   #898 CPU:      31.602
 Node   #233 CPU:      31.547
 Node   #468 CPU:      31.578
 Node   #234 CPU:      31.556
 Node   #470 CPU:      31.615
 Node   #469 CPU:      31.618
 Node   #467 CPU:      31.582
 Node   #476 CPU:      31.640
 Node   #475 CPU:      31.575
 Node   #477 CPU:      31.574
 Node   #957 CPU:      31.650
 Node   #235 CPU:      31.718
 Node   #471 CPU:      31.626
 Node   #236 CPU:      31.553
 Node   #473 CPU:      31.600
 Node   #474 CPU:      31.354
 Node   #943 CPU:      32.020
 Node   #948 CPU:      31.839
 Node   #950 CPU:      31.874
 Node   #949 CPU:      31.967
 Node   #417 CPU:      31.734
 Node   #836 CPU:      32.048
 Node   #835 CPU:      32.024
 Node   #850 CPU:      31.700
 Node   #892 CPU:      31.847
 Node   #890 CPU:      31.971
 Node   #879 CPU:      31.498
 Node   #884 CPU:      31.599
 Node   #885 CPU:      31.700
 Node   #866 CPU:      31.681
 Node   #217 CPU:      31.742
 Node   #436 CPU:      32.015
 Node   #435 CPU:      32.013
 Node   #873 CPU:      31.450
 Node   #874 CPU:      31.476
 Node   #871 CPU:      31.646
 Node   #894 CPU:      31.994
 Node   #891 CPU:      31.987
 Node   #887 CPU:      31.649
 Node   #881 CPU:      31.807
 Node   #883 CPU:      31.601
 Node   #771 CPU:      31.546
 Node   #770 CPU:      31.882
 Node   #259 CPU:      31.370
 Node   #127 CPU:      31.955
 Node   #130 CPU:      31.846
 Node   #262 CPU:      31.747
 Node   #261 CPU:      31.442
 Node   #526 CPU:      31.775
 Node   #524 CPU:      31.874
 Node   #523 CPU:      31.740
 Node   #255 CPU:      31.894
 Node   #513 CPU:      31.483
 Node   #511 CPU:      31.849
 Node   #260 CPU:      31.731
 Node   #519 CPU:      31.415
 Node   #522 CPU:      31.761
 Node   #521 CPU:      31.724
 Node   #514 CPU:      31.882
 Node   #*** CPU:      31.672
 Node   #525 CPU:      31.792
 Node   #257 CPU:      31.383
 Node   #516 CPU:      31.739
 Node   #515 CPU:      31.511
 Node   #256 CPU:      31.319
 Node   # 81 CPU:      32.236
 Node   # 79 CPU:      32.126
 Node   # 83 CPU:      32.262
 Node   # 90 CPU:      32.162
 Node   #180 CPU:      32.126
 Node   # 94 CPU:      31.939
 Node   #189 CPU:      31.899
 Node   #164 CPU:      32.140
 Node   #159 CPU:      32.190
 Node   # 85 CPU:      32.360
 Node   #181 CPU:      32.091
 Node   #364 CPU:      31.716
 Node   #182 CPU:      32.075
 Node   #366 CPU:      31.761
 Node   #179 CPU:      31.995
 Node   #361 CPU:      31.694
 Node   #362 CPU:      31.716
 Node   #175 CPU:      31.999
 Node   #351 CPU:      31.431
 Node   #183 CPU:      32.071
 Node   #367 CPU:      31.778
 Node   #382 CPU:      31.751
 Node   #380 CPU:      31.727
 Node   #379 CPU:      31.751
 Node   #369 CPU:      31.536
 Node   #161 CPU:      32.196
 Node   #324 CPU:      31.844
 Node   #321 CPU:      31.470
 Node   #319 CPU:      33.224
 Node   #363 CPU:      31.842
 Node   #359 CPU:      31.545
 Node   # 87 CPU:      32.353
 Node   #381 CPU:      31.734
 Node   # 92 CPU:      32.052
 Node   #737 CPU:      31.699
 Node   #258 CPU:      31.718
 Node   #518 CPU:      31.733
 Node   #517 CPU:      31.407
 Node   #128 CPU:      31.588
 Node   #132 CPU:      31.792
 Node   #131 CPU:      31.544
 Node   #265 CPU:      31.879
 Node   #266 CPU:      31.829
 Node   #263 CPU:      31.461
 Node   #534 CPU:      31.968
 Node   #527 CPU:      31.652
 Node   #532 CPU:      31.901
 Node   #531 CPU:      31.789
 Node   #529 CPU:      31.954
 Node   #530 CPU:      31.962
 Node   #533 CPU:      31.936
 Node   #264 CPU:      31.735
 Node   # 64 CPU:      31.844
 Node   #134 CPU:      31.817
 Node   #270 CPU:      31.886
 Node   #269 CPU:      31.930
 Node   #542 CPU:      31.810
 Node   #539 CPU:      31.769
 Node   #541 CPU:      31.853
 Node   #540 CPU:      31.905
 Node   # 66 CPU:      31.942
 Node   #133 CPU:      31.524
 Node   #268 CPU:      32.000
 Node   #267 CPU:      31.873
 Node   #537 CPU:      31.816
 Node   #535 CPU:      31.863
 Node   #538 CPU:      31.862
 Node   #512 CPU:      31.414
 Node   # 32 CPU:      31.949
 Node   #167 CPU:      32.068
 Node   #335 CPU:      31.624
 Node   #337 CPU:      31.854
 Node   #671 CPU:      32.244
 Node   #174 CPU:      31.955
 Node   #350 CPU:      31.453
 Node   # 86 CPU:      32.339
 Node   #173 CPU:      32.035
 Node   #348 CPU:      31.715
 Node   #349 CPU:      31.701
 Node   #347 CPU:      31.749
 Node   #697 CPU:      31.678
 Node   #698 CPU:      31.677
 Node   #171 CPU:      31.970
 Node   #343 CPU:      31.949
 Node   #172 CPU:      32.062
 Node   #345 CPU:      31.721
 Node   #346 CPU:      31.829
 Node   #365 CPU:      31.749
 Node   #734 CPU:      31.303
 Node   #726 CPU:      31.840
 Node   #719 CPU:      31.681
 Node   #721 CPU:      31.818
 Node   #703 CPU:      31.672
 Node   #641 CPU:      31.516
 Node   #644 CPU:      31.822
 Node   #643 CPU:      31.581
 Node   #646 CPU:      31.868
 Node   #642 CPU:      31.896
 Node   #639 CPU:      31.467
 Node   #673 CPU:      31.826
 Node   #687 CPU:      31.863
 Node   #694 CPU:      31.762
 Node   #692 CPU:      31.644
 Node   #761 CPU:      31.690
 Node   #759 CPU:      31.517
 Node   #766 CPU:      31.548
 Node   #763 CPU:      31.575
 Node   #764 CPU:      31.680
 Node   #765 CPU:      31.679
 Node   #185 CPU:      31.910
 Node   #372 CPU:      31.373
 Node   #735 CPU:      31.394
 Node   #738 CPU:      31.821
 Node   #729 CPU:      31.651
 Node   #730 CPU:      31.438
 Node   #727 CPU:      31.882
 Node   #731 CPU:      31.614
 Node   #732 CPU:      31.724
 Node   #733 CPU:      31.330
 Node   #724 CPU:      31.820
 Node   #725 CPU:      31.708
 Node   #722 CPU:      31.760
 Node   #353 CPU:      31.413
 Node   #708 CPU:      31.830
 Node   #707 CPU:      31.832
 Node   #705 CPU:      31.726
 Node   #723 CPU:      31.790
 Node   #360 CPU:      31.710
 Node   #706 CPU:      31.821
 Node   #322 CPU:      31.844
 Node   #645 CPU:      31.584
 Node   #649 CPU:      31.814
 Node   #650 CPU:      31.750
 Node   #647 CPU:      31.988
 Node   #323 CPU:      31.635
 Node   # 84 CPU:      32.359
 Node   #338 CPU:      31.869
 Node   #677 CPU:      31.776
 Node   #678 CPU:      31.752
 Node   #702 CPU:      31.671
 Node   #699 CPU:      31.690
 Node   #700 CPU:      31.932
 Node   #701 CPU:      31.815
 Node   #689 CPU:      31.606
 Node   #690 CPU:      31.857
 Node   #693 CPU:      31.844
 Node   #691 CPU:      31.656
 Node   #169 CPU:      32.034
 Node   #339 CPU:      31.875
 Node   #679 CPU:      31.615
 Node   #170 CPU:      32.031
 Node   #342 CPU:      31.958
 Node   #341 CPU:      31.953
 Node   #686 CPU:      31.784
 Node   #684 CPU:      31.873
 Node   #683 CPU:      31.790
 Node   #340 CPU:      31.954
 Node   #685 CPU:      31.847
 Node   #674 CPU:      31.824
 Node   #676 CPU:      31.781
 Node   #675 CPU:      31.823
 Node   #762 CPU:      31.563
 Node   #371 CPU:      31.407
 Node   #745 CPU:      31.561
 Node   #746 CPU:      31.585
 Node   #186 CPU:      31.957
 Node   #374 CPU:      31.531
 Node   #750 CPU:      31.556
 Node   #749 CPU:      31.482
 Node   #373 CPU:      31.448
 Node   #370 CPU:      31.569
 Node   #741 CPU:      31.847
 Node   #742 CPU:      31.826
 Node   #177 CPU:      31.824
 Node   #356 CPU:      31.637
 Node   #713 CPU:      31.573
 Node   #714 CPU:      31.745
 Node   #355 CPU:      31.514
 Node   #711 CPU:      31.690
 Node   #709 CPU:      31.697
 Node   #354 CPU:      31.587
 Node   #710 CPU:      31.669
 Node   #160 CPU:      32.209
 Node   #325 CPU:      31.510
 Node   #651 CPU:      31.790
 Node   #652 CPU:      31.863
 Node   #695 CPU:      31.763
 Node   #681 CPU:      31.857
 Node   #682 CPU:      31.749
 Node   #168 CPU:      31.849
 Node   #162 CPU:      32.185
 Node   #326 CPU:      31.780
 Node   #654 CPU:      31.652
 Node   #653 CPU:      31.765
 Node   # 93 CPU:      31.963
 Node   #743 CPU:      31.861
 Node   #747 CPU:      31.594
 Node   #748 CPU:      31.401
 Node   #739 CPU:      31.770
 Node   #740 CPU:      31.864
 Node   #184 CPU:      32.099
 Node   #358 CPU:      31.639
 Node   #718 CPU:      31.639
 Node   #717 CPU:      31.791
 Node   #178 CPU:      32.141
 Node   #357 CPU:      31.540
 Node   #715 CPU:      31.872
 Node   #716 CPU:      31.738
 Node   #176 CPU:      32.043
 Node   #704 CPU:      31.643
 Node   #720 CPU:      31.719
 Node   #696 CPU:      31.950
 Node   #680 CPU:      31.481
 Node   #672 CPU:      31.839
 Node   #640 CPU:      31.608
 Node   #163 CPU:      32.182
 Node   #327 CPU:      31.838
 Node   #655 CPU:      31.831
 Node   #329 CPU:      31.495
 Node   #330 CPU:      31.548
 Node   #660 CPU:      31.864
 Node   #661 CPU:      31.747
 Node   #662 CPU:      31.817
 Node   #657 CPU:      31.804
 Node   #658 CPU:      31.912
 Node   #328 CPU:      31.392
 Node   #659 CPU:      31.728
 Node   #656 CPU:      31.779
 Node   # 80 CPU:      32.139
 Node   #187 CPU:      31.957
 Node   #751 CPU:      31.639
 Node   #375 CPU:      31.458
 Node   #753 CPU:      31.657
 Node   #754 CPU:      31.632
 Node   #188 CPU:      32.040
 Node   #377 CPU:      31.734
 Node   #378 CPU:      31.851
 Node   #758 CPU:      31.422
 Node   #756 CPU:      31.760
 Node   #757 CPU:    Node   # 67 CPU:      31.853
 Node   #135 CPU:      31.959
 Node   #271 CPU:      31.804
 Node   #543 CPU:      31.878
 Node   #545 CPU:      31.450
 Node   #546 CPU:      31.444
 Node   # 68 CPU:      31.740
 Node   #273 CPU:      33.202
 Node   #548 CPU:      31.365
 Node   #547 CPU:      31.470
 Node   #274 CPU:      33.204
 Node   #550 CPU:      31.415
 Node   #549 CPU:      31.344
 Node   #136 CPU:      32.037
 Node   #137 CPU:      32.116
 Node   #275 CPU:      33.202
 Node   #276 CPU:      33.204
 Node   #551 CPU:      31.291
 Node   #554 CPU:      31.576
 Node   #553 CPU:      31.830
 Node   #138 CPU:      32.055
 Node   #278 CPU:      33.201
 Node   #277 CPU:      33.200
 Node   #558 CPU:      31.798
 Node   #556 CPU:      31.592
 Node   #557 CPU:      31.799
 Node   #555 CPU:      31.622
   31.466
 Node   #755 CPU:      31.445
 Node   #736 CPU:      31.653
 Node   #744 CPU:      31.671
 Node   #752 CPU:      31.524
 Node   #712 CPU:      31.617
 Node   #728 CPU:      31.438
 Node   #688 CPU:      31.616
 Node   #336 CPU:      31.863
 Node   #648 CPU:      31.734
 Node   #320 CPU:      31.445
 Node   #368 CPU:      31.221
 Node   #760 CPU:      31.499
 Node   #376 CPU:      31.707
 Node   # 88 CPU:      32.070
 Node   #352 CPU:      31.491
 Node   # 17 CPU:      31.976
 Node   #344 CPU:      31.626
 Node   #166 CPU:      32.156
 Node   #334 CPU:      31.408
 Node   #333 CPU:      31.681
 Node   # 82 CPU:      32.253
 Node   #165 CPU:      32.128
 Node   #331 CPU:      31.539
 Node   #332 CPU:      31.531
 Node   #670 CPU:      32.289
 Node   #669 CPU:      32.138
 Node   #665 CPU:      32.140
 Node   #666 CPU:      32.158
 Node   #663 CPU:      31.736
 Node   #668 CPU:      32.191
 Node   #667 CPU:      32.239
 Node   #664 CPU:      32.191
 Node   # 40 CPU:      31.743
 Node   # 34 CPU:      31.920
 Node   # 70 CPU:      31.944
 Node   # 69 CPU:      31.830
 Node   #142 CPU:      31.944
 Node   #139 CPU:      32.078
 Node   #279 CPU:      33.203
 Node   #141 CPU:      32.050
 Node   #286 CPU:      33.217
 Node   #285 CPU:      33.220
 Node   #283 CPU:      33.219
 Node   #284 CPU:      33.220
 Node   #569 CPU:      31.567
 Node   #559 CPU:      31.808
 Node   #574 CPU:      31.559
 Node   #561 CPU:      31.838
 Node   #570 CPU:      31.703
 Node   #571 CPU:      31.528
 Node   #567 CPU:      31.823
 Node   #140 CPU:      32.156
 Node   #562 CPU:      31.905
 Node   #573 CPU:      31.496
 Node   #572 CPU:      31.619
 Node   #281 CPU:      33.217
 Node   #564 CPU:      31.803
 Node   #563 CPU:      31.673
 Node   #280 CPU:      33.219
 Node   #282 CPU:      33.217
 Node   #566 CPU:      31.787
 Node   #565 CPU:      31.851
 Node   #560 CPU:      31.595
 Node   # 36 CPU:      31.835
 Node   # 35 CPU:      31.948
 Node   # 73 CPU:      31.905
 Node   # 74 CPU:      32.082
 Node   # 71 CPU:      31.864
 Node   #148 CPU:      32.045
 Node   #147 CPU:      31.907
 Node   #149 CPU:      31.940
 Node   #150 CPU:      32.014
 Node   #143 CPU:      32.116
 Node   #297 CPU:      33.220
 Node   #298 CPU:      33.220
 Node   #295 CPU:      33.300
 Node   #302 CPU:      33.224
 Node   #301 CPU:      33.221
 Node   #300 CPU:      33.221
 Node   #299 CPU:      33.220
 Node   #287 CPU:      33.220
 Node   #598 CPU:      31.578
 Node   #601 CPU:      31.801
 Node   #606 CPU:      31.664
 Node   #577 CPU:      31.644
 Node   #591 CPU:      31.773
 Node   #593 CPU:      31.490
 Node   #596 CPU:      31.656
 Node   #602 CPU:      31.940
 Node   #599 CPU:      31.682
 Node   #604 CPU:      31.830
 Node   #603 CPU:      31.834
 Node   #289 CPU:      33.294
 Node   #575 CPU:      31.618
 Node   #578 CPU:      31.940
 Node   #594 CPU:      31.500
 Node   #605 CPU:      31.846
 Node   #290 CPU:      33.300
 Node   #582 CPU:      32.004
 Node   #581 CPU:      31.665
 Node   #597 CPU:      31.652
 Node   #595 CPU:      31.510
 Node   #296 CPU:      33.221
 Node   #288 CPU:      33.296
 Node   #580 CPU:      31.581
 Node   #579 CPU:      31.661
 Node   #144 CPU:      31.932
 Node   #145 CPU:      31.958
 Node   #292 CPU:      33.301
 Node   #291 CPU:      33.297
 Node   #586 CPU:      31.729
 Node   #585 CPU:      31.563
 Node   #583 CPU:      31.663
 Node   #146 CPU:      32.115
 Node   #294 CPU:      33.300
 Node   #293 CPU:      33.303
 Node   #590 CPU:      31.684
 Node   #588 CPU:      31.715
 Node   #587 CPU:      31.721
 Node   #589 CPU:      31.861
 Node   # 72 CPU:      31.951
 Node   #576 CPU:      31.686
 Node   #592 CPU:      31.394
 Node   #584 CPU:      31.610
 Node   # 18 CPU:      31.944
 Node   # 37 CPU:      31.852
 Node   # 76 CPU:      32.064
 Node   # 75 CPU:      32.079
 Node   #153 CPU:      32.091
 Node   #154 CPU:      32.105
 Node   #151 CPU:      31.935
 Node   #310 CPU:      33.222
 Node   # 38 CPU:      31.927
 Node   # 78 CPU:      31.929
 Node   #158 CPU:      32.214
 Node   #157 CPU:      32.115
 Node   #318 CPU:      33.226
 Node   #317 CPU:      33.224
 Node   #303 CPU:      33.220
 Node   #308 CPU:      33.224
 Node   #316 CPU:      33.223
 Node   #315 CPU:      33.227
 Node   #638 CPU:      31.543
 Node   #607 CPU:      31.711
 Node   #309 CPU:      33.225
 Node   #622 CPU:      31.569
 Node   #307 CPU:      33.228
 Node   #618 CPU:      31.669
 Node   #617 CPU:      31.674
 Node   #620 CPU:      31.635
 Node   #619 CPU:      31.728
 Node   #633 CPU:      31.633
 Node   #636 CPU:      31.541
 Node   #634 CPU:      31.650
 Node   #631 CPU:      31.533
 Node   #305 CPU:      33.228
 Node   #612 CPU:      31.591
 Node   #611 CPU:      31.539
 Node   #615 CPU:      31.584
 Node   #635 CPU:      31.572
 Node   #637 CPU:      31.540
 Node   #609 CPU:      31.506
 Node   #610 CPU:      31.635
 Node   #621 CPU:      31.589
 Node   # 77 CPU:      32.170
 Node   #306 CPU:      33.228
 Node   #614 CPU:      31.670
 Node   #613 CPU:      31.669
 Node   #304 CPU:      33.228
 Node   #152 CPU:      32.117
 Node   #155 CPU:      32.140
 Node   #311 CPU:      33.225
 Node   #544 CPU:      31.502
 Node   #552 CPU:      31.353
 Node   #623 CPU:      31.660
 Node   #625 CPU:      31.674
 Node   #626 CPU:      31.606
 Node   #312 CPU:      33.225
 Node   #156 CPU:      32.146
 Node   #313 CPU:      33.226
 Node   #314 CPU:      33.229
 Node   #630 CPU:      31.622
 Node   #628 CPU:      31.468
 Node   #629 CPU:      31.487
 Node   #627 CPU:      31.491
 Node   #624 CPU:      31.346
 Node   #632 CPU:      31.616
 Node   #608 CPU:      31.480
 Node   #600 CPU:      31.647
 Node   #568 CPU:      31.615
 Node   #536 CPU:      31.479
 Node   #  8 CPU:      31.652
 Node   #616 CPU:      31.637
 Node   # 16 CPU:      31.743
 Node   #520 CPU:      31.674
 Node   #272 CPU:      33.204
 Node   #528 CPU:      31.785
 Node   #996 CPU:      31.940
 Node   #997 CPU:      31.919
 Node   #998 CPU:      31.925
 Node   #993 CPU:      31.819
 Node   #509 CPU:      31.851
 Node   #252 CPU:      31.909
 Node   #*** CPU:      31.621
 Node   #959 CPU:      31.729
 Node   #481 CPU:      31.666
 Node   #964 CPU:      31.734
 Node   #963 CPU:      31.774
 Node   #961 CPU:      31.697
 Node   #962 CPU:      31.714
 Node   #487 CPU:      31.837
 Node   #975 CPU:      31.920
 Node   #977 CPU:      31.689
 Node   #249 CPU:      31.909
 Node   #499 CPU:      31.836
 Node   #999 CPU:      31.936
 Node   #500 CPU:      31.906
 Node   #*** CPU:      31.887
 Node   #*** CPU:      31.907
 Node   #*** CPU:      31.672
 Node   #*** CPU:      31.670
 Node   #*** CPU:      31.670
 Node   #*** CPU:      31.664
 Node   #505 CPU:      31.879
 Node   #*** CPU:      31.701
 Node   #*** CPU:      31.536
 Node   #506 CPU:      31.889
 Node   #*** CPU:      31.658
 Node   #*** CPU:      31.700
 Node   #*** CPU:      31.853
 Node   #833 CPU:      31.875
 Node   #418 CPU:      31.580
 Node   #838 CPU:      32.004
 Node   #837 CPU:      31.921
 Node   #858 CPU:      31.985
 Node   #861 CPU:      32.003
 Node   #424 CPU:      31.264
 Node   #834 CPU:      32.028
 Node   #416 CPU:      31.596
 Node   #433 CPU:      31.769
 Node   #868 CPU:      31.868
 Node   #867 CPU:      31.529
 Node   #893 CPU:      31.994
 Node   #882 CPU:      31.835
 Node   #440 CPU:      31.767
 Node   #201 CPU:      31.798
 Node   #403 CPU:      31.641
 Node   #404 CPU:      31.773
 Node   #807 CPU:      31.364
 Node   #809 CPU:      31.664
 Node   #810 CPU:      31.815
 Node   #202 CPU:      31.919
 Node   #405 CPU:      31.680
 Node   #406 CPU:      31.746
 Node   #812 CPU:      31.470
 Node   #814 CPU:      31.889
 Node   #813 CPU:      31.579
 Node   #811 CPU:      31.718
 Node   #100 CPU:      31.656
 Node   #199 CPU:      31.785
 Node   #399 CPU:      31.924
 Node   #799 CPU:      31.784
 Node   #801 CPU:      31.809
 Node   #802 CPU:      31.804
 Node   #800 CPU:      31.893
 Node   #212 CPU:      31.707
 Node   #208 CPU:      31.637
 Node   #218 CPU:      31.501
 Node   #438 CPU:      31.955
 Node   #437 CPU:      31.884
 Node   #878 CPU:      31.498
 Node   #876 CPU:      31.440
 Node   #877 CPU:      31.440
 Node   #875 CPU:      31.576
 Node   #768 CPU:      31.536
 Node   #192 CPU:      31.842
 Node   #384 CPU:      31.501
 Node   #822 CPU:      31.767
 Node   #820 CPU:      31.727
 Node   #818 CPU:      31.905
 Node   #829 CPU:      31.813
 Node   #827 CPU:      31.845
 Node   #830 CPU:      31.808
 Node   #821 CPU:      31.892
 Node   #819 CPU:      31.615
 Node   #408 CPU:      31.616
 Node   #816 CPU:      31.655
 Node   #824 CPU:      31.827
 Node   #425 CPU:      31.809
 Node   #852 CPU:      31.860
 Node   #851 CPU:      31.715
 Node   #426 CPU:      31.555
 Node   #854 CPU:      31.818
 Node   #853 CPU:      31.863
 Node   #210 CPU:      31.645
 Node   #422 CPU:      31.506
 Node   #421 CPU:      31.557
 Node   #846 CPU:      31.645
 Node   #845 CPU:      31.922
 Node   #843 CPU:      31.807
 Node   #844 CPU:      31.921
 Node   #432 CPU:      31.894
 Node   #434 CPU:      31.982
 Node   #870 CPU:      31.684
 Node   #869 CPU:      31.601
 Node   #872 CPU:      31.451
 Node   #216 CPU:      31.519
 Node   #778 CPU:      31.856
 Node   #776 CPU:      31.751
 Node   #195 CPU:      31.947
 Node   #196 CPU:      31.724
 Node   #391 CPU:      31.457
 Node   #783 CPU:      31.806
 Node   #785 CPU:      31.658
 Node   #786 CPU:      31.833
 Node   #393 CPU:      31.862
 Node   #787 CPU:      31.810
 Node   #788 CPU:      31.827
 Node   #394 CPU:      31.824
 Node   #790 CPU:      31.744
 Node   #789 CPU:      31.825
 Node   #392 CPU:      31.792
 Node   #808 CPU:      31.247
 Node   # 99 CPU:      31.610
 Node   #401 CPU:      31.683
 Node   #804 CPU:      31.701
 Node   #803 CPU:      31.780
 Node   #402 CPU:      31.765
 Node   #806 CPU:      31.654
 Node   #805 CPU:      31.656
 Node   #200 CPU:      31.840
 Node   #400 CPU:      31.785
 Node   #209 CPU:      31.727
 Node   #420 CPU:      31.550
 Node   #419 CPU:      31.731
 Node   #841 CPU:      31.644
 Node   #842 CPU:      31.930
 Node   #839 CPU:      32.046
 Node   #840 CPU:      31.668
 Node   #104 CPU:      33.209
 Node   #880 CPU:      31.482
 Node   #194 CPU:      31.812
 Node   #389 CPU:      31.466
 Node   #779 CPU:      31.879
 Node   #780 CPU:      31.960
 Node   #390 CPU:      31.497
 Node   #782 CPU:      31.951
 Node   #781 CPU:      31.906
 Node   # 98 CPU:      31.695
 Node   #197 CPU:      31.776
 Node   #396 CPU:      31.942
 Node   #395 CPU:      31.867
 Node   #793 CPU:      31.701
 Node   #794 CPU:      31.749
 Node   #791 CPU:      31.634
 Node   #792 CPU:      31.425
 Node   #198 CPU:      31.875
 Node   #397 CPU:      31.849
 Node   #398 CPU:      31.740
 Node   #796 CPU:      31.840
 Node   #795 CPU:      31.824
 Node   #798 CPU:      31.837
 Node   #797 CPU:      31.840
 Node   # 24 CPU:      33.327
 Node   #848 CPU:      31.714
 Node   # 96 CPU:      31.518
 Node   # 48 CPU:      31.810
 Node   #832 CPU:      31.838
 Node   #864 CPU:      31.542
 Node   #856 CPU:      31.753
 Node   #888 CPU:      31.941
 Node   #784 CPU:      31.714
 Node   #452 CPU:      31.576
 Node   #451 CPU:      31.496
 Node   #903 CPU:      31.593
 Node   #905 CPU:      31.948
 Node   #906 CPU:      31.939
 Node   #231 CPU:      31.774
 Node   #463 CPU:      31.686
 Node   #927 CPU:      31.842
 Node   #929 CPU:      31.792
 Node   #942 CPU:      31.999
 Node   #941 CPU:      32.014
 Node   #939 CPU:      31.989
 Node   #940 CPU:      31.994
 Node   #937 CPU:      32.010
 Node   #938 CPU:      31.937
 Node   #935 CPU:      31.617
 Node   #953 CPU:      31.662
 Node   #951 CPU:      31.903
 Node   #956 CPU:      31.696
 Node   #955 CPU:      31.763
 Node   #958 CPU:      31.744
 Node   #945 CPU:      31.664
 Node   #946 CPU:      31.933
 Node   #947 CPU:      31.819
 Node   #449 CPU:      31.397
 Node   #448 CPU:      31.605
 Node   #454 CPU:      31.462
 Node   #909 CPU:      31.971
 Node   #226 CPU:      31.724
 Node   #453 CPU:      31.365
 Node   #907 CPU:      31.943
 Node   #908 CPU:      31.978
 Node   #910 CPU:      31.822
 Node   #482 CPU:      31.776
 Node   #966 CPU:      31.728
 Node   #965 CPU:      31.663
 Node   #960 CPU:      31.699
 Node   #995 CPU:      31.897
 Node   #994 CPU:      31.894
 Node   #250 CPU:      32.029
 Node   #502 CPU:      31.811
 Node   #501 CPU:      31.816
 Node   #*** CPU:      31.871
 Node   #*** CPU:      31.945
 Node   #*** CPU:      31.916
 Node   #*** CPU:      31.875
 Node   #465 CPU:      31.608
 Node   #930 CPU:      31.781
 Node   #936 CPU:      31.667
 Node   #899 CPU:      31.599
 Node   #900 CPU:      31.556
 Node   #450 CPU:      31.559
 Node   #902 CPU:      31.618
 Node   #901 CPU:      31.610
 Node   #224 CPU:      31.528
 Node   #112 CPU:      31.656
 Node   #954 CPU:      31.733
 Node   #952 CPU:      31.737
 Node   #932 CPU:      31.740
 Node   #931 CPU:      31.769
 Node   #466 CPU:      31.679
 Node   #934 CPU:      31.786
 Node   #933 CPU:      31.754
 Node   #928 CPU:      31.810
 Node   #232 CPU:      31.653
 Node   #896 CPU:      31.640
 Node   #989 CPU:      31.729
 Node   #244 CPU:      31.748
 Node   #978 CPU:      32.033
 Node   #490 CPU:      31.844
 Node   #982 CPU:      31.700
 Node   #981 CPU:      31.708
 Node   #489 CPU:      31.817
 Node   #980 CPU:      31.667
 Node   #979 CPU:      31.665
 Node   #242 CPU:      31.790
 Node   #485 CPU:      31.787
 Node   #971 CPU:      31.837
 Node   #972 CPU:      31.890
 Node   #486 CPU:      31.772
 Node   #974 CPU:      31.795
 Node   #973 CPU:      31.896
 Node   #240 CPU:      31.586
 Node   #480 CPU:      31.599
 Node   #253 CPU:      31.906
 Node   #504 CPU:      31.806
 Node   #248 CPU:      31.839
 Node   #124 CPU:      32.018
 Node   #*** CPU:      31.950
 Node   #507 CPU:      31.891
 Node   #*** CPU:      31.601
 Node   #*** CPU:      31.586
 Node   #508 CPU:      31.857
 Node   #*** CPU:      31.671
 Node   #*** CPU:      31.663
 Node   #*** CPU:      31.488
 Node   #472 CPU:      31.358
 Node   #464 CPU:      31.492
 Node   #944 CPU:      31.758
 Node   #984 CPU:      31.511
 Node   #488 CPU:      31.909
 Node   #968 CPU:      31.807
 Node   #120 CPU:      31.973
 Node   #976 CPU:      31.591
 Node   #496 CPU:      31.786
 Node   #992 CPU:      31.790
 Node   #455 CPU:      31.369
 Node   #457 CPU:      31.470
 Node   #227 CPU:      31.679
 Node   #911 CPU:      32.001
 Node   #913 CPU:      31.863
 Node   #228 CPU:      31.787
 Node   #914 CPU:      31.964
 Node   #456 CPU:      31.667
 Node   #915 CPU:      31.838
 Node   #916 CPU:      31.935
 Node   #458 CPU:      31.836
 Node   #918 CPU:      31.763
 Node   #917 CPU:      31.750
 Node   #912 CPU:      31.919
 Node   #904 CPU:      31.886
 Node   #229 CPU:      31.774
 Node   #459 CPU:      31.664
 Node   #460 CPU:      31.846
 Node   #230 CPU:      31.747
 Node   #461 CPU:      31.679
 Node   #924 CPU:      31.797
 Node   #462 CPU:      31.520
 Node   #925 CPU:      31.763
 Node   #114 CPU:      32.003
 Node   #919 CPU:      31.841
 Node   #921 CPU:      31.811
 Node   #922 CPU:      31.825
 Node   #926 CPU:      31.893
 Node   #923 CPU:      31.749
 Node   #920 CPU:      31.637
 Node   # 56 CPU:      32.085
 Total:             32621.340

 Nonlinear model elapsed time profile:

  Initialization ...................................     18360.049  (56.2823 %)
  Processing of input data .........................      1015.436  ( 3.1128 %)
  Computation of vertical boundary conditions ......        87.350  ( 0.2678 %)
  Computation of global information integrals ......      1880.285  ( 5.7640 %)
  2D/3D coupling, vertical metrics .................       486.022  ( 1.4899 %)
  Omega vertical velocity ..........................       375.724  ( 1.1518 %)
  Equation of state for seawater ...................      1370.897  ( 4.2025 %)
  Atmosphere-Ocean bulk flux parameterization ......       327.061  ( 1.0026 %)
  KPP vertical mixing parameterization .............       364.822  ( 1.1184 %)
                                              Total:     24267.645   74.3919

 Nonlinear model message Passage profile:

  Message Passage: 2D halo exchanges ...............      7772.868  (23.8276 %)
  Message Passage: 3D halo exchanges ...............      3648.635  (11.1848 %)
  Message Passage: 4D halo exchanges ...............       442.725  ( 1.3572 %)
  Message Passage: data broadcast ..................      2629.493  ( 8.0607 %)
  Message Passage: data reduction ..................     12382.207  (37.9574 %)
                                              Total:     26875.928   82.3876

 All percentages are with respect to total time =        32621.340

 ROMS/TOMS - Output NetCDF summary for Grid 01:

 Analytical header files used:

     ROMS/Functionals/ana_btflux.h
     ROMS/Functionals/ana_cloud.h
     ROMS/Functionals/ana_grid.h
     ROMS/Functionals/ana_humid.h
     ROMS/Functionals/ana_initial.h
     ROMS/Functionals/ana_pair.h
     ROMS/Functionals/ana_rain.h
     ROMS/Functionals/ana_srflux.h
     ROMS/Functionals/ana_stflux.h
     ROMS/Functionals/ana_tair.h
     ROMS/Functionals/ana_winds.h

 ROMS/TOMS - Output error ............ exit_flag:   3

 ERROR: Abnormal termination: NetCDF OUTPUT.
 REASON: No such file or directory    



PS:

Read file <benchmark4_err_file> for stderr output of this job.

ce107
Posts: 10
Joined: Tue Jul 01, 2003 10:31 am
Location: MIT,EAPS

Re: ROMS not running on 1024 cores for 17532 iterations

#2 Post by ce107 » Sat Nov 13, 2010 3:18 am

Well, it looks as if you're having trouble creating your history file. Does it work fine with fewer processor cores? Are you using parallel NetCDF output?

In any case, your benchmark numbers so far show that you're spending (even for this number of timesteps) an inordinate amount of time on initialization: more than 59% of the total time goes to startup activities (apparently initialization plus processing of input data, 56.28% + 3.11% in the profile above), which tends to indicate serial bottlenecks in that part of ROMS.
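
The startup share can be rechecked against the elapsed-time profile in the log. The sketch below is only an illustration (it is not part of ROMS); the numbers are copied from the profile, and treating "startup" as initialization plus input processing is an assumption about what was meant.

```python
# Values copied from the "Nonlinear model elapsed time profile" in the log.
total_time = 32621.340        # "Total:" line, CPU seconds summed over all ranks
initialization = 18360.049    # "Initialization"
input_processing = 1015.436   # "Processing of input data"
data_reduction = 12382.207    # "Message Passage: data reduction"

# Assumption: "startup activities" = initialization + input processing.
init_share = 100.0 * initialization / total_time
startup_share = 100.0 * (initialization + input_processing) / total_time
reduction_share = 100.0 * data_reduction / total_time

print(f"initialization only : {init_share:.2f} %")
print(f"startup (assumed)   : {startup_share:.2f} %")
print(f"data reduction      : {reduction_share:.2f} %")
```

Note that the two profile totals (74.39 % and 82.39 %) add up to more than 100 %, so the timed regions presumably overlap; the shares are best read as indicators rather than an exact breakdown.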

prakrati
Posts: 24
Joined: Thu Oct 21, 2010 9:35 pm
Location: CRL

Re: ROMS not running on 1024 cores for 17532 iterations

#3 Post by prakrati » Sat Nov 13, 2010 3:40 am

Yes, it works fine with fewer processor cores. I am using netCDF 3.6.2, so I guess it doesn't support parallel output. What should I do to make it run successfully? Please help.

shchepet
Posts: 185
Joined: Fri Nov 14, 2003 4:57 pm

Re: ROMS not running on 1024 cores for 17532 iterations

#4 Post by shchepet » Sat Nov 13, 2010 4:35 am

Did you create a directory called "output" within the working/scratch
directory where you are running your job?
.....
STEP Day HH:MM:SS KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME

0 0 00:00:00 0.000000E+00 1.963001E+04 1.963001E+04 1.659940E+17
DEF_HIS - creating history file: ./output/ocean_his_0001.nc

NETCDF_CREATE - unable to create output NetCDF file:
./output/ocean_his_0001.nc
call from: def_his.F
.....
I guess that you are running this job in some sort of supercomputer
center environment where you have to create/submit a batch job script
which changes to the scratch directory, copies your executable, your
roms.in file, and your initial/forcing files, if any, then executes mpirun
or mpiexec roms. Because your roms.in file directs your code to place
the output history file into a directory called ./output, the directory MUST
exist before you run mpiexec. So add

Code: Select all

 mkdir output 

into your batch script just before you run mpirun/mpiexec and see
whether this corrects the problem.
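
If the job is launched from a Python wrapper rather than a plain shell script, the same pre-flight check could be sketched as follows. This is only an illustration: the helper name `ensure_output_dir` is hypothetical, and the history file path is the one printed by DEF_HIS in the log.

```python
from pathlib import Path


def ensure_output_dir(history_file: str) -> Path:
    """Create the parent directory of the history file if it is missing."""
    out_dir = Path(history_file).parent
    # parents=True / exist_ok=True behaves like `mkdir -p`: no error if it exists.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


if __name__ == "__main__":
    # Run from the job's working/scratch directory, before calling mpirun/mpiexec.
    ensure_output_dir("./output/ocean_his_0001.nc")
```

Calling it twice is harmless, so it can stay in the job script permanently.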

Report the outcome back to this board. You have several other issues
which need to be addressed, but let's do them one at a time.

prakrati
Posts: 24
Joined: Thu Oct 21, 2010 9:35 pm
Location: CRL

Re: ROMS not running on 1024 cores for 17532 iterations

#5 Post by prakrati » Sat Nov 13, 2010 4:47 am

Thanks for your quick response.
I had already created the output directory, and the job runs with fewer iterations on the same number of ranks. So I feel the problem is not with the output file. Please guide.

prakrati
Posts: 24
Joined: Thu Oct 21, 2010 9:35 pm
Location: CRL

Re: ROMS not running on 1024 cores for 17532 iterations

#6 Post by prakrati » Sat Nov 13, 2010 5:57 am

I did as you said and included mkdir output in the batch script, and now it gives me the following error with 1024 ranks for 17532 iterations. Please help.


Initial basin volumes: TotVolume = 1.65993987026488E+17 m3
MinVolume = 1.97754634567622E+08 m3
MaxVolume = 1.80931355725002E+10 m3
Max/Min = 9.14928523018421E+01

NL ROMS/TOMS: started time-stepping: (Grid: 01 TimeSteps: 00000001 - 00017532)

STEP Day HH:MM:SS KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME

0 0 00:00:00 0.000000E+00 1.963001E+04 1.963001E+04 1.659940E+17
DEF_HIS - creating history file: ./output/ocean_his_0001.nc
rank 447 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 447: killed by signal 9
rank 528 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 528: killed by signal 9
rank 451 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 451: killed by signal 9
rank 448 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 448: killed by signal 9
rank 266 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 266: killed by signal 9
rank 274 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 274: killed by signal 9
rank 290 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 290: killed by signal 9
rank 288 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 288: killed by signal 9
rank 195 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 195: killed by signal 9
rank 194 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 194: killed by signal 9
rank 192 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 192: killed by signal 9
rank 235 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 235: killed by signal 9
rank 232 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 232: killed by signal 9
rank 319 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 319: killed by signal 9
rank 616 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 616: killed by signal 9
rank 629 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 629: killed by signal 9
rank 628 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 628: killed by signal 9
rank 625 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 625: killed by signal 9
rank 624 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 624: killed by signal 9
rank 831 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 831: killed by signal 9
rank 824 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 824: killed by signal 9
rank 554 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 554: killed by signal 9
rank 552 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 552: killed by signal 9
rank 579 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 579: killed by signal 9
rank 576 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 576: killed by signal 9
rank 480 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 480: killed by signal 9
rank 415 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 415: killed by signal 9
rank 656 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 656: killed by signal 9
rank 671 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 671: killed by signal 9
rank 672 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 672: killed by signal 9
rank 767 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 767: killed by signal 9
rank 703 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 703: killed by signal 9
rank 696 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 696: killed by signal 9
rank 707 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 707: killed by signal 9
rank 706 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 706: killed by signal 9
rank 704 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 704: killed by signal 9
rank 512 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 512: killed by signal 9
rank 245 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 245: killed by signal 9
rank 244 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 244: killed by signal 9
rank 259 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 259: killed by signal 9
rank 258 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 258: killed by signal 9
rank 256 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 256: killed by signal 9
rank 127 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 127: killed by signal 9
rank 131 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 131: killed by signal 9
rank 130 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 130: killed by signal 9
rank 128 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 128: killed by signal 9
rank 78 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 78: killed by signal 9
rank 77 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 77: killed by signal 9
rank 76 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 76: killed by signal 9
rank 0 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 0: killed by signal 9


PS:

Read file <benchmark4_err_file> for stderr output of this job.

shchepet
Posts: 185
Joined: Fri Nov 14, 2003 4:57 pm

Re: ROMS not running on 1024 cores for 17532 iterations

#7 Post by shchepet » Sat Nov 13, 2010 7:20 am

This sounds more like an MPI problem, not directly a ROMS problem, although
it is triggered by ROMS pushing MPI to the limit.
............
0 00:00:00 0.000000E+00 1.963001E+04 1.963001E+04 1.659940E+17
DEF_HIS - creating history file: ./output/ocean_his_0001.nc
rank 447 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 447: killed by signal 9
rank 528 in job 1 n104_34487 caused collective abort of all ranks
.....
What is your NINFO setting in roms.in? Set it to 1.

How many time steps do you actually get through before the
breakdown occurs? Normally ROMS prints diagnostic output
(time step number, time, kinetic energy, etc.) every NINFO
steps, that is, every step if NINFO is set to 1. Do you get through
at least a few steps before attempting to write history?

Can you run the code without any output, that is, setting LDEFHIS=.false.
and NHIS larger than the total number of time steps requested?

You need to know all this in order to understand whether the problem
is related to I/O or to something else.

Once time stepping has started, ROMS sends only small MPI messages
exchanging data of ghost cells along the perimeter of subdomains
(these are called halos). In your case the subdomains are actually
small, something like 32x34x30 or so, and assuming that the halo
width is 2 points (typically), the associated messages are 34x30x2
double precision numbers, or just ~16 kBytes. They are sent between
nodes pairwise, so no node receives more than 4 messages.
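
As a rough check of these numbers, here is a minimal sketch (plain Python, not ROMS code). The grid size and tiling are taken from the "Resolution, Grid 01" line of the log; the halo width of 2 points and 8-byte reals are the assumptions stated above.

```python
# Grid and tiling from the log line
# "Resolution, Grid 01: 4192x0272x030, ... Tiling: 128x008".
Lm, Mm, N = 4192, 272, 30
ntile_i, ntile_j = 128, 8

halo_width = 2        # assumption: typical halo width
bytes_per_value = 8   # assumption: double precision

ranks = ntile_i * ntile_j    # number of MPI ranks
tile_i = Lm / ntile_i        # average tile width in i (not an integer here)
tile_j = Mm / ntile_j        # tile width in j

# One halo message along a tile edge of length tile_j, over all N levels.
halo_bytes = int(tile_j) * N * halo_width * bytes_per_value

print(f"ranks = {ranks}, tile ~ {tile_i:.2f} x {tile_j:.0f} x {N}")
print(f"halo message = {halo_bytes} bytes (~{halo_bytes / 1024:.1f} KiB)")
```

This gives 16320 bytes per halo message, consistent with the ~16 kBytes quoted.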

Once you go to I/O, the situation is very different. First, you are
sending 3D messages which are much bigger, like 32x34x30. Secondly,
you are sending messages to a single target: everybody sends to
node 0, which collects and assembles the data. This is more likely to
cause an MPI deadlock and abort due to running out of memory or receiving
too many messages at the same time (the symptom is that it works when
running on a few nodes but breaks down on many).
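
To put the I/O traffic in perspective, a back-of-the-envelope sketch (again plain Python, not ROMS code) follows. It assumes 8-byte reals and that node 0 assembles one complete 3D field at a time; the actual buffer sizes inside ROMS and the MPI library were not checked.

```python
Lm, Mm, N = 4192, 272, 30   # grid size from the "Resolution, Grid 01" log line
bytes_per_value = 8          # assumption: double precision

# Approximate tile size (4192/128 is not an integer, so this is rounded down).
tile_i, tile_j = 32, 34
ranks = 128 * 8

tile_bytes = tile_i * tile_j * N * bytes_per_value   # one rank's share of a 3D field
field_bytes = Lm * Mm * N * bytes_per_value          # whole 3D field assembled on node 0
halo_bytes = tile_j * N * 2 * bytes_per_value        # halo message, for comparison

print(f"per-rank 3D message: {tile_bytes / 1024:.0f} KiB")
print(f"full 3D field on node 0: {field_bytes / 2**20:.0f} MiB from {ranks} senders")
print(f"3D message / halo message: {tile_bytes / halo_bytes:.0f}x")
```

Under these assumptions each rank sends about 255 KiB per 3D field (16 times the halo message), and node 0 has to hold roughly 261 MiB per field while receiving from 1024 senders.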

If this is the cause (needs to be verified first), MPI distributions
typically have tunable parameters, such as buffer sizes, which are controlled
by environment variables. Their names are not standardized, so you have
to look at the man pages of the specific version of MPI you are using.

Also: I have seen difficulties with just starting MPI
with many nodes: once the job has started, it runs forever, but it may need
several attempts to start. The reason? mpirun is actually a script which
creates jobs sequentially (MPI 1), or via some kind of tree algorithm
(faster and more reliable), but in any case some MPI processes are started
earlier and must wait for messages to arrive, while others are not started
yet. The waiting time is limited (again by a default which can be changed by
an environment variable), so if messages are not received within the time
allowed, the job aborts with error messages similar to what you have.
More nodes means a longer startup time for creating MPI processes, hence
a greater likelihood of a timeout. The solution is to adjust the timeout variable.
(Again, read the man pages for the specific version of MPI.)

prakrati
Posts: 24
Joined: Thu Oct 21, 2010 9:35 pm
Location: CRL

Re: ROMS not running on 1024 cores for 17532 iterations

#8 Post by prakrati » Sat Nov 13, 2010 8:11 am

Thanks for your quick response. I will try what you suggested and let you know what happens.
Thanks again!

prakrati
Posts: 24
Joined: Thu Oct 21, 2010 9:35 pm
Location: CRL

Re: ROMS not running on 1024 cores for 17532 iterations

#9 Post by prakrati » Sat Nov 13, 2010 10:47 am

Hey,

I ran with MPIEXEC_TIMEOUT=3600 and NHIS > 17532, but I still get the same error. I will try changing the buffer size and let you know what happens. Until then, please think about what I can do to make it work, as it is very crucial.

Thanks a lot !!!!

prakrati
Posts: 24
Joined: Thu Oct 21, 2010 9:35 pm
Location: CRL

Re: ROMS not running on 1024 cores for 17532 iterations

#10 Post by prakrati » Mon Nov 15, 2010 11:58 am

Thanks, it ran with Open MPI!!!! :)

shchepet
Posts: 185
Joined: Fri Nov 14, 2003 4:57 pm

Re: ROMS not running on 1024 cores for 17532 iterations

#11 Post by shchepet » Mon Nov 15, 2010 4:58 pm

So what was the problem?

Was it just because of changing the brand of MPI?

If so, what kind of MPI were you using before?

Any specific settings for tunable parameters?

prakrati
Posts: 24
Joined: Thu Oct 21, 2010 9:35 pm
Location: CRL

Re: ROMS not running on 1024 cores for 17532 iterations

#12 Post by prakrati » Tue Nov 16, 2010 3:41 am

I was using Intel MPI earlier. I don't know why it wasn't working with it. I couldn't find any tunable parameter for buffer size for Intel MPI.

tony1230
Posts: 87
Joined: Wed Mar 31, 2010 3:29 pm
Location: SKLEC,ECNU,Shanghai,China

Re: ROMS not running on 1024 cores for 17532 iterations

#13 Post by tony1230 » Sat Sep 14, 2013 3:00 am

...
0 00:00:00 0.000000E+00 1.963001E+04 1.963001E+04 1.659940E+17
DEF_HIS - creating history file: ./output/ocean_his_0001.nc
rank 447 in job 1 n104_34487 caused collective abort of all ranks
exit status of rank 447: killed by signal 9
rank 528 in job 1 n104_34487 caused collective abort of all ranks
I was using Intel MPI earlier. I don't know why it wasn't working with it. I couldn't find any tunable parameter for buffer size for Intel MPI.
Hey prakrati, could you please explain this more clearly? I want to know what exactly triggered the problem and how you solved it.

thank you

- shou
