# R4DVAR Tutorial

## Introduction

During this exercise you will apply the dual form of strong/weak constraint 4-Dimensional Variational (**4D-Var**) data assimilation, based on the indirect representer algorithm (R4D-Var), to ROMS configured for the U.S. west coast and the California Current System (CCS). This configuration, referred to as WC13, has 30 km horizontal resolution and 30 levels in the vertical. While 30 km resolution is inadequate for capturing much of the energetic mesoscale circulation associated with the CCS, WC13 captures the broad-scale features of the circulation quite well and serves as a useful and efficient illustrative example of R4D-Var.

## Model Set-up

The WC13 model domain is shown in Fig. 1 and has open boundaries along the northern, western, and southern edges of the model domain.

In the tutorial, you will perform a 4D-Var data assimilation cycle that spans the period 3-6 January, 2004. The 4D-Var control vector *δ***z** is comprised of increments to the initial conditions, *δ***x**(*t*_{0}), surface forcing, *δ***f**(*t*), and open boundary conditions, *δ***b**(*t*). The *prior* initial conditions, **x**_{b}(*t*_{0}), are taken from the sequence of 4D-Var experiments described by Moore *et al.* (2011b) in which data were assimilated every 7 days during the period July 2002 - December 2004. The *prior* surface forcing, **f**_{b}(*t*), takes the form of surface wind stress, heat flux, and a freshwater flux computed using the ROMS bulk flux formulation, and using near surface air data from COAMPS (Doyle *et al.*, 2009). Clamped open boundary conditions are imposed on (*u*, *v*) and tracers, and the *prior* boundary conditions, **b**_{b}(*t*), are taken from the global ECCO product (Wunsch and Heimbach, 2007). The free-surface height and vertically integrated velocity components are subject to the usual Chapman and Flather radiation conditions at the open boundaries. The *prior* surface forcing and open boundary conditions are provided daily and linearly interpolated in time. Similarly, the increments *δ***f**(*t*) and *δ***b**(*t*) are also computed daily and linearly interpolated in time.
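The linear interpolation in time of daily records can be sketched as follows. This is a minimal illustration only, not ROMS code: the function name `interp_daily` and the sample values are hypothetical, and a single scalar stands in for a full forcing or boundary field.

```python
def interp_daily(values, t_days):
    """Linearly interpolate daily snapshots to time t_days.

    values : list of snapshots, one per day (needs at least two records)
    t_days : time in days since the first record
    """
    if len(values) < 2:
        raise ValueError("need at least two daily records")
    if not 0.0 <= t_days <= len(values) - 1:
        raise ValueError("time outside the range covered by the daily records")
    # Index of the record at or before t_days (clamped so i + 1 is valid).
    i = min(int(t_days), len(values) - 2)
    w = t_days - i  # weight of the later record, between 0 and 1
    return (1.0 - w) * values[i] + w * values[i + 1]


# Hypothetical daily values for a 4-day window (e.g. 3-6 January).
daily = [0.0, 2.0, 4.0, 1.0]
midday_second_interval = interp_daily(daily, 1.5)
```

The same weights would apply to the daily increments *δ***f**(*t*) and *δ***b**(*t*), since they are interpolated in the same way as the prior fields.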

The observations assimilated into the model are satellite SST, satellite SSH in the form of a gridded product from Aviso, and hydrographic observations of temperature and salinity collected from Argo floats and during the GLOBEC/LTOP and CalCOFI cruises off the coast of Oregon and southern California, respectively. The observation locations are illustrated in Fig. 2.

## Running R4D-Var

To run this exercise, go first to the directory WC13/R4DVAR. Instructions for compiling and running the model are provided below or can be found in the Readme file. The recommended configuration for this exercise is one outer-loop and 50 inner-loops, and roms_wc13.in is configured for this default case. The number of inner-loops is controlled by the parameter Ninner in roms_wc13.in.

## Important CPP Options

The following C-preprocessing options are activated in the build script:

POSTERIOR_EOFS Estimate posterior analysis error

POSTERIOR_ERROR_I Estimate initial posterior analysis error

WC13 Application CPP option

## Input NetCDF Files

WC13 requires the following input NetCDF files:

Nonlinear Initial File: wc13_ini.nc

Forcing File 01: ../Data/coamps_wc13_lwrad_down.nc

Forcing File 02: ../Data/coamps_wc13_Pair.nc

Forcing File 03: ../Data/coamps_wc13_Qair.nc

Forcing File 04: ../Data/coamps_wc13_rain.nc

Forcing File 05: ../Data/coamps_wc13_swrad.nc

Forcing File 06: ../Data/coamps_wc13_Tair.nc

Forcing File 07: ../Data/coamps_wc13_wind.nc

Boundary File: ../Data/wc13_ecco_bry.nc

Initial Conditions STD File: ../Data/wc13_std_i.nc

Model STD File: ../Data/wc13_std_m.nc

Boundary Conditions STD File: ../Data/wc13_std_b.nc

Surface Forcing STD File: ../Data/wc13_std_f.nc

Initial Conditions Norm File: ../Data/wc13_nrm_i.nc

Model Norm File: ../Data/wc13_nrm_m.nc

Boundary Conditions Norm File: ../Data/wc13_nrm_b.nc

Surface Forcing Norm File: ../Data/wc13_nrm_f.nc

Observations File: wc13_obs.nc

## Various Scripts and Include Files

The following files will be found in the WC13/R4DVAR directory after downloading from the ROMS test cases SVN repository:

build.bash bash shell script to compile application

build.sh csh Unix script to compile application

job_r4dvar.sh job configuration script

roms_wc13.in ROMS standard input script for WC13

s4dvar.in 4D-Var standard input script template

wc13.h WC13 header with CPP options

## Instructions

To run this application you need to take the following steps:

- We need to run the model application for a period that is long enough to compute meaningful circulation statistics, like mean and standard deviations for all prognostic state variables (zeta, u, v, T, and S). The standard deviations are written to NetCDF files and are read by the 4D-Var algorithm to convert modeled error correlations to error covariances. The error covariance matrix, **D** = diag(**B**_{x}, **B**_{b}, **B**_{f}, **Q**), is very large and not well known. **B** is modeled as the solution of a diffusion equation as in Weaver and Courtier (2001). Each covariance matrix is factorized as **B** = **K Σ C Σ**^{T} **K**^{T}, where **C** is a univariate correlation matrix, **Σ** is a diagonal matrix of error standard deviations, and **K** is a multivariate balance operator.

  Since the balance operator is activated (BALANCE_OPERATOR and ZETA_ELLIPTIC), the standard deviations refer to the unbalanced error covariance **B**_{u} in **K B**_{u} **K**^{T} (Weaver *et al.*, 2005). The standard deviations are provided in the following files:

  ../Data/wc13_std_i.nc initial conditions

  ../Data/wc13_std_m.nc model error (if weak constraint)

  ../Data/wc13_std_b.nc open boundary conditions

  ../Data/wc13_std_f.nc surface forcing (wind stress and net heat flux)
- Since we are modeling the error covariance matrix, **D**, we need to compute the normalization coefficients to ensure that the diagonal elements of the associated correlation matrix **C** are equal to unity. There are two methods to compute normalization coefficients: exact and randomization (an approximation). The corresponding parameters are set as follows:

  Nmethod == 0 ! normalization method

  Nrandom == 5000 ! randomization iterations

  LdefNRM == F F F F ! Create a new normalization files

  LwrtNRM == F F F F ! Compute and write normalization

  CnormI(isFsur) = T ! 2D variable at RHO-points

  CnormI(isUbar) = T ! 2D variable at U-points

  CnormI(isVbar) = T ! 2D variable at V-points

  CnormI(isUvel) = T ! 3D variable at U-points

  CnormI(isVvel) = T ! 3D variable at V-points

  CnormI(isTvar) = T T ! NT tracers

  CnormB(isFsur) = T ! 2D variable at RHO-points

  CnormB(isUbar) = T ! 2D variable at U-points

  CnormB(isVbar) = T ! 2D variable at V-points

  CnormB(isUvel) = T ! 3D variable at U-points

  CnormB(isVvel) = T ! 3D variable at V-points

  CnormB(isTvar) = T T ! NT tracers

  CnormF(isUstr) = T ! surface U-momentum stress

  CnormF(isVstr) = T ! surface V-momentum stress

  CnormF(isTsur) = T T ! NT surface tracers flux

  These normalization coefficients have already been computed for you (**../Normalization**) using the exact method since this application has a small grid (54x53x30):

  ../Data/wc13_nrm_i.nc initial conditions

  ../Data/wc13_nrm_m.nc model error (if weak constraint)

  ../Data/wc13_nrm_b.nc open boundary conditions

  ../Data/wc13_nrm_f.nc surface forcing (wind stress and net heat flux)

  Notice that the switches LdefNRM and LwrtNRM are all **false** (F) since we already computed these coefficients.
- Customize your preferred build script and provide the appropriate values for:
  - Root directory, MY_ROOT_DIR
  - ROMS source code, MY_ROMS_SRC
  - Fortran compiler, FORT
  - MPI flags, USE_MPI and USE_MPIF90
  - Path of MPI, NetCDF, and ARPACK libraries according to the compiler. Notice that you need to provide the correct locations of these libraries for your computer. If you want to ignore this section, comment out the assignment for the variable USE_MY_LIBS.

- Notice that the most important CPP options for this application are specified in the build script instead of wc13.h:

  setenv MY_CPP_FLAGS "-DR4DVAR"

  setenv MY_CPP_FLAGS "${MY_CPP_FLAGS} -DPOSTERIOR_EOFS"

  setenv MY_CPP_FLAGS "${MY_CPP_FLAGS} -DPOSTERIOR_ERROR_I"

  This is to allow flexibility with different CPP options. Any **#undef** directives MUST be avoided in the header file wc13.h since the header file has precedence during C-preprocessing.
- You MUST use the build script to compile.
- Customize the ROMS input script roms_wc13.in and specify the appropriate values for the distributed-memory partition (NtileI and NtileJ). A default partition is already set in the file. Notice that the adjoint-based algorithms can only be run in parallel using MPI. This is because of the way that the adjoint model is constructed.
- Customize the configuration script job_r4dvar.sh and provide the appropriate path for the substitute Perl script:

  set SUBSTITUTE=${ROMS_ROOT}/ROMS/Bin/substitute

  This script is distributed with ROMS and it is found in the ROMS/Bin sub-directory. Alternatively, you can define the ROMS_ROOT environmental variable in your .cshrc login script. For example, I have:

  setenv ROMS_ROOT /home/arango/ocean/toms/repository/trunk
- Execute the configuration script job_r4dvar.sh **before** running the model. It copies the required files and creates the r4dvar.in input script from the template **s4dvar.in**. This has to be done **every time** that you run this application. We need a clean and fresh copy of the initial conditions and observation files since they are modified by ROMS during execution.
- Run ROMS with data assimilation:

  mpirun -np 4 oceanM roms_wc13.in > & log &
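The normalization step in the instructions above can be illustrated with a toy example. The sketch below is not the ROMS implementation: it assumes a 1D grid, a symmetric explicit diffusion operator with no-flux boundaries, and no scaling by grid-cell area or volume, and all function names are hypothetical. It computes the coefficients with an exact method (one delta function per grid point) and with a randomization method (an average over random vectors), then checks that the normalized operator has unit diagonal.

```python
import math
import random


def diffuse(x, steps, kappa=0.25):
    """Apply `steps` explicit diffusion steps with no-flux boundaries.

    The underlying matrix is symmetric, so it equals its own adjoint.
    """
    x = list(x)
    n = len(x)
    for _ in range(steps):
        y = x[:]
        for i in range(n):
            left = x[i - 1] if i > 0 else x[i]
            right = x[i + 1] if i < n - 1 else x[i]
            y[i] = x[i] + kappa * (left - 2.0 * x[i] + right)
        x = y
    return x


def exact_norm(n, steps):
    """Exact method: diffuse a delta function at every grid point."""
    coeffs = []
    for i in range(n):
        delta = [0.0] * n
        delta[i] = 1.0
        col = diffuse(delta, steps)
        variance = sum(c * c for c in col)  # diagonal element of L L^T
        coeffs.append(1.0 / math.sqrt(variance))
    return coeffs


def randomized_norm(n, steps, nrandom, seed=0):
    """Randomization method: estimate the diagonal from random vectors."""
    rng = random.Random(seed)
    acc = [0.0] * n
    for _ in range(nrandom):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        w = diffuse(v, steps)
        for i in range(n):
            acc[i] += w[i] * w[i]
    return [1.0 / math.sqrt(a / nrandom) for a in acc]


def correlation_diagonal(n, steps, coeffs):
    """Diagonal of the normalized operator (should be close to 1)."""
    diag = []
    for i in range(n):
        delta = [0.0] * n
        delta[i] = coeffs[i]
        col = diffuse(delta, steps)
        diag.append(sum(c * c for c in col))
    return diag


exact = exact_norm(12, 4)
approx = randomized_norm(12, 4, 5000)
diag = correlation_diagonal(12, 4, exact)
```

The exact method needs one operator application per grid point, which is affordable on a small grid such as WC13, while the randomization method trades accuracy for a fixed number of applications (the role played by Nrandom).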

## Plotting your Results

Several Matlab scripts are provided in the directory WC13/plotting which will allow you to plot some of the R4D-Var output.

Recall that R4D-Var minimizes the cost function given by:
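A standard incremental form of this cost function, consistent with the control vector *δ***z** and covariance **D** defined earlier, is sketched here. The symbols **G** (tangent linear model sampled at the observation points), **d** (innovation vector), **R** (observation error covariance), and N_obs (number of observations) are assumptions, since they are not defined elsewhere in this tutorial.

```latex
J(\delta\mathbf{z})
  = \underbrace{\tfrac{1}{2}\,\delta\mathbf{z}^{T}\mathbf{D}^{-1}\delta\mathbf{z}}_{J_b}
  + \underbrace{\tfrac{1}{2}\,(\mathbf{G}\,\delta\mathbf{z}-\mathbf{d})^{T}
      \mathbf{R}^{-1}(\mathbf{G}\,\delta\mathbf{z}-\mathbf{d})}_{J_o},
\qquad
J_{min} \approx \tfrac{1}{2}\,N_{obs}
```

The estimate of J_min holds in expectation when the prior error covariances **D** and **R** are correctly specified.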

Plot first the R4D-Var cost function and its components, and the theoretical minimum value, using the Matlab script plot_r4dvar_cost.m.

Next, plot the surface initial conditions increments and the surface forcing increments at the initial time using the Matlab script plot_r4dvar_increments.m, or use the ROMS plotting package scripts ccnt_r4dvar_increments.in for horizontal plots at 100 m and csec_r4dvar_increments.in for cross-sections along 37°N.

## Results

The R4D-Var cost function value for each inner loop iteration is shown below:

The total cost function **J** (black curve), observation cost function **J**_{o} (blue curve), and background cost function **J**_{b} (red curve) are plotted on a log_{10} scale. The value of the nonlinear cost function **J**_{NL} at the end of the inner loops is also shown (red X). The horizontal, dashed line shows the theoretical **J**_{min}.

The R4D-Var initial conditions increments for free-surface (m), surface wind stress components (Pa), and surface net heat flux (W/m^{2}) are shown below:

The R4D-Var initial conditions increments at 100m for temperature (°C), salinity, and momentum components (m/s) are shown below:

A cross-section along 37°N for the R4D-Var initial conditions increments is shown below.

R4D-Var posterior analysis error covariance EOF eigenvalues and randomization trace estimates.

The R4D-Var posterior analysis error covariance (EOF 1) for free-surface (m), surface wind stress components (Pa), and surface net heat flux (W/m^{2}) are shown below:

The R4D-Var posterior analysis error covariance (EOF 1) at 100m for temperature (°C), salinity, and momentum components (m/s) are shown below:

A cross-section along 37°N for the R4D-Var posterior analysis error covariance (EOF 1) is shown below.