Analysis-Forecast Cycle Observation Impacts: Difference between revisions

Latest revision as of 17:24, 22 July 2020

Analysis-Forecast Cycle Observation Impacts

The procedure for computing the observation impacts during the forecast cycle is a little more involved than that for the analysis cycle. However, a separate ROMS driver exists for this. To help illustrate the procedure involved, consider the typical analysis-forecast cycle shown schematically below.

**Figure:** A schematic showing the typical configuration of an analysis-forecast cycle. Analysis cycle $j$ spans the interval $[t_{0}^{j},t_{0}^{j}+\tau ]$ , and is associated with 4D-Var analysis ${\mathbf {x} }_{a}^{j}$ and the forecast ${\mathbf {x} }_{f}^{j}$ . At the start of each cycle there are three available circulation estimates: two forecasts initialized from the previous two adjacent analysis cycles, and the analysis for the current time. These are illustrated at time $t_{0}^{j+2}+\tau$ at the start of cycle $j+3$ .

In the figure above, each analysis cycle is assumed to be of length $\tau$ and analysis cycle $j$ spans the interval $[t_{0}^{j},t_{0}^{j}+\tau ]$ . The circulation estimate at time $t_{0}^{j}+\tau$ (i.e. the end of analysis cycle $j$ ) is denoted as ${\mathbf {x} }_{a}^{j}$ and is the initial condition for the forecast spanning the next analysis interval $[t_{0}^{j+1},t_{0}^{j+1}+\tau ]$ . In the figure it is assumed, for convenience only, that the forecast duration is an integer multiple of $\tau$ , but this does not have to be the case, and the code is set up to handle analysis and forecast cycles that are different lengths. The figure shows the analyses and forecasts that result from three adjacent analysis cycles, namely cycles $j$ , $j+1$ and cycle $j+2$ . The analysis ${\mathbf {x} }_{a}^{j}$ at the end of cycle $j$ is used as the initial condition for the forecast ${\mathbf {x} }_{f}^{j}$ of duration $2\tau$ that terminates at time $t_{0}^{j+2}+\tau$ , the end of analysis cycle $j+2$ . Similarly, the analysis ${\mathbf {x} }_{a}^{j+1}$ at the end of cycle $j+1$ is used as the initial condition for the forecast ${\mathbf {x} }_{f}^{j+1}$ of duration $\tau$ and also terminates at time $t_{0}^{j+2}+\tau$ , the end of analysis cycle $j+2$ . After sufficient time has elapsed, a new analysis ${\mathbf {x} }_{a}^{j+2}$ will be computed at this time. Since ${\mathbf {x} }_{a}^{j+2}$ represents our best estimate of the ocean circulation at time $t_{0}^{j+2}+\tau$ it can be used to quantify the veracity of the forecasts ${\mathbf {x} }_{f}^{j}$ and ${\mathbf {x} }_{f}^{j+1}$ . For this reason, ${\mathbf {x} }_{a}^{j+2}$ is usually referred to as the “verifying analysis.” However, as discussed shortly, other sources of information can be used to verify the forecasts, such as new or independent observations.
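The timing described above can be sketched in a few lines. This is only an illustration: it assumes back-to-back analysis cycles (so that $t_{0}^{j+1}=t_{0}^{j}+\tau$), and the function names and the value of $\tau$ are hypothetical, not part of ROMS.

```python
def analysis_interval(t0_first, tau, j):
    """Return (start, end) of analysis cycle j, assuming back-to-back cycles of length tau."""
    start = t0_first + j * tau
    return (start, start + tau)


def forecast_window(t0_first, tau, j, n_cycles):
    """Forecast x_f^j starts from the analysis at the END of cycle j
    and runs for n_cycles * tau (an integer multiple, as in the figure)."""
    _, end_of_cycle = analysis_interval(t0_first, tau, j)
    return (end_of_cycle, end_of_cycle + n_cycles * tau)


tau = 4.0  # hypothetical cycle length in days
j = 0

fcst_j = forecast_window(0.0, tau, j, 2)       # x_f^j, duration 2*tau
fcst_j1 = forecast_window(0.0, tau, j + 1, 1)  # x_f^{j+1}, duration tau
verify_time = analysis_interval(0.0, tau, j + 2)[1]  # end of cycle j+2

print(fcst_j, fcst_j1, verify_time)
```

Both forecasts end at the same time as analysis cycle $j+2$, which is why the analysis computed there can verify them.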

It should be clear from the figure that the forecast ${\mathbf {x} }_{f}^{j+1}$ benefits from the observations assimilated into the model during analysis cycle $j+1$ (i.e. during the interval $[t_{0}^{j+1},t_{0}^{j+1}+\tau ]$). Therefore, provided that ${\mathbf {x} }_{f}^{j}$ and ${\mathbf {x} }_{f}^{j+1}$ are subject to identical surface forcing and open boundary conditions during the interval $[t_{0}^{j+2},t_{0}^{j+2}+\tau ]$, any differences in forecast error must be associated with the observations assimilated into the model during the interval $[t_{0}^{j+1},t_{0}^{j+1}+\tau ]$.

Forecast Error Metrics

As in the case of the analysis cycle observation impacts described above, the impact of the observations during the forecast cycle is computed for a specific metric, in this case a metric of the forecast error. The methodology will be described first for a standard generic quadratic forecast error metric given by:

$$e=({\mathbf {x} }_{f}-{\mathbf {x} }_{t})^{T}{\mathbf {C} }({\mathbf {x} }_{f}-{\mathbf {x} }_{t})\qquad (1)$$

where ${\mathbf {x} }_{f}$ denotes the forecast state-vector, ${\mathbf {x} }_{t}$ denotes the true state-vector, and ${\mathbf {C} }$ is a weight matrix. For example, if ${\mathbf {C} }$ is a diagonal matrix with elements equal to 1 corresponding to all surface temperature grid points, and zero elsewhere, then $e$ would represent the sum of the squared errors in SST. Forecast error metrics of the form (1) are very common in numerical weather prediction and oceanography, so (1) is a good starting point.
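As a minimal sketch of metric (1) for a diagonal ${\mathbf {C} }$, the example below uses a hypothetical four-element state vector in which the first two entries are SST grid points. The state layout, values and function name are assumptions made purely for illustration.

```python
def quadratic_error(x_f, x_ref, c_diag):
    """e = (x_f - x_ref)^T C (x_f - x_ref), with C diagonal and given by its diagonal."""
    return sum(c * (f - r) ** 2 for f, r, c in zip(x_f, x_ref, c_diag))


# Hypothetical state: two SST points followed by two other variables.
x_f = [15.0, 16.5, 0.25, -0.5]  # forecast
x_t = [14.5, 16.0, 0.0, 0.5]    # "true" state (never known in practice)
c_sst = [1.0, 1.0, 0.0, 0.0]    # 1 at SST points, 0 elsewhere

e = quadratic_error(x_f, x_t, c_sst)
print(e)  # sum of squared SST errors only
```

Errors in the non-SST entries do not contribute because their weights are zero.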

In the figure there are two forecasts of interest: ${\mathbf {x} }_{f}^{j}$ initialized at the end of analysis cycle $j$ at time $t_{0}^{j}+\tau$, and ${\mathbf {x} }_{f}^{j+1}$ initialized at the end of analysis cycle $j+1$ at time $t_{0}^{j+1}+\tau$. At time $t_{0}^{j+2}+\tau$ the error in forecast ${\mathbf {x} }_{f}^{j}$ is given by $e_{b}=({\mathbf {x} }_{f}^{j}-{\mathbf {x} }_{t})^{T}{\mathbf {C} }({\mathbf {x} }_{f}^{j}-{\mathbf {x} }_{t})$, while the error in ${\mathbf {x} }_{f}^{j+1}$ is given by $e_{a}=({\mathbf {x} }_{f}^{j+1}-{\mathbf {x} }_{t})^{T}{\mathbf {C} }({\mathbf {x} }_{f}^{j+1}-{\mathbf {x} }_{t})$. As noted above, if ${\mathbf {x} }_{f}^{j}$ and ${\mathbf {x} }_{f}^{j+1}$ are subject to identical surface forcing and open boundary conditions during the interval $[t_{0}^{j+2},t_{0}^{j+2}+\tau ]$, then the difference in forecast error $\delta e=e_{a}-e_{b}$ is due solely to the difference in the forecast initial conditions arising from the observations assimilated during analysis cycle $j+1$ spanning the interval $[t_{0}^{j+1},t_{0}^{j+1}+\tau ]$. (The more realistic case where the two forecasts are subject to different surface forcing fields is addressed below in the step-by-step procedure notes.) Specifically, if $\delta e<0$ the observations assimilated during cycle $j+1$ lead to an improvement in the forecast skill (i.e. $e_{a}<e_{b}$), while if $\delta e>0$ the observations assimilated during cycle $j+1$ have degraded the forecast (i.e. $e_{a}>e_{b}$). While this convention may seem counter-intuitive, it is the convention used in the numerical weather prediction literature, so it seems prudent to adopt it here.
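The sign convention can be illustrated with a small sketch. The two-element states and the helper names below are hypothetical. The only point is that a negative $\delta e=e_{a}-e_{b}$ means the newer forecast is closer to the reference state.

```python
def quadratic_error(x_f, x_ref, c_diag):
    """e = (x_f - x_ref)^T C (x_f - x_ref) for a diagonal C."""
    return sum(c * (f - r) ** 2 for f, r, c in zip(x_f, x_ref, c_diag))


def interpret(delta_e):
    """Translate the sign of delta_e = e_a - e_b into words."""
    if delta_e < 0:
        return "improved"
    if delta_e > 0:
        return "degraded"
    return "neutral"


x_ref = [10.0, 20.0]    # reference (true) state, hypothetical
x_f_old = [11.0, 22.0]  # x_f^j: forecast without the cycle j+1 observations
x_f_new = [10.5, 21.0]  # x_f^{j+1}: forecast that benefits from them
c_diag = [1.0, 1.0]

e_b = quadratic_error(x_f_old, x_ref, c_diag)
e_a = quadratic_error(x_f_new, x_ref, c_diag)
delta_e = e_a - e_b

print(delta_e, interpret(delta_e))
```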

Using a verifying analysis as a surrogate for the true ocean circulation

In practice, the true state ${\mathbf {x} }_{t}$ will never be known, so the forecast error (1) is usually computed relative to the verifying analysis ${\mathbf {x} }_{a}^{j+2}$ , in which case:

$$e=({\mathbf {x} }_{f}-{\mathbf {x} }_{a})^{T}{\mathbf {C} }({\mathbf {x} }_{f}-{\mathbf {x} }_{a})\qquad (2)$$

where ${\mathbf {x} }_{a}$ denotes the verifying analysis at the appropriate forecast time.

The 3rd-order approximation for $\delta e$ is given by:

$$\delta e_{3}={\mathbf {d} }^{T}{\mathbf {K} }^{T}{\mathbf {M} }_{b}^{T}[{\mathbf {M} }_{j}^{T}{\mathbf {C} }({\mathbf {x} }_{f}^{j}-{\mathbf {x} }_{a}^{j+2})+{\mathbf {M} }_{j+1}^{T}{\mathbf {C} }({\mathbf {x} }_{f}^{j+1}-{\mathbf {x} }_{a}^{j+2})]\qquad (3)$$

where ${\mathbf {M} }_{j}^{T}$ represents the adjoint model run backwards over the forecast interval $[t_{0}^{j+2},t_{0}^{j+2}+\tau ]$ and linearized about the forecast solution ${\mathbf {x} }_{f}^{j}$; ${\mathbf {M} }_{b}^{T}$ is the adjoint model run backwards over the 4D-Var analysis interval $[t_{0}^{j+1},t_{0}^{j+1}+\tau ]$ and linearized about the 4D-Var background ${\mathbf {x} }_{b}$; and ${\mathbf {M} }_{j+1}^{T}$ denotes the adjoint model linearized about the forecast solution ${\mathbf {x} }_{f}^{j+1}$. Thus, the 3rd-order impact given by (3) requires two integrations of the adjoint model over the forecast interval: one forced by ${\mathbf {C} }({\mathbf {x} }_{f}^{j}-{\mathbf {x} }_{a}^{j+2})$ and another forced by ${\mathbf {C} }({\mathbf {x} }_{f}^{j+1}-{\mathbf {x} }_{a}^{j+2})$, each linearized about a different forecast solution.
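To see why (3) has this structure, it can help to check it on a toy problem. The sketch below assumes a purely linear model, so that ${\mathbf {M} }_{j}={\mathbf {M} }_{j+1}$ and the expression reproduces $e_{a}-e_{b}$ exactly for a symmetric ${\mathbf {C} }$. In a nonlinear model (3) is only an approximation. All matrices, vectors and helper names are hypothetical and are not taken from ROMS.

```python
def matvec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def add(u, v):
    return [u_i + v_i for u_i, v_i in zip(u, v)]


def sub(u, v):
    return [u_i - v_i for u_i, v_i in zip(u, v)]


# Hypothetical 2-variable linear system.
M_b = [[1.0, 0.5], [0.0, 1.0]]   # model over the analysis window
M = [[0.9, 0.1], [-0.2, 1.1]]    # model over the forecast interval
K = [[0.5, 0.0], [0.1, 0.3]]     # gain matrix: innovations -> increment
C = [[1.0, 0.0], [0.0, 0.0]]     # symmetric weight matrix
d = [1.0, -2.0]                  # innovation vector
x_b0 = [1.0, 2.0]                # background at start of analysis window
x_verif = [0.5, 1.5]             # verifying analysis

# Forecasts started from the background and from the analysis.
x_a0 = add(x_b0, matvec(K, d))
x_f_old = matvec(M, matvec(M_b, x_b0))  # plays the role of x_f^j
x_f_new = matvec(M, matvec(M_b, x_a0))  # plays the role of x_f^{j+1}

# Direct evaluation of delta_e = e_a - e_b.
g_b = sub(x_f_old, x_verif)
g_a = sub(x_f_new, x_verif)
delta_e_direct = dot(g_a, matvec(C, g_a)) - dot(g_b, matvec(C, g_b))

# Adjoint evaluation following the structure of equation (3):
# two adjoint "integrations" over the forecast interval, then M_b^T, then K^T.
M_T = transpose(M)
forcing = add(matvec(M_T, matvec(C, g_b)), matvec(M_T, matvec(C, g_a)))
sens = matvec(transpose(K), matvec(transpose(M_b), forcing))
delta_e_adjoint = dot(d, sens)

print(delta_e_direct, delta_e_adjoint)
```

The entries of `sens` multiply the individual innovations in `d`, so each term `d[i] * sens[i]` can be read as the contribution of observation `i` to the change in forecast error.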

Using the independent observations as a surrogate for the true ocean circulation

As discussed above, it is typical in operational numerical weather prediction to use the verifying analysis at the forecast time as a surrogate for the true state of the system, as in equation (2). These days operational weather prediction models generally yield very high-quality analyses, so the assumption that ${\mathbf {x} }_{a}$ is a reasonable approximation for ${\mathbf {x} }_{t}$ is probably reasonable. In oceanography, however, this is a more questionable assumption, so when possible, it may be more prudent to verify a forecast against independent observations, or observations that have not yet been assimilated into the model. In this case, equation (2) would be reformulated as:

$$e=({\mathbf {y} }_{f}-{\mathbf {y} })^{T}{\mathbf {C} }({\mathbf {y} }_{f}-{\mathbf {y} })\qquad (4)$$

where ${\mathbf {y} }_{f}$ is the model forecast of the vector of verifying observations ${\mathbf {y} }$. In this case, the 3rd-order approximation for $\delta e$ becomes:

$$\delta e_{3}={\mathbf {d} }^{T}{\mathbf {K} }^{T}{\mathbf {M} }_{b}^{T}[{\mathbf {G} }_{j}^{T}{\mathbf {C} }({\mathbf {y} }_{f}^{j}-{\mathbf {y} }^{j+2})+{\mathbf {G} }_{j+1}^{T}{\mathbf {C} }({\mathbf {y} }_{f}^{j+1}-{\mathbf {y} }^{j+2})]\qquad (5)$$

where ${\mathbf {G} }_{j}^{T}$ and ${\mathbf {G} }_{j+1}^{T}$ denote the adjoint model forced at the observation points and linearized about ${\mathbf {x} }_{f}^{j}$ and ${\mathbf {x} }_{f}^{j+1}$ respectively.
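A minimal sketch of the observation-space metric (4) follows. It assumes a toy observation operator that simply picks state entries at observed indices. A real operator would interpolate the model to observation locations and times. The names and values are hypothetical.

```python
def sample_at_obs(x, obs_index):
    """Toy observation operator: pick model values at the observed state indices."""
    return [x[i] for i in obs_index]


def obs_space_error(x_f, y, obs_index, c_diag):
    """e = (y_f - y)^T C (y_f - y), with y_f the forecast sampled at the observations."""
    y_f = sample_at_obs(x_f, obs_index)
    return sum(c * (yf - yo) ** 2 for yf, yo, c in zip(y_f, y, c_diag))


x_f = [15.0, 16.5, 0.25, -0.5]  # forecast state, hypothetical
obs_index = [0, 3]              # state entries that were observed
y = [14.0, -1.0]                # verifying (unassimilated) observations
c_diag = [1.0, 1.0]             # one weight per verifying observation

e_obs = obs_space_error(x_f, y, obs_index, c_diag)
print(e_obs)
```

Note that ${\mathbf {C} }$ here has one row per verifying observation rather than one per state variable.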

Practicalities

1. The observation impact driver for the forecast cycles is activated using the RBL4DVAR_FCT_SENSITIVITY cpp option.
2. The default option is the 3rd-order approximation of the forecast error difference $\delta e$. 1st- and 2nd-order cases can also be run by changing the parameter ImpOrd in the routine obs_sen_w4dpsas_forecast.h, but this is recommended only for testing purposes and once you feel confident about how things work.
3. The length of the analysis cycle is assigned in roms.in using the parameter NTIMES_ANA.
4. The length of the forecast cycle is assigned in roms.in using the parameter NTIMES_FCT.
5. The default configuration of the driver is to use a verifying analysis to compute the forecast errors as in equation (2). Two adjoint model forcing input files are required: one that corresponds to ${\mathbf {C} }({\mathbf {x} }_{f}^{j}-{\mathbf {x} }_{a}^{j+2})$ in equation (3) and one that corresponds to ${\mathbf {C} }({\mathbf {x} }_{f}^{j+1}-{\mathbf {x} }_{a}^{j+2})$. These are referred to respectively as FCTnameB and FCTnameA in roms.in.
6. For forecast error metrics in observation space as in equation (4), it is necessary to also define the OBS_SPACE cpp option. In this case the input forcing files for the adjoint model have the same structure as an observation file. They are referred to as FOInameA and FOInameB in roms.in, corresponding to ${\mathbf {C} }({\mathbf {y} }_{f}^{j+1}-{\mathbf {y} }^{j+2})$ and ${\mathbf {C} }({\mathbf {y} }_{f}^{j}-{\mathbf {y} }^{j+2})$ respectively in equation (5). The actual observations assimilated during the analysis cycle prior to the forecast cycle are referred to as obs_C.nc in the driver.
7. The options OBS_IMPACT_SPLIT and IMPACT_INNER can also be used.
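The mapping between the choice of verification and the inputs listed above can be summarized as a small lookup. The helper below is a hypothetical bookkeeping aid, not part of ROMS. It only restates which cpp options and roms.in parameters the notes above mention for each case.

```python
def required_settings(obs_space):
    """Return the cpp options and roms.in parameters named in the notes above.

    obs_space=False: verify against the verifying analysis (equations 2 and 3).
    obs_space=True:  verify against observations (equations 4 and 5).
    """
    cpp_options = ["RBL4DVAR_FCT_SENSITIVITY"]
    if obs_space:
        cpp_options.append("OBS_SPACE")
        forcing = {"x_f^{j+1}": "FOInameA", "x_f^{j}": "FOInameB"}
    else:
        forcing = {"x_f^{j+1}": "FCTnameA", "x_f^{j}": "FCTnameB"}
    return {
        "cpp_options": cpp_options,
        "cycle_lengths": ["NTIMES_ANA", "NTIMES_FCT"],
        "adjoint_forcing": forcing,
    }


state_space = required_settings(obs_space=False)
observation_space = required_settings(obs_space=True)
print(state_space)
print(observation_space)
```

In both cases the "A" file is associated with the forecast ${\mathbf {x} }_{f}^{j+1}$ and the "B" file with ${\mathbf {x} }_{f}^{j}$.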

Step-by-Step Procedure

1. Run 4D-Var for the analysis cycle and save the MODname.nc and FWDname.nc files.
2. Run the forecast initialized from the 4D-Var analysis at the end of the analysis cycle and write the surface fluxes and wind stress to the HISname.nc file. Also ensure that you define FORWARD_WRITE and VERIFICATION, and save the MODname.nc and HISname.nc files.
3. Rerun step 2 without BULK_FLUXES and instead use the saved surface fluxes and wind stress in the history file from step 2 to force the model. This represents the forecast ${\mathbf {x} }_{f}^{j+1}$ in equation (3) (i.e. the red curve in the figure). This step is necessary because the forecast run using the background in step 4 below to generate ${\mathbf {x} }_{f}^{j}$ must also be subject to the same surface fluxes, wind stress, and open boundary conditions. Any difference in the forecast errors in ${\mathbf {x} }_{f}^{j+1}$ and those from the forecast in step 2 will be due to the differences in the surface boundary conditions. As in step 2, ensure that you define FORWARD_WRITE and VERIFICATION, and save the MODname.nc and HISname.nc files.
4. Run a forecast without BULK_FLUXES initialized from the 4D-Var background solution at the end of the 4D-Var window using the saved surface fluxes and wind stress in the history file from step 2 to force the model. This represents the forecast ${\mathbf {x} }_{f}^{j}$ in equation (3) (i.e. the portion of the blue curve in the figure spanning the interval $[t_{0}^{j+1}+\tau ,t_{0}^{j+2}+\tau ]$). Recall that in order for equation (3) to hold, ${\mathbf {x} }_{f}^{j+1}$ and ${\mathbf {x} }_{f}^{j}$ must be subject to the same surface and open boundary conditions. Steps 2, 3 and 4 ensure this. As in step 2, ensure that you define FORWARD_WRITE and VERIFICATION, and save the MODname.nc and HISname.nc files.
5. When the verification time arrives, compute the verifying 4D-Var analysis, ${\mathbf {x} }_{a}^{j+2}$ in equation (3).
6. Using the FWDname.nc and MODname.nc files from steps 3 and 4, prepare the adjoint forcing file FCTnameA corresponding to ${\mathbf {C} }({\mathbf {x} }_{f}^{j+1}-{\mathbf {x} }_{a}^{j+2})$ in equation (3) and FCTnameB corresponding to ${\mathbf {C} }({\mathbf {x} }_{f}^{j}-{\mathbf {x} }_{a}^{j+2})$ if using the verifying analysis as a surrogate for the true ocean state. Similarly, if using independent or unassimilated observations as a surrogate for the true ocean state, prepare the adjoint forcing files in observation space FOInameA and FOInameB corresponding to ${\mathbf {C} }({\mathbf {y} }_{f}^{j+1}-{\mathbf {y} }^{j+2})$ and ${\mathbf {C} }({\mathbf {y} }_{f}^{j}-{\mathbf {y} }^{j+2})$ respectively in equation (5).
7. Run the forecast observation impact driver.
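The arithmetic behind the forcing fields prepared in the penultimate step can be sketched as follows. This example works on plain Python lists with a diagonal ${\mathbf {C} }$. Reading the model fields from, and writing the forcing to, NetCDF files is deliberately not shown, and all names and values are hypothetical.

```python
def adjoint_forcing(x_f, x_verif, c_diag):
    """Return C (x_f - x_verif) for a diagonal C given by its diagonal."""
    return [c * (f - v) for f, v, c in zip(x_f, x_verif, c_diag)]


# Hypothetical 3-element state at the verification time.
x_a_verif = [10.0, 20.0, 1.0]  # verifying analysis x_a^{j+2}
x_f_j1 = [10.5, 21.0, 1.5]     # forecast x_f^{j+1} (from the analysis)
x_f_j = [11.0, 22.0, 2.0]      # forecast x_f^{j} (from the background)
c_diag = [1.0, 1.0, 0.0]       # metric targets the first two entries only

forcing_a = adjoint_forcing(x_f_j1, x_a_verif, c_diag)  # contents for the "A" file
forcing_b = adjoint_forcing(x_f_j, x_a_verif, c_diag)   # contents for the "B" file

print(forcing_a, forcing_b)
```

For the observation-space case the same difference would be formed between forecast values sampled at the observation points and the verifying observations.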

@@ Line 1: / Line 1: @@
 <div class="title">Analysis-Forecast Cycle Observation Impacts</div>
+<!-- Edit Template:Variational_Data_Assimilation_TOC to modify this Table of Contents-->
+<div style="float: left;margin: 0 20px 0 0;">{{Variational Data Assimilation TOC}}</div>__TOC__
 The procedure for computing the observation impacts during the forecast cycle is a little more involved than that for the analysis cycle. However, a separate ROMS driver exists for this. To help illustrate the procedure involved, consider the typical analysis-forecast cycle shown schematically below.
@@ Line 5: / Line 9: @@
-[[Image:analysis-forecast_cycle_schematic.png|center|frame|'''Figure:''' A schematic showing the typical configuration of an analysis-forecast cycle. Analysis cycle <math>j</math> spans the interval <math>[t_0^j,t_0^j+\tau]</math>, and is associated with 4D-Var analysis <math>x_a^j</math> and the forecast <math>x_f^j</math>. At the start of each cycle there are three available circulation estimates: two forecasts initialized from the previous two adjacent analysis cycles, and the analysis for the current time. These are illustrated at time <math>t_0^{j+2}+\tau</math> at the start of cycle <math>j+3</math>.]]
+[[Image:analysis-forecast_cycle_schematic.png|center|frame|'''Figure:''' A schematic showing the typical configuration of an analysis-forecast cycle. Analysis cycle <math>j</math> spans the interval <math>[t_0^j,t_0^j+\tau]</math>, and is associated with 4D-Var analysis <math>\bold{x}_a^j</math> and the forecast <math>\bold{x}_f^j</math>. At the start of each cycle there are three available circulation estimates: two forecasts initialized from the previous two adjacent analysis cycles, and the analysis for the current time. These are illustrated at time <math>t_0^{j+2}+\tau</math> at the start of cycle <math>j+3</math>.]]
+In the figure above, each analysis cycle is assumed to be of length <math>\tau</math> and analysis cycle <math>j</math> spans the interval <math>[t_0^j,t_0^j+\tau]</math>. The circulation estimate at time <math>t_0^j+\tau</math> (''i.e.'' the ''end'' of analysis cycle <math>j</math>) is denoted as <math>\bold{x}_a^j</math> and is the initial condition for the forecast spanning the next analysis interval <math>[t_0^{j+1},t_0^{j+1}+\tau]</math>. <span class="red">''In the figure it is assumed, for convenience only, that the forecast duration is an integer multiple of <math>\tau</math>, but this does not have to be the case, and the code is set up to handle analysis and forecast cycles that are different lengths.''</span> The figure shows the analyses and forecasts that result from three adjacent analysis cycles, namely cycles <math>j</math>, <math>j+1</math> and cycle <math>j+2</math>. The analysis <math>\bold{x}_a^j</math> at the ''end'' of cycle <math>j</math> is used as the initial condition for the forecast <math>\bold{x}_f^j</math> of duration <math>2\tau</math> that terminates at time <math>t_0^{j+2}+\tau</math>, the end of analysis cycle <math>j+2</math>. Similarly, the analysis <math>\bold{x}_a^{j+1}</math> at the ''end'' of cycle <math>j+1</math> is used as the initial condition for the forecast <math>\bold{x}_f^{j+1}</math> of duration <math>\tau</math> and also terminates at time <math>t_0^{j+2}+\tau</math>, the end of analysis cycle <math>j+2</math>. After sufficient time has elapsed, a new analysis <math>\bold{x}_a^{j+2}</math> will be computed at this time. Since <math>\bold{x}_a^{j+2}</math> represents our best estimate of the ocean circulation at time <math>t_0^{j+2}+\tau</math> it can be used to quantify the veracity of the forecasts <math>\bold{x}_f^j</math> and <math>\bold{x}_f^{j+1}</math>. 
For this reason, <math>\bold{x}_a^{j+2}</math> is usually referred to as the “verifying analysis.” However, as discussed shortly, other sources of information can be used to verify the forecasts, such as new or independent observations.
+It should be clear from the figure that the forecast <math>\bold{x}_f^{j+1}</math> benefits from the observations assimilated into the model during analysis cycle <math>j+1</math> (i.e. during the interval <math>[t_0^{j+1},t_0^{j+1}+\tau]</math>). Therefore, providing that <math>\bold{x}_f^j</math> and <math>\bold{x}_f^{j+1}</math> are subject to ''identical'' surface forcing and open boundary conditions during the interval <math>[t_0^{j+2},t_0^{j+2}+\tau]</math>, any differences in forecast error must be associated with the observations assimilated into the model during the interval <math>[t_0^{j+1},t_0^{j+1}+\tau]</math>.
+==Forecast Error Metrics==
+As in the case of the analysis cycle observation impacts described above, the impact of the observations during the forecast cycle is computed for a specific metric, in this case a metric of the forecast error. The methodology will be described first for a standard generic quadratic forecast error metric given by:
+{| class="eqno"
+|<math display="block">e=(\bold{x}_f-\bold{x}_t)^T \bold{C}(\bold{x}_f-\bold{x}_t)</math>
+|(1)
+|}
+where <math>\bold{x}_f</math> denotes the forecast state-vector, <math>\bold{x}_t</math> denotes the true state-vector, and <math>\bold{C}</math> is a weight matrix. For example, if <math>\bold{C}</math> is a diagonal matrix with elements equal to 1 corresponding to all surface temperature grid points, and zero elsewhere, then <math>e</math> would represent the sum of the squared errors in SST. Forecast error metrics of the form (1) are very common in numerical weather prediction and oceanography, so (1) is a good starting point.
+In the figure there are two forecasts of interest: <math>\bold{x}_f^j</math> initialized at the end of analysis cycle <math>j</math> at time <math>t_0^j+\tau</math>, and <math>\bold{x}_f^{j+1}</math> initialized at the end of analysis cycle <math>j+1</math> at time <math>t_0^{j+1}+\tau</math>. At time <math>t_0^{j+2}+\tau</math> the error in forecast <math>\bold{x}_f^j</math> is given by <math>e_b=(\bold{x}_f^j-\bold{x}_t )^T \bold{C}(\bold{x}_f^j-\bold{x}_t )</math>, while the error in <math>\bold{x}_f^{j+1}</math> is given by <math>e_a=(\bold{x}_f^{j+1}-\bold{x}_t)^T \bold{C}(\bold{x}_f^{j+1}-\bold{x}_t)</math>. As noted above, if <math>\bold{x}_f^j</math> and <math>\bold{x}_f^{j+1}</math> are subject to ''identical'' surface forcing and open boundary conditions during the interval <math>[t_0^{j+2},t_0^{j+2}+\tau]</math>, then the difference in forecast error <math>\delta e=e_a-e_b</math> is due solely to the difference in the forecast initial conditions due to the observations assimilated during analysis cycle <math>j+1</math> spanning the interval <math>[t_0^{j+1},t_0^{j+1}+\tau]</math>. (The more realistic case where the two forecasts are subject to different surface forcing fields is addressed below in the [[#Step-by-Step_Procedure|step-by-step procedure]] notes). Specifically, if <math>\delta e<0</math> the observations assimilated during cycle <math>j+1</math> lead to an improvement in the forecast skill (''i.e.'' <math>e_a<e_b</math>), while if <math>\delta e>0</math> the observations assimilated during cycle <math>j+1</math> have degraded the forecast (''i.e.'' <math>e_a>e_b</math>). While this convention may seem counter-intuitive, it is the convention used in the numerical weather prediction literature, so it seems prudent to adopt it here.
+===Using a verifying analysis as a surrogate for the true ocean circulation===
+In practice, the true state <math>\bold{x}_t</math> will never be known, so the forecast error (1) is usually computed relative to the verifying analysis <math>\bold{x}_a^{j+2}</math>, in which case:
+{| class="eqno"
+|<math display="block">e=(\bold{x}_f-\bold{x}_a)^T \bold{C}(\bold{x}_f-\bold{x}_a)</math>
+|(2)
+|}
+where <math>\bold{x}_a</math> denotes the verifying analysis at the appropriate forecast time.
+The 3<sup>rd</sup>-order approximation for <math>\delta e</math> is given by:
+{| class="eqno"
+|<math display="block">\delta e_3 = \bold{d}^T\bold{K}^T\bold{M}_b^T[\bold{M}_j^T\bold{C}(\bold{x}_f^j-\bold{x}_a^{j+2})+\bold{M}_{j+1}^T\bold{C}(\bold{x}_f^{j+1}-\bold{x}_a^{j+2})]</math>
+|(3)
+|}
+where <math>\bold{M}_j^T</math> represents the adjoint model run backwards over the forecast interval <math>[t_0^{j+2},t_0^{j+2}+\tau]</math> and linearized about the forecast solution  <math>\bold{x}_f^j</math>; <math>\bold{M}_b^T</math> is the adjoint model run backwards over the 4D-Var analysis interval <math>[t_0^{j+1},t_0^{j+1}+\tau]</math> and linearized about 4D-Var background <math>\bold{x}_b</math>; and  <math>\bold{M}_{j+1}^T</math> denotes the adjoint model linearized about the forecast solution <math>\bold{x}_f^{j+1}</math>. Thus, the 3<sup>rd</sup>-order impact given by (3) requires two integrations of the adjoint model: one forced by <math>\bold{C}(\bold{x}_f^j-\bold{x}_a^{j+2})</math> and another forced by <math>\bold{C}(\bold{x}_f^{j+1}-\bold{x}_a^{j+2})</math> and linearized about different forecast solutions.
+===Using the independent observations as a surrogate for the true ocean circulation===
+As discussed above, it is typical in operational numerical weather prediction to use the verifying analysis at the forecast time as a surrogate for the true state of the system, as in equation (2). These days operational weather prediction models generally yield very high-quality analyses, so the assumption that <math>\bold{x}_a</math> is a reasonable approximation for <math>\bold{x}_t</math> is probably reasonable. In oceanography, however, this is a more questionable assumption, so when possible, it may be more prudent to verify a forecast against independent observations, or observations that have not yet been assimilated into the model. In this case, equation (2) would be reformulated as:
+{| class="eqno"
+|<math display="block">e=(\bold{y}_f-\bold{y})^T\bold{C}(\bold{y}_f-\bold{y})</math>
+|(4)
+|}
+where <math>\bold{y}_f</math> is the model forecast of the vector of verifying observations <math>\bold{y}</math>. In this case, the 3<sup>rd</sup>-order approximation for <math>\delta e</math> becomes:
+{| class="eqno"
+|<math display="block">\delta e_3 = \bold{d}^T\bold{K}^T\bold{M}_b^T[\bold{G}_j^T\bold{C}(\bold{y}_f^j-\bold{y}^{j+2})+\bold{G}_{j+1}^T\bold{C}(\bold{y}_f^{j+1}-\bold{y}^{j+2})]</math>
+|(5)
+|}
+where <math>\bold{G}_j^T</math> and <math>\bold{G}_{j+1}^T</math> denote the adjoint model ''forced'' at the observation points and linearized about <math>\bold{x}_f^j</math> and <math>\bold{x}_f^{j+1}</math> respectively.
-In the figure above, each analysis cycle is assumed to be of length <math>\tau</math> and analysis cycle <math>j</math> spans the interval<math>[t_0^j,t_0^j+\tau]</math>. The circulation estimate at time <math>t_0^j+\tau</math> (i.e. the ''end'' of analysis cycle <math>j</math>) is denoted as <math>x_a^j</math> and is the initial condition for the forecast spanning the next analysis interval <math>[t_0^{j+1},t_0^{j+1}+\tau]</math>. In sequel it is assumed that the forecast duration is an integer multiple of <math>\tau</math>, but this is not a necessary constraint. The figure shows the analyses and forecasts that result from three adjacent analysis cycles, namely cycles <math>j</math>, <math>j+1</math> and cycle <math>j+2</math>. The analysis <math>x_a^j</math> at the ''end'' of cycle <math>j</math> is used as the initial condition for the forecast <math>x_f^j</math> of duration <math>2\tau</math> that terminates at time <math>t_0^{j+2}+\tau</math>, the end of analysis cycle <math>j+2</math>. Similarly, the analysis <math>x_a^{j+1}</math> at the ''end'' of cycle <math>j+1</math> is used as the initial condition for the forecast <math>x_f^{j+1}</math> of duration <math>\tau</math> and also terminates at time <math>t_0^{j+2}+\tau</math>, the end of analysis cycle <math>j+2</math>. After sufficient time has elapsed, a new analysis <math>x_a^{j+2}</math> will be computed at this time. Since <math>x_a^{j+2}</math> represents our best estimate of the ocean circulation at time <math>t_0^{j+2}+\tau</math> it can be used to quantify the veracity of the forecasts <math>x_f^j</math> and <math>x_f^{j+1}</math>. For this reason, <math>x_a^{j+2}</math> is usually referred to as the “verifying analysis.” However, as discussed, later other sources of information can be used to verify the forecasts, such as new or independent observations.
+==Practicalities==
+#The observation impact driver for the forecast cycles is activated using the [[Options#RBL4DVAR_FCT_SENSITIVITY|RBL4DVAR_FCT_SENSITIVITY]] cpp option.
+#The default option is 3<sup>rd</sup>-order approximations of the squared forecast error difference <math>\delta e</math>. 1<sup>st</sup>- and 2<sup>nd</sup>-order cases can also be run by changing the parameter <span class="limeGreen">ImpOrd</span> in the routine <span class="red">obs_sen_w4dpsas_forecast.h</span> but this is recommended ''only'' for testing purposes and once you feel confident about how things work.
+#The length of the analysis cycle is assigned in [[roms.in]] using the parameter [[Variables#ntimes_ana|NTIMES_ANA]].
+#The length of the forecast cycle is assigned in [[roms.in]] using the parameter [[Variables#ntimes_ana|NTIMES_FCT]].
+#The default configuration of the driver is to use a verifying analysis to compute the forecast errors as in equation (2). Two adjoint model forcing input files are required: one that corresponds to <math>\bold{C}(\bold{x}_f^j-\bold{x}_a^{j+2})</math> in equation (3) and one that corresponds to <math>\bold{C}(\bold{x}_f^{j+1}-\bold{x}_a^{j+2})</math>. These are referred to respectively as [[Variables#FCTnameB|FCTnameB]] and [[Variables#FCTnameA|FCTnameA]] in [[roms.in]].
+#For forecast error metrics in observation space as in equation (4), it is necessary to also define the [[Options#OBS_SPACE|OBS_SPACE]] cpp option. In this case the input forcing files for the adjoint model have the same structure as an observation file. In the code they are referred to as [[Variables#FOInameA|FOInameA]] and [[Variables#FOInameB|FOInameB]] in [[roms.in]] corresponding to <math>\bold{C}(\bold{y}_f^{j+1}-\bold{y}^{j+2})</math> and <math>\bold{C}(\bold{y}_f^j-\bold{y}^{j+2})</math> respectively in equation (5). The actual observations assimilated during the analysis cycle prior to the forecast cycle are referred to as <span class="limeGreen">obs_C.nc</span> in the driver.
+#The options [[Options#OBS_IMPACT_SPLIT|OBS_IMPACT_SPLIT]] and [[Options#IMPACT_INNER|IMPACT_INNER]] can also be used.
-It should be clear from the figure that the forecast <math>x_f^{j+1}</math> benefits from the observations assimilated into the model during analysis cycle <math>j+1</math> (i.e. during the interval <math>[t_0^{j+1},t_0^{j+1}+\tau]</math>). Therefore, providing that <math>x_f^j</math> and <math>x_f^{j+1}</math> are subject to identical surface forcing and open boundary conditions during the interval <math>[t_0^{j+2},t_0^{j+2}+\tau]</math>, any differences in forecast error must be associated with the observations assimilated into the model during the interval <math>[t_0^{j+1},t_0^{j+1}+\tau]</math>. The impact of each
+==Step-by-Step Procedure==
-observation on the forecast error can be quantified as described next which is based on the work of Langland and Baker (2004), Errico (2007) and Gelaro ''et al.'' (2007).
+#Run 4D-Var for the analysis cycle and save the <span class="limeGreen">MODname.nc</span> and <span class="limeGreen">FWDname.nc</span> files.
+#Run the forecast initialized from the 4D-Var analysis at the ''end'' of the analysis cycle and write the surface fluxes and wind stress to the <span class="limeGreen">HISname.nc</span> file. Also ensure that you ''define'' [[Options#FORWARD_WRITE|FORWARD_WRITE]] and [[Options#VERIFICATION|VERIFICATION]], and save the <span class="limeGreen">MODname.nc</span> and <span class="limeGreen">HISname.nc</span> files.
+#Rerun step 2 '''''without''''' [[Options#BULK_FLUXES|BULK_FLUXES]] and instead use the saved surface fluxes and wind stress in the history file from step 2 to force the model. This represents the forecast <math>\bold{x}_f^{j+1}</math> in equation (3) (''i.e.'' the red curve in the figure). This step is necessary because the forecast run using the background in step 4 below to generate <math>\bold{x}_f^j</math> must also be subject to the same surface fluxes, wind stress, and open boundary conditions. Any difference in the forecast errors in <math>\bold{x}_f^{j+1}</math> and those from the forecast in step 2 will be due to the differences in the surface boundary conditions. As in step 2, ensure that you ''define'' [[Options#FORWARD_WRITE|FORWARD_WRITE]] and [[Options#VERIFICATION|VERIFICATION]], and save the <span class="limeGreen">MODname.nc</span> and <span class="limeGreen">HISname.nc</span> files.
+#Run a forecast '''''without''''' [[Options#BULK_FLUXES|BULK_FLUXES]] initialized from the 4D-Var ''background'' solution at the ''end'' of the 4D-Var window using the saved surface fluxes and wind stress in the history file from step 2 to force the model. This represents the forecast <math>\bold{x}_f^j</math> in equation (3) (''i.e.'' the portion of the blue curve in the figure spanning the interval <math>[t_0^{j+1}+\tau,t_0^{j+2}+\tau]</math>). '''Recall that in order for equation (3) to hold, <math>\bold{x}_f^{j+1}</math> and <math>\bold{x}_f^j</math> must be subject to the same surface and open boundary conditions. Steps 2, 3 and 4 ensure this.''' As in step 2, ensure that you ''define'' [[Options#FORWARD_WRITE|FORWARD_WRITE]] and [[Options#VERIFICATION|VERIFICATION]], and save the <span class="limeGreen">MODname.nc</span> and <span class="limeGreen">HISname.nc</span> files.
+#When the verification time arrives, compute the verifying 4D-Var analysis, <math>\bold{x}_a^{j+2}</math> in equation (3).
+#Using the <span class="limeGreen">FWDname.nc</span> and <span class="limeGreen">MODname.nc</span> files from steps 3 and 4, prepare the adjoint forcing file [[Variables#FCTnameA|FCTnameA]] corresponding to <math>\bold{C}(\bold{x}_f^{j+1}-\bold{x}_a^{j+2})</math> in equation (3) and [[Variables#FCTnameB|FCTnameB]] corresponding to <math>\bold{C}(\bold{x}_f^j-\bold{x}_a^{j+2})</math> if using the verifying analysis as a surrogate for the true ocean state. Similarly, if using independent or unassimilated observations as a surrogate for the true ocean state, prepare the adjoint forcing files in observation space [[Variables#FOInameA|FOInameA]] and [[Variables#FOInameB|FOInameB]] corresponding to <math>\bold{C}(\bold{y}_f^{j+1}-\bold{y}^{j+2})</math> and <math>\bold{C}(\bold{y}_f^j-\bold{y}^{j+2})</math> respectively in equation (5).
+#Run the forecast observation impact driver.
