Current Usage of ALPS Data and Future Challenges for ALPS Network Design

Perspectives from Operational Data Assimilation and Climate State and Parameter Estimation

An T. Nguyen and Patrick Heimbach

ALPS in Data Assimilation and Estimation

Since the early 2000s, ALPS data have been an invaluable component of data assimilation (DA) in operational oceanography (e.g., Martin et al., 2015; Oke et al., 2015), and state and parameter estimation (SPE) for climate research (Wunsch and Heimbach, 2013; Edwards et al., 2015; Stammer et al., 2016). A quantitative assessment of how “useful” or “critical” any set of ALPS data is to DA and SPE systems depends on the system’s scientific goal and application. Common to both systems, the aim is to obtain the time-evolving description of the ocean (and sea ice) over temporal scales ranging from days to many decades (Stammer et al., 2016). Functioning as a temporal and spatial interpolator, the underlying numerical and/or statistical models fill the gaps between sparse observations from ALPS and other diverse streams to produce an optimally “merged” product (Figure 1) to serve specific needs of the end users.

Figure 1. Schematic difference between data assimilation (DA) and state and parameter estimation (SPE) systems. Trajectories of DA and SPE systems are depicted with solid black and blue lines, respectively. In a DA system, at the end of each assimilation window, the model trajectory can lead to an ocean state (black cross) that diverges from observations (gray triangle), and a correction (re-initialization) can bring the model toward the observation (to within pre-defined criteria, red cross). Unphysical “discontinuities” can potentially be introduced in this correction step (red vertical lines) and can be mitigated through incremental adjustments (dashed black line), though the resultant smooth solution can remain dynamically unbalanced. The SPE system trajectory (blue solid line) matches observations to within a pre-defined uncertainty range and guarantees conservation of heat, salt, and momentum over the entire estimation period. Figure adapted from Stammer et al. (2016).

In operational forecasts and ocean reanalyses, data streams are typically assimilated within a specific time window whose length is governed by practical needs (e.g., availability and quality control of the data, and computing time to produce the analysis and forecast), and the system’s “predictive” skill (black solid and dashed curves in Figure 1). Predictability refers to the time scale over which a model trajectory remains within a tolerable threshold defined by, for example, the ensemble standard deviation or the combined model-data errors (Robinson et al., 2002; Edwards et al., 2015; Oke et al., 2015). Examples of practical needs include the ability to predict the presence of sea ice to mitigate potential shipping hazards, paths of an oil spill to mitigate the potential environmental damage, or paths of warm currents to follow schools of fish to maximize potential catch.
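As a rough illustration of the predictability time scale described above, the following sketch runs a "twin experiment" on the Lorenz (1963) system, a three-variable chaotic toy model used here only as a stand-in for an ocean model. The function names, the error threshold, and the initial-error sizes are assumptions made for this example and do not come from any of the cited systems.

```python
import numpy as np


def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Tendency of the Lorenz (1963) system (toy stand-in for a model)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = lorenz63(state)
    k2 = lorenz63(state + 0.5 * dt * k1)
    k3 = lorenz63(state + 0.5 * dt * k2)
    k4 = lorenz63(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def predictability_horizon(initial_error, threshold=1.0, dt=0.01, n_steps=5000):
    """Time until a perturbed forecast departs from the 'truth' by more
    than `threshold` (hypothetical tolerance, in model units)."""
    truth = np.array([1.0, 1.0, 20.0])
    for _ in range(1000):  # spin up onto the attractor
        truth = rk4_step(truth, dt)
    forecast = truth + np.array([initial_error, 0.0, 0.0])
    for step in range(1, n_steps + 1):
        truth = rk4_step(truth, dt)
        forecast = rk4_step(forecast, dt)
        if np.linalg.norm(forecast - truth) > threshold:
            return step * dt
    return n_steps * dt  # threshold never exceeded within the run


# A better-constrained initial condition should stay within tolerance longer.
horizon_poor_ic = predictability_horizon(initial_error=1e-2)
horizon_good_ic = predictability_horizon(initial_error=1e-6)
print(horizon_poor_ic, horizon_good_ic)
```

In this toy setting, a smaller initial-condition error extends the horizon but does not remove it, which is one way to see why the assimilation window length is tied to predictive skill.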

State and parameter estimation aims at “understanding” processes at multidecadal to longer time scales. These systems emphasize the underlying model dynamics and property conservation implied by the equations of motion. They utilize data to constrain the state estimate’s overall trajectory, fitting the data to within data and model representation uncertainty, over the entire estimation period of up to a multidecadal time scale. In addition to being used to invert for optimal initial conditions as in operational DA, ALPS and other complementary data sets (e.g., from satellites, surface drifters, ship-based and moored instruments) are also used to estimate time-mean internal model parameters and time-varying adjustments to lateral/surface fluxes (Stammer, 2005; Moore et al., 2011a; Liu et al., 2012; Forget et al., 2015b).
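One generic way to write the least-squares problem behind such an estimate is sketched below. The notation is assumed here for illustration and is not taken from the text: x(t) is the model state, u the vector of adjustable controls (initial conditions, internal parameters, flux adjustments) with first guess u_0, y(t) the observations, E(t) the operator that maps the state to the observations, R(t) and Q the data and control error covariances, and L the model time-stepping operator.

```latex
\begin{aligned}
J(\mathbf{u}) &= \sum_{t=1}^{t_f}
  \left[\mathbf{y}(t) - \mathbf{E}(t)\,\mathbf{x}(t)\right]^{T}
  \mathbf{R}(t)^{-1}
  \left[\mathbf{y}(t) - \mathbf{E}(t)\,\mathbf{x}(t)\right]
  + \left(\mathbf{u} - \mathbf{u}_0\right)^{T}\mathbf{Q}^{-1}
    \left(\mathbf{u} - \mathbf{u}_0\right), \\
&\text{subject to}\quad
  \mathbf{x}(t+1) = \mathcal{L}\!\left[\mathbf{x}(t), \mathbf{u}\right],
  \qquad t = 0, \dots, t_f - 1 .
\end{aligned}
```

Because the model equations are imposed as a constraint over the whole period, only the controls are adjusted and the resulting trajectory satisfies the model's conservation laws, in contrast to the re-initialization step sketched in Figure 1.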

Following the success of satellite altimetric data available since the early 1990s (Wunsch and Stammer, 1998) for constraining upper ocean circulation, since the mid-2000s Argo has become the single most important data source for constraining subsurface hydrographic mean state and variability (e.g., Wunsch et al., 2009; Forget et al., 2015a; Oke et al., 2015). In a review of several representative DA systems, Oke et al. (2015) concluded that the Argo data set is “unanimously” critical to all systems, in particular at depth and in constraining the global salinity. Similarly, Liu et al. (2012) and Forget et al. (2015b) showed that significant reduction of global temperature and salinity misfits was achieved through improved global estimates of ocean mixing parameters, with Argo, ship-based hydrography, and satellite altimetry being used as primary constraints. In coastal regions or where Argo data coverage is too sparse, dedicated ALPS data sets from gliders and Lagrangian ocean drifters have contributed significantly to improving representation of regional oceanography to serve specific needs, ranging from surface oil spill prediction to tracking fisheries along the California coast (e.g., Todd et al., 2011; Poje et al., 2014; Edwards et al., 2015, and references therein).


Synergy Between DA/SPE Frameworks and ALPS to Address Scientific and Technological Challenges

The successful use of ALPS data, in particular Argo observations, in DA/SPE systems is widely attributed to the accessibility of the data and the quasi-global coverage of independent subsurface observations that complement satellite observations in improving estimates of ocean state and its uncertainties. However, relevant to the discussion here, no single observation platform can address all the scientific questions and technical challenges of DA/SPE systems. Below is a list of some outstanding challenges that can be addressed with future synergy between DA/SPE systems/frameworks and potential new ALPS observation types or deployments, keeping in mind the overall goal of improving the estimation of ocean state in ocean-sea ice models.

Figure 2. (a) Large misfits in salinity between a data-constrained DA system and Argo float data over depth range 300–2000 m (subfigure adapted from Turpin et al. (2016)), (b) number of observations (upper) and impact of observations on the total adjustment (lower) during a 7-day assimilation in a California Current ocean data assimilation framework (subfigure adapted from Moore et al. (2011c)), (c) sensitivity of box B mean temperature at depth ~150 m to ocean salinity at depth 125 m up- and down-stream. Figure adapted from Nguyen et al. (2017).

Model Drift. Oke et al. (2015) reported that when Argo data are not used to tightly constrain ocean DA systems, model trajectories diverge from observations within a few months. In energetic regions, the degradation can occur within days (Janekovic et al., 2013). Model drift arises from various sources, including imperfect model physics, model representation errors, model structural uncertainty, and often sensitive yet highly unconstrained model parameters (Mignac et al., 2015; Oke et al., 2015; Stammer et al., 2016). Large model-data misfits persist in regions where mesoscale to submesoscale eddy activity dominates, for example, along energetic western boundary currents, in the Antarctic Circumpolar Current (Figure 2a; Forget et al., 2015a; Turpin et al., 2016; Sivareddy et al., 2017), or along coastal regions where temporal and spatial decorrelation scales are short (Moore et al., 2011c; Janekovic et al., 2013). In the DA framework, model drift can often be mitigated by adjusting the assimilation window. The appropriate window length often depends on how long it takes for nonlinear processes to “overwrite” the initial condition, after which the model trajectory becomes unpredictable (Moore et al., 2011a; Janekovic et al., 2013). In these regions, increased ALPS spatial coverage and temporal sampling rate help improve the estimates of the initial condition (the primary task of most DA systems), of representation errors, and of the time-mean and time-varying model internal parameters in the SPE framework. The DA framework can also be used as a quantitative tool for assessing model error by understanding the causes of recurring analysis increments (e.g., Rodwell and Palmer, 2007).

Measurement Redundancy. While some of the challenges to the ability of DA/SPE systems to represent the realistic ocean state are computational in nature, for example, model resolution and associated representation errors, the majority can be traced back to a lack of appropriate observations to constrain unknown parameters/processes and their error covariance (Moore et al., 2011b). The notion of having already “enough” data of “global” coverage should be critically assessed. The SPE framework can be used to address the issue of over- and undersampling and sampling redundancy. As an example, Moore et al. (2011c) showed that, depending on the region and scientific objective, data sets with orders of magnitude more data and good spatial coverage can have up to 90% redundancy (i.e., the first few measurements, or only measurements at independent “super sites,” contribute to improving the state estimate while the rest provide no additional information; Figure 2b). Their study also highlighted the importance of very few observations with independent information in inaccessible sites, for example, subsurface or coastal, that can significantly impact the model’s adjustments. Additional studies (e.g., Köhl and Stammer, 2004; Heimbach et al., 2011; Nguyen et al., 2017) show that strategically positioned observations that take into account upstream and downstream ocean dynamics in delivering integrated information can be more effective than uniform coverage and a high quantity of observations at the site of interest (Figure 2c).
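The idea of redundancy can be made concrete with a toy linear-Gaussian analysis on a one-dimensional grid. The sketch below is not the adjoint-based observation impact calculation of Moore et al. (2011c); it is a minimal optimal-interpolation example in which the grid size, the Gaussian background covariance, the error variances, and the function name are all assumptions chosen for illustration.

```python
import numpy as np


def posterior_total_variance(obs_indices, n_grid=50, length_scale=5.0,
                             background_var=1.0, obs_var=0.1):
    """Total analysis error variance (trace of A) after assimilating
    point observations at the given grid indices."""
    grid = np.arange(n_grid)
    dist = grid[:, None] - grid[None, :]
    # Assumed Gaussian-shaped background error covariance B.
    B = background_var * np.exp(-0.5 * (dist / length_scale) ** 2)
    n_obs = len(obs_indices)
    # Observation operator H: each row samples one grid point.
    H = np.zeros((n_obs, n_grid))
    H[np.arange(n_obs), obs_indices] = 1.0
    R = obs_var * np.eye(n_obs)  # uncorrelated observation errors
    S = H @ B @ H.T + R
    # Analysis covariance: A = B - B H^T (H B H^T + R)^{-1} H B
    A = B - B @ H.T @ np.linalg.solve(S, H @ B)
    return float(np.trace(A))


single = posterior_total_variance([25])               # one observation
clustered = posterior_total_variance([25] * 10)       # ten at the same site
spread = posterior_total_variance([10, 25, 40])       # three separated sites
print(single, clustered, spread)
```

In this setup, the nine additional co-located observations reduce the total error variance only slightly beyond what the first one achieves, whereas three observations separated by several decorrelation lengths reduce it substantially more.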

Data Type. One of the primary goals of SPE is estimating model internal parameters, such as ocean mixing (Stammer et al., 2016). Such parameters are often not easily observed and must be indirectly inferred from observations. Indeed, model drift is found to be largely a consequence of variations in these unconstrained yet highly sensitive model parameters. Ocean mixing rates in the lower latitudes (between 60°S and 60°N) have recently been calculated from Argo float temperature/salinity (e.g., Whalen et al., 2012; Cole et al., 2015) and should be used to directly constrain model parameters. Observational challenges remain in the deep ocean, in vigorous currents, and in ice-covered regions. Thus, extending ALPS measurements to the deep ocean below 2,000 m in the lower latitudes and throughout the water column at high latitudes will help constrain and improve these parameter estimates.

Uncertainties. Both DA and SPE systems require knowledge of model representation error. “Representation error” here refers to the mismatch between the processes the model is able to resolve (represent), given its horizontal and vertical resolution, and the point-wise measurements taken by in situ systems. Rigorous quantification of this error requires dense coverage of observations that can capture the spatial and temporal variability of the targeted model tracer, velocity, or parameters, which is often out of reach in practice. At a nominal spacing of 3° × 3° global coverage, Argo floats capture variability at a resolution that is inadequate for regions where the first baroclinic Rossby radius of deformation is below the floats’ spatial sampling (e.g., Figure 2a).
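The scale mismatch can be checked with back-of-the-envelope arithmetic. The sketch below converts the nominal 3° spacing to kilometers and compares it with a mid-latitude estimate of the first baroclinic Rossby radius, c/|f|. The gravity-wave speed c = 2.5 m/s is an assumed, illustrative value (it varies regionally), the c/|f| form is not valid near the equator, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
OMEGA = 7.2921e-5  # Earth's rotation rate, rad/s


def argo_spacing_km(lat_deg, spacing_deg=3.0):
    """Zonal and meridional distance (km) spanned by the nominal spacing."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    meridional = spacing_deg * km_per_degree
    zonal = meridional * math.cos(math.radians(lat_deg))
    return zonal, meridional


def rossby_radius_km(lat_deg, wave_speed_m_s=2.5):
    """Mid-latitude estimate c/|f| of the first baroclinic Rossby radius.
    wave_speed_m_s is an assumed value; do not use near the equator."""
    f = 2.0 * OMEGA * math.sin(math.radians(abs(lat_deg)))
    return wave_speed_m_s / f / 1000.0


for lat in (20.0, 40.0, 60.0):
    zonal, meridional = argo_spacing_km(lat)
    radius = rossby_radius_km(lat)
    print(f"{lat:4.0f} deg: spacing {zonal:5.0f} x {meridional:5.0f} km, "
          f"Rossby radius ~{radius:4.0f} km")
```

Under these assumptions, the float spacing at 40° latitude is on the order of a few hundred kilometers while the Rossby radius is a few tens of kilometers, so the nominal array cannot resolve eddy-scale variability there.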



ALPS data have greatly advanced the quality of DA/SPE frameworks over the last decade. Challenges remain in model structural errors and representation errors. Increased sampling by, and systematic use of, ALPS data within DA/SPE frameworks may help improve understanding of such error sources and better address them in both operational forecasting and climate estimation. In DA frameworks that rely on statistical approaches and do not obey the underlying physics, spurious “signals” may arise due to over-constraining of/over-reliance on data (Sivareddy et al., 2017). In this framework, continuous observations in space and time would be valuable to mitigate and/or damp out the propagation of these spurious signals.

In terms of sustained, quasi-global hydrographic sampling of the top 2,000 m of the water column, the Argo float network has become a primary ALPS platform. Its present coverage, at a minimum, is deemed critical to support short-term operational and long-term research-focused global DA and SPE systems. DA and SPE tools can be deployed for numerical observing system experiments (OSEs) and observing system simulation experiments (OSSEs) to assess or identify data sets and/or locations of data redundancy and those that have optimal impact on the system (Köhl and Stammer, 2004; Heimbach et al., 2011). ALPS data show promise to fill the gaps required by OSE/OSSEs. The shortcomings of ocean models in capturing first-order ocean dynamics in energetic regions and in polar regions (Ilicak et al., 2016) are likely systematic deficiencies. Studies such as Moore et al. (2011c), Köhl and Stammer (2004), and Nguyen et al. (2017) should be conducted, with targeted metrics, to systematically assess how current and future observations can help to better understand and tackle model errors (Rodwell and Palmer, 2007; Moore et al., 2011c). In the same vein, tools available within DA/SPE frameworks should be used more widely to guide the deployment of new ALPS instruments at locations that can maximize their contributions to improved ocean-sea ice state and parameter estimation.



Cole, S., C. Wortham, E. Kunze, and W. Owens. 2015. Eddy stirring and horizontal diffusivity from Argo float observations: Geographic and depth variability. Geophysical Research Letters 42:3,989–3,997, 10.1002/2015GL063827.

Edwards, C., A. Moore, I. Hoteit, and B. Cornuelle. 2015. Regional ocean data assimilation. Annual Review of Marine Science 7:21–42.

Forget, G., J.M. Campin, P. Heimbach, C.N. Hill, R.M. Ponte, and C. Wunsch. 2015a. ECCO version 4: An integrated framework for non-linear inverse modeling and global ocean state estimation. Geoscientific Model Development 8:3,071–3,104.

Forget, G., D. Ferreira, and X. Liang. 2015b. On the observability of turbulent transport rates by Argo: Supporting evidence from an inversion experiment. Ocean Science 11:839–853.

Heimbach, P., C. Wunsch, R.M. Ponte, G. Forget, C. Hill, and J. Utke. 2011. Timescales and regions of the sensitivity of Atlantic meridional volume and heat transport: Toward observing system design. Deep Sea Research Part II 58:1,858–1,879.

Ilicak, M., H. Drange, Q. Wang, R. Gerdes, Y. Aksenov, D. Bailey, M. Bentsen, A. Biastoch, A. Bozec, C. Böning, and others. 2016. An assessment of the Arctic Ocean hydrography in a suite of interannual CORE-II simulations. Part III: Hydrography and fluxes. Ocean Modelling 100:141–161, j.ocemod.2016.02.004.

Janekovic, I., B.S. Powell, D. Matthews, M.A. McManus, and J. Sevadjian. 2013. 4D-Var data assimilation in a nested, coastal ocean model: A Hawaiian case study. Journal of Geophysical Research 118:5,022–5,035.

Köhl, A., and D. Stammer. 2004. Optimal observations for variational data assimilation. Journal of Physical Oceanography 34:529–542, 10.1175/2513.1.

Liu, C., A. Köhl, and D. Stammer. 2012. Adjoint-based estimation of eddy-induced tracer mixing parameters in the global ocean. Journal of Physical Oceanography 42:1,186–1,206.

Martin, M.J., M. Balmaseda, L. Bertino, P. Brasseur, G. Brassington, J. Cummings, Y. Fujii, D.J. Lea, J.M. Lellouche, K. Mogensen, and others. 2015. Status and future of data assimilation in operational oceanography. Journal of Operational Oceanography 8:s28–s48.

Mignac, D., C. Tanajura, A. Santana, L. Lima, and J. Xie. 2015. Argo data assimilation into HYCOM with an EnOI method in the Atlantic Ocean. Ocean Science 11:195–213.

Moore, A.M., H.G. Arango, G. Broquet, B.S. Powell, A.T. Weaver, and J. Zavala-Garay. 2011a. The Regional Ocean Modeling System (ROMS) 4-dimensional variational data assimilation systems: Part I – System overview and formulation. Progress in Oceanography 91:34–49.

Moore, A.M., H.G. Arango, G. Broquet, C. Edwards, M. Veneziani, B. Powell, D. Foley, J.D. Doyle, D. Costa, and P. Robinson. 2011b. The Regional Ocean Modeling System (ROMS) 4-dimensional variational data assimilation systems: Part II – Performance and application to the California Current System. Progress in Oceanography 91:50–73.

Moore, A.M., H.G. Arango, G. Broquet, C. Edwards, M. Veneziani, B. Powell, D. Foley, J.D. Doyle, D. Costa, and P. Robinson. 2011c. The Regional Ocean Modeling System (ROMS) 4-dimensional variational data assimilation systems: Part III – Observation impact and observation sensitivity in the California Current System. Progress in Oceanography 91:74–94.

Nguyen, A., V. Ocaña, V. Garg, P. Heimbach, J. Toole, R. Krishfield, C. Lee, and R. Rainville. 2017. On the benefit of current and future ALPS data for improving Arctic coupled ocean-sea ice state estimation. Oceanography 30(2):69–73.

Oke, P., G. Larnicol, Y. Fujii, G. Smith, D. Lea, S. Guinehut, E. Remy, M.A. Balmaseda, T. Rykova, D. Surcel-Colan, and others. 2015. Assessing the impact of observations on ocean forecasts and reanalyses: Part 1, Global studies. Journal of Operational Oceanography 8:49–62.

Poje, A., T. Oezgoekmen, B. Lipphardt, B. Haus, E. Ryan, A. Haza, G. Jacobs, A. Reniers, M. Olascoaga, G. Novelli, and others. 2014. Submesoscale dispersion in the vicinity of the Deepwater Horizon spill. Proceedings of the National Academy of Sciences of the United States of America 111:12,693–12,698.

Robinson, A., P. Haley, P. Lermusiaux, and W. Leslie. 2002. Predictive skill, predictive capability and predictability in ocean forecasting. OCEANS ’02 MTS/IEEE 2:787–794.

Rodwell, M.J., and T.N. Palmer. 2007. Using numerical weather prediction to assess climate models. Quarterly Journal of the Royal Meteorological Society 133:129–146.

Sivareddy, S., A. Paul, T. Sluka, M. Ravichandran, and E. Kalnay. 2017. The pre-Argo ocean reanalyses may be seriously affected by the spatial coverage of moored buoys. Scientific Reports 7.

Stammer, D. 2005. Adjusting internal model errors through ocean state estimation. Journal of Physical Oceanography 35:1,143–1,153.

Stammer, D., M. Balmaseda, P. Heimbach, A. Köhl, and A. Weaver. 2016. Ocean data assimilation in support of climate applications: Status and perspectives. Annual Review of Marine Science 8:491–518.

Todd, R., D. Rudnick, M. Mazloff, R. Davis, and B. Cornuelle. 2011. Poleward flows in the southern California Current System: Glider observations and numerical simulation. Journal of Geophysical Research 116, C02026, 10.1029/2010JC006536.

Turpin, V., E. Remy, and P.L. Traon. 2016. How essential are Argo observations to constrain a global ocean data assimilation system? Ocean Science 12:257–274.

Whalen, C.B., L.D. Talley, and J.A. MacKinnon. 2012. Spatial and temporal variability of global ocean mixing inferred from Argo profiles. Geophysical Research Letters 39, L18612.

Wunsch, C., and P. Heimbach. 2013. Dynamically and kinematically consistent global ocean circulation and ice state estimates. Pp. 553–579 in Ocean Circulation and Climate: A 21st Century Perspective. Elsevier Ltd.

Wunsch, C., P. Heimbach, R. Ponte, and I. Fukumori. 2009. The global general circulation of the ocean estimated by the ECCO-Consortium. Oceanography 22(2):88–103.

Wunsch, C., and D. Stammer. 1998. Satellite altimetry, the marine geoid, and the oceanic general circulation. Annual Review of Earth and Planetary Sciences 26:219–253.



An T. Nguyen, Institute for Computational Engineering and Sciences, University of Texas at Austin, TX, USA.

Patrick Heimbach, Institute for Computational Engineering and Sciences, Institute for Geophysics, and Jackson School of Geosciences, University of Texas at Austin, TX, USA.