Artificial intelligence is often described as learning a model from data. That is one important family of approaches, not a universal definition of intelligence. In weather, oceanography, energy systems, industrial processes and many other physical domains, we already possess equations, conservation laws, boundary conditions and decades of scientific knowledge. The real challenge is often to combine that knowledge with observations that are sparse, noisy and incomplete.
01 A forecast can obey every equation and still start from the wrong world
A dynamical model describes how a system evolves from an initial state. If that initial state is wrong, the model can solve its equations perfectly and produce the wrong trajectory. In nonlinear systems such as the atmosphere or the ocean, small errors in the starting state can grow quickly.
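The growth of small initial errors can be seen in a few lines. The sketch below is an illustration only: it assumes the classical Lorenz-63 system with its usual parameters as a stand-in for a nonlinear geophysical model, and the step size, run length and size of the perturbation are arbitrary choices.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 equations."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def integrate(state, n_steps, dt=0.01):
    """Advance the state with classical fourth-order Runge-Kutta steps."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state
    for i in range(n_steps):
        k1 = lorenz63(state)
        k2 = lorenz63(state + 0.5 * dt * k1)
        k3 = lorenz63(state + 0.5 * dt * k2)
        k4 = lorenz63(state + dt * k3)
        state = state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        traj[i + 1] = state
    return traj

# Two runs of the same model; the starting states differ by one part in a million.
reference = integrate(np.array([1.0, 1.0, 1.0]), n_steps=2500)
perturbed = integrate(np.array([1.0, 1.0, 1.0 + 1e-6]), n_steps=2500)

separation = np.linalg.norm(reference - perturbed, axis=1)
for step in (0, 500, 1000, 1500, 2500):
    print(f"t = {step * 0.01:5.1f}   separation = {separation[step]:.3e}")
```

Both runs solve exactly the same equations; only the starting point differs, yet the two trajectories end up far apart.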
Observations do not solve the problem by themselves. Satellites, buoys, radars and sensors measure only selected variables, at selected places and times, with uncertainty. The complete model state may contain millions of values; the observation vector is usually much smaller.
The equations encode structure and physical consistency, yet the starting state and parameters may be uncertain.
Measurements anchor the model in reality, but they do not describe every variable or every point in space and time.
The task is to infer the hidden initial condition, parameters or trajectory that best explains the evidence while respecting the model.
Data assimilation is the mathematical field at that interface. It does not ask the data to replace the model, or the model to ignore the data. It asks for the state that makes the two as compatible as possible.
02 Data assimilation: make the model listen without making it forget
One useful way to understand assimilation is as a controlled negotiation. The model contributes dynamics: which evolutions are physically possible. The observations contribute correction: where the simulated trajectory departs from reality. Statistical assumptions and regularisation decide how strongly each source should be trusted.
A small assimilation laboratory
This deliberately simple illustration blends a model trajectory with observations: the resulting analysis moves closer to the observations as the assumed confidence in them increases. Real data assimilation uses richer covariance structures, dynamical constraints and optimisation.
The important point is not the weighted average in the toy illustration. It is the principle: the analysis is constructed from both knowledge and evidence. A good assimilation method must respect the model's time evolution, account for observation error and remain computationally feasible at the scale of the system.
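For readers who want to see the weighted average itself, here is a minimal sketch for a single scalar value. The function name `blend`, the numbers, and the use of error standard deviations `sigma_b` (model) and `sigma_o` (observation) to express confidence are assumptions made for the example.

```python
def blend(background, observation, sigma_b, sigma_o):
    """Variance-weighted blend of a model value and an observation.

    The weight given to the observation grows as its assumed error
    standard deviation sigma_o shrinks relative to sigma_b.
    """
    gain = sigma_b**2 / (sigma_b**2 + sigma_o**2)
    return background + gain * (observation - background)

background, observation = 15.0, 17.0   # illustrative values only
for sigma_o in (0.1, 1.0, 10.0):
    analysis = blend(background, observation, sigma_b=1.0, sigma_o=sigma_o)
    print(f"sigma_o = {sigma_o:5.1f} -> analysis = {analysis:.3f}")
```

A trusted observation (small `sigma_o`) pulls the analysis almost onto the measurement; a doubtful one leaves it close to the model value.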
03 Optimisation is the bridge between what we know and what we observe
Variational data assimilation turns the reconstruction into an optimisation problem. A cost function measures the discrepancy between the model trajectory and the observations, then adds prior information or regularisation. The unknown initial state and parameters are adjusted iteratively to reduce that cost until an acceptable minimum is reached.
The mathematical details determine how errors are weighted and how uncertainty propagates. The core idea is a constrained search for the trajectory most consistent with both the equations and the measurements.
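One common way to write such a cost function, given here as a sketch in assumed notation rather than the formulation of any particular system, is the following. The symbols are introduced for illustration: $x_0$ is the unknown initial state, $x_b$ a prior (background) estimate, $y_i$ the observations at time $t_i$, $\mathcal{H}_i$ the observation operator, $\mathcal{M}_i$ the model step, and $B$ and $R_i$ the background and observation error covariances.

$$
J(x_0) = \tfrac{1}{2}\,(x_0 - x_b)^{\mathsf T} B^{-1} (x_0 - x_b)
\;+\; \tfrac{1}{2} \sum_{i=0}^{N} \bigl(\mathcal{H}_i(x_i) - y_i\bigr)^{\mathsf T} R_i^{-1} \bigl(\mathcal{H}_i(x_i) - y_i\bigr),
\qquad x_{i+1} = \mathcal{M}_i(x_i).
$$

The first term expresses trust in prior knowledge, the second trust in the measurements, and the constraint $x_{i+1} = \mathcal{M}_i(x_i)$ forces every candidate trajectory to obey the model.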
Simulate the system from the current estimate of the initial state.
Determine how changing the unknowns changes the mismatch.
Update the estimate, run again and continue until an acceptable solution is reached.
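The three steps above can be sketched on a toy problem. Everything specific here is an assumption chosen for illustration: a two-variable linear model `M` (a rotation), an observation operator `H` that sees only the first variable, noise-free synthetic observations, scalar error variances, and plain gradient descent with a fixed step. The gradient is obtained with a backward (adjoint) sweep, which is exact for this linear model.

```python
import numpy as np

theta = 0.3
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # toy linear model: one time step
H = np.array([[1.0, 0.0]])                         # only the first component is observed
n_steps = 20
obs_var, bkg_var = 1.0, 100.0                      # assumed error variances

x_true = np.array([1.0, 0.0])
x_b = np.array([0.5, 0.5])                         # background (prior) initial state

def run_model(x0):
    """Step 1: simulate the trajectory from an initial state."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(M @ traj[-1])
    return traj

y = [H @ x for x in run_model(x_true)]             # synthetic, noise-free observations

def cost_and_gradient(x0):
    """Step 2: the mismatch and its sensitivity to the initial state."""
    traj = run_model(x0)
    residuals = [H @ x - yk for x, yk in zip(traj, y)]
    cost = 0.5 * np.sum((x0 - x_b) ** 2) / bkg_var
    cost += 0.5 * sum(float(r @ r) for r in residuals) / obs_var
    # Backward (adjoint) sweep: carry the residuals back to time zero.
    adj = np.zeros(2)
    for r in reversed(residuals):
        adj = M.T @ adj + H.T @ r / obs_var
    grad = (x0 - x_b) / bkg_var + adj
    return cost, grad

# Step 3: update the estimate, run again, repeat.
x0 = x_b.copy()
step = 0.05                                        # fixed step size, tuned for this toy
for _ in range(60):
    cost, grad = cost_and_gradient(x0)
    x0 = x0 - step * grad

print("estimate:", x0, " truth:", x_true)
```

Although only one of the two variables is ever observed, the model dynamics link them, so the full initial state is recovered to within the small pull of the background term.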
This is why continuous optimisation, inverse problems and data assimilation belong together in an AI curriculum. The machine does not merely fit a curve. It solves a structured problem under constraints, with a computational budget and a definition of what counts as an admissible answer.
04 Back and Forth Nudging: correct, reverse, repeat
Jacques Blum and Didier Auroux introduced the Back and Forth Nudging algorithm in 2005. Standard nudging adds a feedback term to the model equations, pulling the simulated state towards observations. Their key move was to apply correction both forwards and backwards across the same assimilation window.
Start from the current estimate and integrate the physical model while nudging its trajectory towards the observations.
Use the corrected final state to integrate back through the same window, with a feedback term of the appropriate sign.
The recovered state at the beginning becomes the next initial estimate. Iterate until the reconstructed trajectory stabilises.
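In equations, one iteration $k$ of this procedure can be sketched as follows. The notation is assumed for illustration: $F$ is the model, $y$ the observations, $H$ the observation operator, $K$ and $K'$ the forward and backward gain matrices, and $[0, T]$ the assimilation window.

$$
\begin{aligned}
\text{forward:}\quad & \frac{dx_k}{dt} = F(x_k) + K\,\bigl(y - H x_k\bigr), & 0 < t < T, \quad & x_k(0) = \tilde{x}_{k-1}(0),\\[4pt]
\text{backward:}\quad & \frac{d\tilde{x}_k}{dt} = F(\tilde{x}_k) - K'\,\bigl(y - H \tilde{x}_k\bigr), & T > t > 0, \quad & \tilde{x}_k(T) = x_k(T).
\end{aligned}
$$

The feedback term changes sign in the backward equation because time runs in reverse there: the opposite sign is what makes the term pull towards the observations in both directions.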
The first paper proved convergence for a linear system. It also made the method attractive in practice: the core formulation does not require the model linearisation, adjoint construction or separate minimisation process used by 4D-Var. Later work developed the theory and tested the approach on Lorenz systems, transport equations, shallow-water and ocean models.
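The forward and backward sweeps can be tried on a small linear system. The sketch below is a toy under stated assumptions, not the authors' implementation: the model is a harmonic oscillator, only its first component is observed, the observations are noise-free and available at any time, the gain `K` is an arbitrary illustrative choice, and the integrator is a hand-written Runge-Kutta scheme.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # toy linear model dx/dt = A x
H = np.array([[1.0, 0.0]])                 # only the first component is observed
K = np.array([[1.0], [0.0]])               # nudging gain (illustrative choice)
T, n_steps = 5.0, 500
dt = T / n_steps
x_true0 = np.array([1.0, 0.0])

def truth(t):
    """Exact solution of dx/dt = A x starting from x_true0."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, s], [-s, c]]) @ x_true0

def observe(t):
    """Noise-free observation of the true trajectory at time t."""
    return H @ truth(t)

def nudged_rhs(x, t, sign):
    """Model tendency plus feedback; sign=+1 forward, sign=-1 backward."""
    return A @ x + sign * (K @ (observe(t) - H @ x))

def rk4_sweep(x, t0, h):
    """Integrate n_steps of size h (a negative h runs backward in time)."""
    sign = 1.0 if h > 0 else -1.0
    t = t0
    for _ in range(n_steps):
        k1 = nudged_rhs(x, t, sign)
        k2 = nudged_rhs(x + 0.5 * h * k1, t + 0.5 * h, sign)
        k3 = nudged_rhs(x + 0.5 * h * k2, t + 0.5 * h, sign)
        k4 = nudged_rhs(x + h * k3, t + h, sign)
        x = x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += h
    return x

x0 = np.array([-2.0, 3.0])                 # deliberately poor first guess
errors = []
for iteration in range(10):
    x_T = rk4_sweep(x0, 0.0, dt)           # forward nudging over [0, T]
    x0 = rk4_sweep(x_T, T, -dt)            # backward nudging back to t = 0
    errors.append(np.linalg.norm(x0 - x_true0))
    print(f"iteration {iteration + 1}: error at t=0 = {errors[-1]:.2e}")
```

Note what is absent: no cost function, no linearised model, no adjoint code and no separate minimiser. This oscillator is a favourable case because it is stable in both time directions; models with diffusion are harder to integrate backward.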
| Method family | Central mechanism | Strength | Engineering challenge |
|---|---|---|---|
| 4D-Var | Minimise a cost over a time window. | Globally structured variational formulation. | Adjoint development and repeated model integrations can be demanding. |
| Kalman / ensemble filters | Alternate forecast and statistical correction. | Explicit treatment of evolving uncertainty. | Covariance propagation or large ensembles can be costly. |
| BFN / DBFN | Alternate forward and backward observers. | Direct feedback, comparatively light implementation and rapid convergence in studied settings. | Backward stability, gain selection and model suitability still require mathematical care. |
The Diffusive Back and Forth Nudging extension was designed for particular diffusive models. In experiments on a two-dimensional shallow-water model and a three-dimensional primitive-equation ocean model, it stabilised backward integration and reduced the impact of noisy observations. That is a research result, not a claim that one algorithm replaces every other method.
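The idea can be sketched in assumed notation, offered as an illustration of the principle rather than a statement of the published scheme. Write the model as a non-diffusive part $F$ plus a diffusion term $\nu \Delta x$, with $y$, $H$, $K$ and $K'$ denoting observations, observation operator and gains. In the backward sweep the sign of the diffusion term is flipped along with the feedback term:

$$
\begin{aligned}
\text{forward:}\quad & \partial_t x_k = F(x_k) + \nu\,\Delta x_k + K\,\bigl(y - H x_k\bigr), & x_k(0) = \tilde{x}_{k-1}(0),\\[4pt]
\text{backward:}\quad & \partial_t \tilde{x}_k = F(\tilde{x}_k) - \nu\,\Delta \tilde{x}_k - K'\,\bigl(y - H \tilde{x}_k\bigr), & \tilde{x}_k(T) = x_k(T).
\end{aligned}
$$

Seen in reversed time, the backward equation then still diffuses rather than anti-diffuses, which is why the integration stays stable and why noise in the observations is smoothed instead of amplified.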
05 Two research lives, one culture of applied mathematics
The collaboration is especially powerful because it sits inside much broader research careers. The same mathematical language — partial differential equations, control, optimisation, numerical analysis and inverse problems — travels from plasma physics to ocean circulation, image processing, weather forecasting and industrial modelling.

Pr Jacques Blum
From the École normale supérieure and a doctorate under Jacques-Louis Lions, through CNRS research and professorships at Grenoble, École Polytechnique and Nice, Jacques Blum built a career around the simulation, identification and optimal control of physical systems governed by partial differential equations.
His work spans tokamak plasma equilibrium, real-time reconstruction, ocean circulation and data assimilation. Even the 2017 version of his CV records a research and teaching trajectory of remarkable breadth. At DSTI, he is a member of the Scientific Advisory Board and helped shape the school's approach to mathematical support across the student body.
Pr Didier Auroux
Didier Auroux trained at the École normale supérieure de Lyon, completed a doctorate on data assimilation for environmental problems and an habilitation on fast algorithms for image processing and data assimilation. His research joins geophysics, observers, optimal control, inverse problems, numerical analysis and scientific computing.
He now directs Université Côte d'Azur's Maison de la Modélisation, de la Simulation et des Interactions, a structure that supports research through modelling, simulation, high-performance computing and data science.
06 Excellence matters most when students can reach it
At DSTI, the point of bringing distinguished mathematicians into the classroom is not to decorate a faculty list. It is to let students encounter the habits of mind behind serious modelling: define the state, expose assumptions, formulate the objective, identify what is observable, and understand the numerical consequences.
Jacques Blum teaches the mathematical foundations in the Warm Up of every DSTI data MSc programme, working with cohorts whose prior mathematical preparation can vary widely.
Jacques Blum and Didier Auroux jointly teach the mathematical language needed to reason about data science rather than merely operate its tools.
Students learn how objectives, gradients, constraints and algorithms turn a mathematical problem into a computable solution.
Jacques proposed the creation of DSTI's Support Sessions, in the spirit of the recitation classes used at Ivy League and leading Californian universities. Didier regularly leads support sessions for mathematics-driven modules. The standard is high, and students are given additional structured teaching to help them reach it.
Their research field enters the curriculum directly: reconstructing hidden states and parameters from models and incomplete observations.
Didier teaches Mathematics Harmonisation with Dr Christine Malot, helping students establish a shared mathematical foundation before progressing to later quantitative work.
Jacques teaches the physics component, connecting computation to the physical systems, energy limits and environmental questions it affects.
Jacques and Didier are particularly attached to teaching across the full student population, including learners far from their own research level. That matters. Mathematical confidence is not created by lowering the intellectual ceiling; it is created by building a reliable route towards it.
07 DSTI's position: hybrid intelligence before fashionable uniformity
Do not force the learner to rediscover what the domain already knows.
When reliable physical laws, constraints, taxonomies or relationships exist, represent them. Use data-driven learning for the residual uncertainty, unknown parameters, unresolved scales and patterns the explicit model cannot provide. Intelligence lies in the combination.
Conservation laws, differential equations, causal constraints and domain knowledge are information. Discarding them is not neutrality; it is a design decision.
Data are invaluable where parameters are uncertain, models are incomplete, sub-grid effects are unresolved or patterns cannot be specified analytically.
The difficult work is deciding how model error, observation error and learned components interact — and validating the resulting system.
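A small sketch makes the contrast concrete. It is an illustration under assumptions, not a benchmark: the known law is free fall, $d = \tfrac{1}{2} g t^2$, the data are synthetic, the only thing learned in the structured approach is the parameter $g$, and the generic learner is a degree-5 polynomial.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 15)
g_true = 9.81
d_obs = 0.5 * g_true * t**2 + rng.normal(scale=0.05, size=t.size)  # synthetic data

# Known structure: d = 0.5 * g * t^2. Only g is learned from the data.
basis = 0.5 * t**2
g_hat = float(basis @ d_obs / (basis @ basis))   # one-parameter least squares

# Generic learner with no physical structure: a degree-5 polynomial.
poly = np.polynomial.Polynomial.fit(t, d_obs, deg=5)

t_new = 4.0                                       # outside the observed range
print("truth     :", 0.5 * g_true * t_new**2)
print("structured:", 0.5 * g_hat * t_new**2)
print("polynomial:", poly(t_new))
```

Both approaches fit the observed range well. Outside it, the structured estimate is constrained by the law it encodes, while nothing ties the polynomial to the physics.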
Data assimilation
Combine a dynamical model with observations so the reconstructed state respects both the evidence and the laws governing evolution.
Semantic Web
Represent known entities and relationships explicitly rather than asking every downstream system to infer them repeatedly from unstructured data.
The analogy is an engineering principle, not a claim that the mathematics is identical. In both cases, explicit knowledge and learning are complementary. Pr Fabien Gandon's teaching of Semantic Web technologies and the data-assimilation work of Jacques Blum and Didier Auroux point towards the same educational discipline: know what you know, learn what you do not, and make the boundary inspectable.
This also changes how efficiency is taught. A smaller, structured method can sometimes be preferable to a larger generic learner: less data movement, less training, stronger physical consistency and a clearer explanation of failure. Sometimes the learned model is the right answer. Sometimes it is one component inside a larger mathematical system.
08 The research trail behind the classroom
The article is grounded in a sequence of publications that traces the work from the introduction of an algorithm and its convergence proof through numerical comparison, theoretical development and geophysical applications.
The founding note introduces BFN and proves convergence for a linear ordinary differential equation system.
A fuller development and numerical study of the method in oceanographic data assimilation.
An extension designed to manage diffusion in backward integration.
Tests on shallow-water and full ocean models, including the method's behaviour with observation noise.
The lesson students should keep
AI is not a single class of models. It is the disciplined construction of systems that infer, optimise and act under uncertainty. Sometimes the model should be learned from data. Sometimes data should correct a model we already have. Knowing the difference is part of becoming an engineer who understands the scientific foundations.