bioIntro.tex 17.8 KB
\section{Biogeochemical models \label{sec:bio-intro}}


In this module, biogeochemical models are implemented,
in a two-way coupled mode.

\subsection{Mathematical formulation}\label{sec:bio-math}

The general structure of a biogeochemical model with
$I$ state variables expressed as ensemble averaged
concentrations is given by the
following set of equations:
\begin{equation}\label{FullSystem}
  \partial_t c_i + \partial_z
\left(m_i c_i - K_V \partial_z c_i\right)
=   P_i(\vec{c}) -D_i(\vec{c}), \;\; i = 1,\ldots,I, \quad j,k = 1,\ldots,3,
\end{equation}
with $c_i$ denoting the concentrations of state variables.
Furthermore, $m_i$ represents the autonomous motion of the ecosystem
component $c_i$ (e.g. sinking or active swimming),
and $K_V$ represents the eddy diffusivity.
The source and sink terms of the ecosystem component
$c_i$ are summarised in $P_i(\vec{c})$ and $D_i(\vec{c})$, respectively.
For three-dimensional models, advection with the flow field
and horizontal advection would have to be accounted for additionally.
In many biogeochemical models, some of the state variables have positive
lower limits. In order to account for this,
we defined all state variables as the difference between
the actual value and their lower limit, such that
(for non-negative state variables only)
the model value $c_i$ represents a concentration of
$c_i+c_i^{\min}$ where $c_i^{\min}$ is the lower limit of $c_i$.

The gradient term on the left hand side of (\ref{FullSystem})
is the total transport,
for which typically surface and bottom boundary conditions
\begin{equation}
K_V\partial_z c_i\big|_{z=\eta} = F^s_i,\qquad
K_V\partial_z c_i\big|_{z=-H} = -F^b_i,
\end{equation}
with surface and bottom fluxes, $F^s_i$ and $F^b_i$, respectively,
are applied.
The right hand side denotes the reaction terms,
which are composed of contributions
$d_{i,j}(\vec{c})$, which represent reactive fluxes from
$c_i$ to $c_j$, and in turn, $p_{i,j}(\vec{c})$ are reactive fluxes from
$c_j$ received by $c_i$:
\begin{equation}\label{eq:am:a}
P_i(\vec{c}) = \sum^I_{j=1} p_{i,j}(\vec{c}), \;\;\;\;
D_i(\vec{c}) = \sum^I_{j=1} d_{i,j}(\vec{c}),
\end{equation}
with $d_{i,j}\geq 0$ for all $i,j$ and $p_{i,j}\geq 0$
for all $i\not= j$.

We basically consider two types of ecosystem models.
In the simple NPZ (nutrient-phytoplankton-zooplankton) type models
all state variables are based on the same measurable unit
such as [mmol N m$^{-3}$] for nitrogen-based models.
In such NPZ models the reactive terms do only exchange mass
between state variables with
\begin{equation}\label{eq:am:symmetry}
p_{i,j}(\vec{c}) = d_{j,i}(\vec{c}), \quad \mbox{ for } i \not= j
\quad\mbox{ and }\quad
p_{i,i}(\vec{c}) = d_{i,i}(\vec{c})=0, \quad \mbox{ for } i = j.
\end{equation}
Neglecting for a moment all transport terms, it is easily seen that this
simple type of
model is conserving mass:
\begin{equation}
\begin{array}{l}
\displaystyle
d_t\left(\sum_{i=1}^I c_i    \right) =
\sum_{i=1}^I\left( P_i(\vec{c})- D_i(\vec{c})\right) =
\displaystyle
\sum_{i=1}^I\sum_{j=1}^I\left(p_{i,j}(\vec{c}) - d_{i,j}(\vec{c})\right)
= \sum_{i=1}^I\left(p_{i,i}(\vec{c}) - d_{i,i}(\vec{c})\right)=0.
\end{array}
\end{equation}
The NPZD model (see section \ref{sec:bio-npzd}) and the
\cite{Fashametal1990} model discussed in section \ref{sec:bio-fasham}
are such fully conservative models.

In many biogeochemical models most state variables are known to be positive
or at least non-negative quantities. For non-negative initial conditions
$c_i(0) \geq 0$ one can easily show by a simple
contradiction argument that the condition
\begin{equation}\label{eq:am:d1}
d_{i,j}(\vec{c}) \longrightarrow 0 \;\; \mbox{for} \;\; c_i \longrightarrow 0
\end{equation}
guarantees that the quantities
$c_i(t) \geq 0$, remain non-negative for all $t$.
A typical example is $d_{i,j}(\vec{c}) = f(\vec{c}) c_i$ with a non-negative,
bounded
function $f$ which might depend on all $c_i$.

However, for many applications such simple models are
too restrictive. Often different spatial references
are involved for the state variables, such as the
detritus concentration in the water column, measured in [mmol N m$^{-3}$]
and the fluff layer concentration at the bed, measured in [mmol N m$^{-2}$].
Many biogeochemical processes
involve more than two substances such as the photosynthesis
where different nutrients (e.g.\ nitrate and phosphorus) are taken
up by phytoplankta and oxygen is produced. The ratios between these
substances dissipated or produced are usually fixed, in the example
of photosynthesis uptake of 16 mmol m$^{-3}$ nitrate is
connected to an uptake of 1 mmol m$^{-3}$ phosphorus
and a production of 8.125 mmol m$^{-3}$ oxygen.

For state variables which may
be negative (such as oxygen concentration
which also includes oxygen demand units,
all sink and source
terms are added up in the production terms
$p_{i,j}$, with a negative sign for the sink terms.
For the \cite{Neumannetal2002} model discussed in sections
\ref{sec:bio-iow},
further deviations from the conservation formulation are introduced since
biogeochemical reactions include substances which are not budgeted by the
model (mostly because they are assumed to be not limiting).
One typical example is nitrogen fixated by blue-green algae which
builds up biomass by using atmospheric nitrogen which is later recycled to
nitrate. Such non-conservative terms are lumped into the diagonal terms
$p_{i,i}$ and $d_{i,i}$.

\subsection{Numerical aspects}\label{sec:bio-num}

Two basic aspects which are included in the mathematical formulation for
the biogeochemical equations discussed in section \ref{sec:bio-math}
are to be reproduced by the numerical methods applied: conservation and
positivity. Another constraint for the choice
of numerical methods is that they should be sufficiently stable and accurate.
In order to facilitate this, a split method is applied
separating the numerical solution of the transport part
(advection, diffusion) and the reaction part.
By doing so, we take splitting errors into account which should however
be not significant as long as the typical reaction time scales are
much longer than the constant model time step $\Delta t$.

In the transport step
in which the right hand side is set to zero,
finite volume discretisations
are used such that conservation of mass is guaranteed.
The spatial discretisation is carried out by separating the water column
into $N$ not necessarily equidistant intervals of height $h_k$.
The state variables, represented by layer-averaged values,
are located in the centres of these intervals, the advective and diffusive
fluxes are located at the interfaces in between.
The transport step itself is subject to operator splitting.
The autonomous motion of the state variables (including sinking or
rising due to negative or positive buoyancy, respectively)
is discretised by means of
TVD (Total Variation
Diminishing) advection schemes, for which several choices are
available, see \cite{Pietrzak98}.
These TVD schemes are positivity conserving due to their TVD
property.
The most accurate among those
schemes is the so-called
PDM-limited P$_2$ scheme which has been described in detail
by \cite{Leonhard91}.

For the diffusion, a
central in space scheme is used which is slightly biased towards
a backward in time scheme in order to avoid asymptotic instability
(see \cite{Samarskij84}). By doing so, positivity is obtained and the schemes
are practically second order in time and space.

With the discretisations of the transport terms given above,
accuracy, positivity and conservation of the state variables are
guaranteed by means of standard schemes. For the reaction terms,
\cite{Burchardetal2003b} recently developed schemes which also fulfil these
requirements.
Due to the operator split between transport and reaction terms,
only ordinary differential equations (ODEs)
have to be treated numerically for the latter terms.
For the case of conservative biogeochemical models with
$p_{i,i}=d_{i,i}=0$, these schemes are identical to those given
by \cite{Burchardetal2003b}.
For $p_{i,i}\not= 0$, some modifications are necessary.
Three classes of schemes are considered:
Explicit schemes such as the Euler-forward scheme and second- and
fourth-order Runge-Kutta schemes (see section \ref{sec:ode-solver}).
These schemes are known to be conservative, but for
sufficiently large time steps they may compute negative values
of state variables also for non-negative state variables.
This may be avoided by small time stepping, which however usually
leads to an enormous increase of the computational effort such that
these schemes lose their practical relevance in this context.
In order to solve this problem,
\cite{Patankar80} had suggested the first-order in time
positive definite scheme
(\ref{eq:am:patankar}), and \cite{Burchardetal2003b} have extended this
to second order, see (\ref{eq:am:PRK}).
However, these schemes are not conservative, since
source and sink terms are numerically treated in a different way.
Fully conservative and non-negative schemes in first- and second-oder
in time have thus been suggested and tested for
ordinary differential equations by \cite{Burchardetal2003b},
with $p_{i,i}=d_{i,i}=0$ in equations.\ (\ref{eq:am:MP}) and (\ref{eq:am:MPRK})
in section \ref{sec:ode-solver}.
This equal numerical treatment of sources and sinks
results in implicit linear systems of equations.
Since only ordinary differential equations are to be solved in each
grid point, these systems have small dimensions, for example $I=7$ for
the \cite{Fashametal1990} model (see section \ref{sec:bio-fasham}) and
$I=10$ for the \cite{Neumannetal2002} model 
(see section \ref{sec:bio-iow}).
Thus, these linear systems may be directly solved
by Gaussian elimination schemes.
Nevertheless, one can also employ iterative methods.
Especially for the linear system arising in the context of the present
type of equations it is proven in \cite{Burchardetal2003b}
that the involved matrix is always non-singular and the
standard Jacobi-type method converge to the unique solution of the system.
Later, \cite{Bruggemanetal2005} found that the Modified Patankar
schemes as described in equations (\ref{eq:am:MP}) and (\ref{eq:am:MPRK})
are only conservative for systems with one model currency
(e.g.\ nitrogen in the model of \cite{Fashametal1990}),
but do not conserve stoichiometric ratios, when several
limiting nutrients are present. To solve that problem,
\cite{Bruggemanetal2005} developed first- and second-order
Extended Modified Patankar (EMP) schemes, which are
stoiciometrically conservative and explicit, such that they do
not need to solve implicit systems of linear equations. 


\subsection{Computational aspects}\label{sec:bio-comp}

The computational structure of the coupled physical-biogeochemical
model system implemented here
has been designed under consideration of various objectives. 
In this section  
a description of the various design related decisions is given.
The major objectives are:
\begin{itemize}
  \item to provide a well-defined interface between the one-dimensional
   physical model and a sufficiently generic biogeochemical model,
  \item to allow for easy extensions of the system with new 
     biogeochemical models without changing 
  the over-all structure, such as the \cite{Fashametal1990} model
  (see section \ref{sec:bio-fasham}) and the \cite{Neumannetal2002} model, 
  see section \ref{sec:bio-iow}),
  \item to provide a number of solution methods for the 'process part' of the
  biogeochemical model, i.e.\ solvers for ordinary differential equations
  as discussed in section \ref{sec:bio-num} and section \ref{sec:ode-solver},
  \item to obtain fast and efficient execution of the coupled model,
  \item to design the system in such a way that three-dimensional 
   models can easily be interfaced with it.
\end{itemize}
For the biogeochemical model,
we have adopted the same design strategy as has been used for the turbulence 
module of GOTM. The interface between an application using the turbulence
module consists of two subroutine calls only: \emph{init\_turbulence()} and
\emph{do\_turbulence()}. The subroutine \emph{init\_turbulence()} is
responsible for initialising the parameters of the turbulence module and  
called as part of the 
initialisation of the entire model. For the turbulence module the 
initialisation includes reading \emph{namelists} with the model configuration, 
allocating memory for all necessary variables and initialising 
these variables with sensible values. \emph{init\_turbulence()} should only be
called once during program execution and after this call all public and private
variables of the turbulence module should be in a 
consistently initialised state.
During the time integration of the model, \emph{do\_turbulence()} has to be
called at each time step. It is called with a number
of parameters to transfer information e.g.\ from the mean-flow module 
to the turbulence module but also to receive the variables updated
by the turbulence module.

Using the same strategy for the biogeochemical module 
has some problematic implications which are described here.
In analogy to the turbulence
module, the interface is given via the two subroutines \emph{init\_bio()} and 
\emph{do\_bio()}.
The major difference between the turbulence module and the 
biogeochemical module
in terms of implementation
is that in the former the number of variables are known at compilation time
and the dimensions are specified at run time where as in the latter both 
the number of state variables and their dimension are known only at run time.
The general interface has to be able to handle not only the different 
biogeochemical models implemented at present but also to provide a framework
for developing future models. There are two major
items to address: 1.\ how to initialise the biogeochemical model and 2.\ how to 
select the right biogeochemical model during the time integration and use the 
selected ordinary differential equation (ODE) solver.

To solve the initialisation problem we have chosen a two-level initialisation
approach.  At the first level variables not specific to any of the 
biogeochemical 
models are initialised. The single most important variable during this 
phase is \emph{bio\_model}, which contains the identification number for all
implemented biogeochemical models. 
Depending on the value of \emph{bio\_model}, the second level
of initialisation is started. At this level all model specific variables
(such as process rates) are initialised.
The most important variable at the second
level is $I$ (number of state variables). 
After this step, the system returns to the first level, and
now all information is available for allocating memory and initialising 
all variables. The most important data structure provided to the individual 
biogeochemical models will briefly be mentioned here. $c_{i,k}$ 
with $1\leq i\leq I$ and $1\leq k\leq N$ (number of vertical layers)
is a two-dimensional array containing the concentrations of each 
variable at each depth. $I$ is  
supplied by the individual biogeochemical model and $N$
is transferred in the call to \emph{init\_bio()} from the physical model. 

After the initialisation, all variables are initialised in a common
data structure where the only link to the specific model is via 
\emph{bio\_model} and $I$. The next step is to design the actual
time integration in such a way that selected biogeochemical model 
operates on the 
common data structure using the selected ODE solver in a transparent way.

\begin{figure}
\begin{center}
\scalebox{0.5}{\includegraphics{figures/structure_gotmbio.eps}}
\caption{
The structure of the \emph{do\_bio()} subroutine. This subroutine is responsible
for updating all variables in the biogeochemical model in question at each time-step.
\emph{do\_bio()} essentially works as a wrapper around all biogeochemical models
implemented.
The hatched arrows from \emph{process model} to \emph{ODE solvers}
indicate that between one and four calls 
of  \emph{process model}
per time step are performed, depending on the order of the chosen ODE solver.  
}\label{fig_bio_code}
\end{center}
\end{figure}

Figure \ref{fig_bio_code} shows a sketch of how this is organised in the
model source code. 
The sketch should be be read from left to right. At the left side we have 
the interface \emph{do\_bio()}, which is the only connection to the
calling program. The next level shows a sequence of steps necessary to do the
time integration. It should be noted that not all biogeochemical 
models necessarily
have to execute all the steps, some models do e.g.\
not need any surface fluxes or short-wave radiation.
For the diffusion/advection part a general subroutine is called
which is also used by 
the physical model. 
After having calculated $I_{PAR}$ and $B$ (see eqs.\ (\ref{Iz}) and
(\ref{B}),
the next step is the step at which the production and destruction terms
($p_{i,j,k}$ and $d_{i,j,k}$)
of the biogeochemical models are calculated. This is done via a call to 
\emph{ode\_solver()}. After the call to \emph{ode\_solver()}, $c_{i,k}$
has been updated with the new values of all variables in the
biogeochemical model. Which ODE solver to use is determined during the 
initialisation phase (\emph{ode\_method}).
It should be noted
that for some of the solution methods the biogeochemical processes have to 
be evaluated more than once. Instead of having \emph{ode\_solver()} being 
responsible for calling the chosen biogeochemical model,
an additional subroutine
has been introduced: \emph{process\_model}, is a simple
wrapper routine calling the selected biogeochemical model.

The implementation of this biogeochemical module into three-dimensional models
is straight-forward. The 3D model has to take care of storing all
three-dimensional state variables and calculate their advection with the
mean flow and the horizontal diffusion.
Settling, migration, vertical diffusion and the production/destruction
processes are calculated by the biogeochemical module which has to be
called by means of a loop over all horizontal grid boxes of the 3D
model.



This text has been adapted from \cite{Burchardetal05}.