Experimental Design for a Dynamical Model#

Part of a series: Uncertainty Quantification for a Dynamical Model.

In virtually every field of research, experiments are carried out to collect data and to better understand underlying processes. Such experiments, however, usually require a substantial amount of resources. In order to minimize the use of resources and at the same time ensure that the measured observations are informative, an analysis of the experimental design is essential. Typically, a design encompasses different components that can be adjusted by researchers. In systems biology, for example, applicable designs are commonly characterized by the choice of observations, measurement times and external perturbations [Kreutz and Timmer, 2009]. Moreover, randomization and replication play a key role in all experimental setups, not only in a biological context.

This article introduces the design of experiments for a dynamical model described by ordinary differential equations (ODEs). From a theoretical point of view, experimental design analysis may help to overcome parameter identifiability issues. In particular, the uncertainty of inferred model parameters may be reduced by using data obtained by an optimized design. The focus of this article lies on optimality criteria based on the Fisher information matrix of model parameters. Comprehensive overviews of this approach are given in the standard works of [Atkinson et al., 2007, Pukelsheim, 2006].

Within this article, we adopt the notation from the preceding articles. To provide a common basis, we recall that we consider a system of ODEs

\[ \begin{align*} \frac{\mathrm{d} x(t)}{\mathrm{d} t}=f(t,x(t),\theta,u(t)),\quad x(0)=x_0(\theta)\in\mathbb{R}^N, \end{align*} \]

describing the dynamics of a state variable \(x\). For \(l=1,\dots,L\), the observables are assumed to obey

\[ \begin{align} y_l(t)=g_l(x(t),\theta)+\varepsilon_l(t,\theta),\quad\text{or}\quad y_l(t)=g_l(x(t),\theta)\cdot\varepsilon_l(t,\theta), \tag{8} \end{align} \]

where the random variables \(\varepsilon_l\) represent measurement noise. The \(N_{\theta}\) model parameters are summarized in the parameter vector \(\theta\in\Theta\), where \(\Theta\subset\mathbb{R}^{N_{\theta}}\) denotes the parameter space.
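To make the notation concrete, the following minimal sketch simulates a hypothetical one-state instance of this setting: \(\mathrm{d}x/\mathrm{d}t=-\theta_1 x\), \(x(0)=\theta_2\), with observable \(g(x,\theta)=x\), no external input \(u\), and additive Gaussian noise. The toy model, the parameter values and all function names are assumptions chosen for illustration; they are not taken from the article series.

```python
import math
import random


def rhs(t, x, theta):
    """Right-hand side f(t, x, theta) of the toy ODE dx/dt = -theta_1 * x."""
    return -theta[0] * x


def solve_rk4(theta, t_end, n_steps=1000):
    """Integrate the toy ODE from t = 0 to t_end with the classical Runge-Kutta scheme."""
    x = theta[1]  # initial condition x(0) = x_0(theta) = theta_2
    h = t_end / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1 = rhs(t, x, theta)
        k2 = rhs(t + h / 2, x + h / 2 * k1, theta)
        k3 = rhs(t + h / 2, x + h / 2 * k2, theta)
        k4 = rhs(t + h, x + h * k3, theta)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x


def observe(theta, times, sigma, rng):
    """Additive-noise observations y(t_k) = g(x(t_k), theta) + eps with g(x, theta) = x."""
    return [solve_rk4(theta, t) + rng.gauss(0.0, sigma) for t in times]


theta = (0.5, 2.0)            # assumed "true" parameter vector (theta_1, theta_2)
times = [0.5, 1.0, 2.0, 4.0]  # assumed measurement times t_1, ..., t_K
y = observe(theta, times, sigma=0.1, rng=random.Random(0))
print(y)
```

The measurement times passed to `observe` are exactly the design variables whose choice is discussed in the remainder of this article.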

Experimental Design and Parameter Identifiability#

The preceding article of this series sheds light on two notions of parameter identifiability, namely structural and practical identifiability. A parameter is structurally non-identifiable if different values of this parameter yield exactly the same model output. This phenomenon stems from the model structure and is therefore usually unrelated to a specific measurement set. In most cases, a model re-parameterisation can eliminate these non-identifiabilities [Wieland et al., 2021]. But even if a parameter is structurally identifiable, insufficient data might make it difficult to infer the parameter value reliably. In this case, the parameter is called practically non-identifiable. In order to resolve practical non-identifiabilities, the experimental design can be improved to generate more informative observations.
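As a hypothetical illustration (a toy model assumed here, not one from the cited works), consider a single state with two rate parameters that enter only through their product,

\[ \begin{align*} \frac{\mathrm{d} x(t)}{\mathrm{d} t}=-\theta_1\theta_2\,x(t),\quad x(0)=1,\quad\text{so that}\quad x(t)=\exp(-\theta_1\theta_2 t). \end{align*} \]

Any two parameter vectors with the same product \(\theta_1\theta_2\) produce identical trajectories, so neither parameter is structurally identifiable, regardless of how many measurements are taken. The re-parameterisation \(\kappa=\theta_1\theta_2\) removes the problem, since \(\kappa\) alone determines the output. By contrast, \(\kappa\) may still be practically non-identifiable if, for instance, all measurements are taken so late that \(x(t)\) has decayed below the noise level. Only this second kind of problem can be addressed by a better experimental design.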

Fisher Information Matrix#

Common experimental design approaches aim to optimize criteria derived from the Fisher information matrix (FIM). In general, the FIM \(I(\beta)\in\mathbb{R}^{N_{\beta}\times N_{\beta}}\) for a vector of unknown parameters \(\beta\in\mathbb{R}^{N_{\beta}}\) of a random variable \(X\) is defined by

\[ \begin{align} I(\beta)_{ij}=\mathbb{E}\left[\left(\frac{\partial}{\partial\beta_i}\ln f_{\beta}(X)\right)\left(\frac{\partial}{\partial\beta_j}\ln f_{\beta}(X)\right)\right], \tag{9} \end{align} \]

where \(I(\beta)=(I(\beta)_{ij})_{i,j=1,\dots,N_{\beta}}\) and \(f_{\beta}\) denotes the density function of \(X\). We omit regularity conditions that are typically imposed on \(f_{\beta}\) to facilitate computations with \(I\) [Czado and Schmidt, 2011]. The Fisher information reflects the information content that an observable random variable \(X\) contains about the unknown parameter vector \(\beta\). This interpretation becomes more intuitive when taking into account that, by the Cramér-Rao inequality, the inverse \(I(\beta)^{-1}\) of the FIM is a lower bound for the covariance matrix of any unbiased estimator of \(\beta\), and that it approximates the covariance matrix of the maximum likelihood estimator for large samples. Thus, \(I(\beta)^{-1}\) can be used to derive approximate confidence intervals for \(\beta\). A narrow confidence interval, in turn, indicates that an experiment is informative for estimating a parameter value [Faller et al., 2003].
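The definition in Equation (9) can be checked numerically in the simplest scalar case. The sketch below assumes \(X\sim\mathcal{N}(\mu,\sigma^2)\) with known \(\sigma\) and unknown \(\beta=\mu\), for which the Fisher information has the closed form \(1/\sigma^2\), and estimates the expectation of the squared score by Monte Carlo sampling. The function names, the sample size and the parameter values are illustrative assumptions.

```python
import random


def score_mean(x, mu, sigma):
    """Derivative of ln f_beta(x) with respect to mu for the normal density N(mu, sigma^2)."""
    return (x - mu) / sigma**2


def fisher_information_mc(mu, sigma, n_samples, rng):
    """Monte Carlo estimate of E[(d/dmu ln f(X))^2] for X ~ N(mu, sigma^2)."""
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(mu, sigma)
        total += score_mean(x, mu, sigma) ** 2
    return total / n_samples


sigma = 0.5
estimate = fisher_information_mc(mu=1.0, sigma=sigma, n_samples=200_000, rng=random.Random(1))
exact = 1.0 / sigma**2  # closed form for the mean of a normal distribution
print(estimate, exact)
```

A smaller \(\sigma\) gives a larger Fisher information and, correspondingly, a smaller lower bound \(\sigma^2\) on the variance of an unbiased estimator of \(\mu\) from a single observation.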

Within this article series, we are mainly interested in design criteria for the dynamical model introduced above. In this case, the FIM can be formulated explicitly in terms of the model components. We consider the observables \(l=1,\dots,L\) measured at time points \(t_1,\dots,t_K>0\). If the observational noise in Equation (8) is assumed to be additive, independent and normally distributed with standard deviations that do not depend on \(\theta\), then the FIM from Equation (9) reduces to

\[ \begin{align*} I(\theta)_{ij}=\sum_{k=1}^K\sum_{l=1}^L\frac{1}{\sigma_{kl}^2}\frac{\partial g_l(x(t_k),\hat{\theta})}{\partial\theta_i}\frac{\partial g_l(x(t_k),\hat{\theta})}{\partial\theta_j}, \end{align*} \]

where \(I(\theta)=(I(\theta)_{ij})_{i,j=1,\dots,N_{\theta}}\in\mathbb{R}^{N_{\theta}\times N_{\theta}}\) and \(\sigma_{kl}\) denotes the standard deviation of the normally distributed error of observable \(l\) at time \(t_k\). Here, \(\hat{\theta}\) is a given (estimated) vector of parameter values at which the derivatives are evaluated [Kreutz and Timmer, 2009]. Since the state \(x(t_k)\) itself depends on \(\theta\), the derivatives are to be understood as sensitivities of the model output with respect to the parameters, including the dependence through \(x\).
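As a minimal sketch, the FIM can be evaluated for a hypothetical toy model with output \(g(x(t),\theta)=\theta_2\exp(-\theta_1 t)\), that is, exponential decay with rate \(\theta_1\) and initial value \(\theta_2\). The code uses the standard form of the FIM for additive Gaussian noise, namely products of first-order output sensitivities weighted by \(1/\sigma^2\), with one observable and a constant noise level. The model, the parameter values and the function names are assumptions made for illustration.

```python
import math


def sensitivities(theta, t):
    """Derivatives of g(x(t), theta) = theta_2 * exp(-theta_1 * t) w.r.t. theta_1 and theta_2."""
    decay = math.exp(-theta[0] * t)
    return (-t * theta[1] * decay, decay)


def fisher_information(theta_hat, times, sigma):
    """2x2 FIM: sum over time points of sensitivity products, weighted by 1 / sigma^2."""
    fim = [[0.0, 0.0], [0.0, 0.0]]
    for t in times:
        s = sensitivities(theta_hat, t)
        for i in range(2):
            for j in range(2):
                fim[i][j] += s[i] * s[j] / sigma**2
    return fim


def determinant(fim):
    """Determinant of a 2x2 matrix given as nested lists."""
    return fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]


theta_hat = (0.5, 2.0)  # assumed parameter estimate
fim_four = fisher_information(theta_hat, times=[0.5, 1.0, 2.0, 4.0], sigma=0.1)
fim_one = fisher_information(theta_hat, times=[1.0], sigma=0.1)
print(determinant(fim_four), determinant(fim_one))
```

With a single measurement time the FIM has rank one, so its determinant vanishes up to rounding: one observation cannot determine two parameters. This is the matrix counterpart of a non-identifiability caused by too little data.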

Optimality Criteria#

In the previous section, we discussed how the FIM is related to the amount of information provided by an experiment. Now, we introduce three criteria that condense the information contained in the FIM into a single number. Each of them requires that all model parameters contained in \(\theta\) are identifiable. These criteria can be optimized to construct informative designs. We restrict ourselves to the case that we want to optimize the temporal allocation of observations at time points \(t_1,\dots,t_K\). By parameterizing the FIM with respect to the time points \(t_k\), the objective function can then be written as

\[ \begin{align*} z(t_1,\dots,t_K)=\Phi(I(t_1,\dots,t_K)), \end{align*} \]

where the concrete form of \(\Phi\) depends on the chosen design criterion [Kutalik et al., 2004]. More general approaches also include in the optimization external stimuli that can be controlled by experimenters [Kreutz and Timmer, 2009, Balsa-Canto et al., 2008, Banga and Balsa-Canto, 2008].

D-Criterion#

The D-criterion aims to maximize the determinant of the FIM, i.e.

\[ \begin{align*} \max_{t_1,\dots,t_K}z(t_1,\dots,t_K)=\max_{t_1,\dots,t_K}\det(I(t_1,\dots,t_K)). \end{align*} \]

A D-optimal design yields parameter estimates whose confidence region exhibits the smallest volume [Balsa-Canto et al., 2008]. Moreover, in contrast to the other criteria, the D-optimal design is invariant to a re-scaling of the parameters: re-scaling multiplies the determinant by a factor that does not depend on the design. Hence, parameters with relatively large values have no greater impact on the criterion than others [Atkinson et al., 2007].
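The scale invariance can be illustrated with a small sketch. It assumes a hypothetical toy model with output \(\theta_2\exp(-\theta_1 t)\) and compares two candidate designs with two time points each. Expressing the parameters in other units changes both determinants by the same factor, so their ratio, and hence the ranking of the designs, is unchanged. The model, the candidate designs and the function names are illustrative assumptions.

```python
import math


def fim_entries(theta, times, sigma=0.1, scale=(1.0, 1.0)):
    """Entries (I_11, I_12, I_22) of the 2x2 FIM for y = theta_2 * exp(-theta_1 * t).

    `scale` expresses the parameters in other units: the FIM is computed for
    the rescaled parameters (scale[0] * theta_1, scale[1] * theta_2).
    """
    a = b = d = 0.0
    for t in times:
        decay = math.exp(-theta[0] * t)
        s1 = -t * theta[1] * decay / scale[0]  # sensitivity w.r.t. scale[0] * theta_1
        s2 = decay / scale[1]                  # sensitivity w.r.t. scale[1] * theta_2
        a += s1 * s1 / sigma**2
        b += s1 * s2 / sigma**2
        d += s2 * s2 / sigma**2
    return a, b, d


def d_value(entries):
    """Determinant of the symmetric 2x2 FIM, i.e. the D-criterion value."""
    a, b, d = entries
    return a * d - b * b


theta_hat = (0.5, 2.0)                        # assumed parameter estimate
clustered, spread = [0.5, 0.6], [0.5, 3.0]    # two candidate designs
units = (1000.0, 0.01)                        # arbitrary change of parameter units

ratio = d_value(fim_entries(theta_hat, spread)) / d_value(fim_entries(theta_hat, clustered))
ratio_rescaled = (
    d_value(fim_entries(theta_hat, spread, scale=units))
    / d_value(fim_entries(theta_hat, clustered, scale=units))
)
print(ratio, ratio_rescaled)
```

In this toy setting the design with well separated time points has the larger determinant, and this holds irrespective of the units in which the parameters are expressed.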

E-Criterion#

The E-criterion targets the direction in parameter space with the largest estimation error and generates a design that minimizes this error. Formally, the minimum eigenvalue of the FIM is maximized, i.e.

\[ \begin{align*} \max_{t_1,\dots,t_K}z(t_1,\dots,t_K)=\max_{t_1,\dots,t_K}\lambda_{\text{min}}(I(t_1,\dots,t_K)), \end{align*} \]

where \(\lambda_{\text{min}}(I(t_1,\dots,t_K))\) denotes the smallest eigenvalue of the FIM [Balsa-Canto et al., 2008].

A-Criterion#

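The A-criterion minimizes the average variance of the parameter estimates. Since these variances are approximated by the diagonal entries of the inverse FIM, this amounts to minimizing the trace of the inverse FIM, i.e.

\[ \begin{align*} \min_{t_1,\dots,t_K}z(t_1,\dots,t_K)=\min_{t_1,\dots,t_K}\text{tr}\left(I(t_1,\dots,t_K)^{-1}\right). \end{align*} \]

A computationally cheaper variant, often called the modified A-criterion, maximizes the trace of the FIM itself. It avoids the matrix inversion but does not control the average variance directly.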

The criteria stated above are arguably the most prominent design criteria. Extensions and additional criteria can be found, for example, in [Atkinson et al., 2007]. In general, the choice of a specific criterion is problem-dependent. In practice, it is advisable to take several criteria into account. Furthermore, optimum experimental design can be tackled without using FIM-based criteria. Other approaches use, for example, the mean squared error of a parameter estimator [Chung and Haber, 2012].
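Taking several criteria into account can be as simple as tabulating them for each candidate design. The following sketch does this for a hypothetical toy model with output \(\theta_2\exp(-\theta_1 t)\), using closed-form expressions for symmetric \(2\times 2\) matrices. The A-value is taken here as \(\text{tr}(I^{-1})\), the sum of the approximate parameter variances, so smaller is better for A while larger is better for D and E. The model, the candidate designs and the function names are illustrative assumptions.

```python
import math


def fim_entries(theta, times, sigma=0.1):
    """Entries (I_11, I_12, I_22) of the symmetric 2x2 FIM for y = theta_2 * exp(-theta_1 * t)."""
    a = b = d = 0.0
    for t in times:
        decay = math.exp(-theta[0] * t)
        s1, s2 = -t * theta[1] * decay, decay  # output sensitivities
        a += s1 * s1 / sigma**2
        b += s1 * s2 / sigma**2
        d += s2 * s2 / sigma**2
    return a, b, d


def criteria(entries):
    """D-, E- and A-values of a symmetric 2x2 FIM with entries (a, b, d)."""
    a, b, d = entries
    det = a * d - b * b
    lam_min = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)
    trace_inv = (a + d) / det  # trace of the inverse of a 2x2 matrix
    return {"D": det, "E": lam_min, "A": trace_inv}


theta_hat = (0.5, 2.0)  # assumed parameter estimate
designs = {"clustered": [0.5, 0.6, 0.7], "spread": [0.2, 2.0, 5.0]}
results = {name: criteria(fim_entries(theta_hat, times)) for name, times in designs.items()}
for name, values in results.items():
    print(name, values)
```

For these two candidates all three criteria prefer the design with well separated time points. In less clear-cut cases the criteria can disagree, which is one reason to inspect more than one of them.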

Remarks on Numerical Optimisation#

The application of design criteria requires numerical procedures to optimize the objective function. Since a detailed discussion of methods goes beyond the scope of this article, the interested reader is referred, for example, to [Atkinson et al., 2007], [Balsa-Canto et al., 2008] and [Bock et al., 2013]. The FIM depends on the parameters \(\theta\), which are usually not optimized within the design optimization. Instead, the FIM is typically evaluated at fixed values of \(\theta\). This, however, requires prior knowledge of the parameter values or estimates based on experiments that have already been performed. A common approach to address this issue is so-called sequential optimal experimental design. It alternates between carrying out experiments, estimating parameters and optimizing the experimental design, until the resulting design allows the parameters to be characterized reliably.
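The dependence of an optimized design on the parameter values at which the FIM is evaluated can be seen in a small sketch. It assumes a hypothetical toy model with output \(\theta_2\exp(-\theta_1 t)\) and uses an exhaustive search over a grid of candidate times as a simple stand-in for the optimization methods discussed in the cited references. The model, the grid and the function names are illustrative assumptions.

```python
import itertools
import math


def det_fim(theta, times, sigma=0.1):
    """Determinant of the 2x2 FIM for y = theta_2 * exp(-theta_1 * t)."""
    a = b = d = 0.0
    for t in times:
        decay = math.exp(-theta[0] * t)
        s1, s2 = -t * theta[1] * decay, decay  # output sensitivities
        a += s1 * s1 / sigma**2
        b += s1 * s2 / sigma**2
        d += s2 * s2 / sigma**2
    return a * d - b * b


def d_optimal_pair(theta_hat, grid):
    """Exhaustive search for the pair of grid times that maximizes det(FIM) at theta_hat."""
    return max(itertools.combinations(grid, 2), key=lambda pair: det_fim(theta_hat, pair))


grid = [round(0.1 * k, 1) for k in range(1, 61)]  # candidate times 0.1, 0.2, ..., 6.0

design_slow = d_optimal_pair((0.5, 2.0), grid)  # assumed estimate with slow decay
design_fast = d_optimal_pair((1.0, 2.0), grid)  # assumed estimate with fast decay
print(design_slow, design_fast)
```

The two assumed estimates lead to different second measurement times, because a faster decay makes late observations less informative. A design optimized at a poor estimate can therefore be far from optimal, which motivates re-optimizing the design after each round of experiments and parameter estimation.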

References#

ADT07

Anthony C. Atkinson, Alexander N. Donev, and Randall D. Tobias. Optimum Experimental Designs, with SAS. Oxford University Press, 2007.

BCAB08

Eva Balsa-Canto, Antonio A. Alonso, and Julio R. Banga. Computational procedures for optimal experimental design in biological systems. IET Systems Biology, 2:163–172, 2008.

BBC08

Julio R. Banga and Eva Balsa-Canto. Parameter estimation and optimal experimental design. Essays in Biochemistry, 45:195–209, 2008.

BKS13

Hans Georg Bock, Stefan Körkel, and Johannes P. Schlöder. Parameter estimation and optimum experimental design for differential equation models. Model Based Parameter Estimation: Theory and Applications, 4:1–30, 2013.

CH12

Matthias Chung and Eldad Haber. Experimental design for biological systems. SIAM Journal on Control and Optimization, 50:471–489, 2012.

CS11

Claudia Czado and Thorsten Schmidt. Mathematische Statistik. Springer, 2011.

FKT03

Daniel Faller, Ursula Klingmüller, and Jens Timmer. Simulation methods for optimal experimental design in systems biology. Simulation, 79:717–725, 2003.

KT09

Clemens Kreutz and Jens Timmer. Systems biology: Experimental design. FEBS Journal, 276:923–942, 2009.

KCGW04

Zoltan Kutalik, Kwang-Hyun Cho, Steve V. Gordon, and Olaf Wolkenhauer. Optimal sampling time selection for parameter estimation in dynamic pathway modelling. Biosystems, 75:43–55, 2004.

Puk06

Friedrich Pukelsheim. Optimal Design of Experiments. Society for Industrial and Applied Mathematics, 2006.

WHR+21

Franz G. Wieland, Adrian L. Hauber, Marcus Rosenblatt, Christian Tönsing, and Jens Timmer. On structural and practical identifiability. Current Opinion in Systems Biology, 25:60–69, 2021.

Authors#

Julian Wäsche

Contributors#

Jonas Bauer