---
title: "Models"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Models}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{=tex}
\newcommand*{\INT}{${\tiny INT}$}
\newcommand*{\diffdchar}{\mathrm{d}}
\newcommand*{\dd}{\mathop{\diffdchar\!}}
\newcommand*{\Cint}{C^{\mbox{\INT}}}
\newcommand*{\z}{z_w}
\newcommand*{\tz}{t^z}
```
This document describes the statistical models used in **morse** to
analyze survival and reproduction data, and as such serves as a
mathematical specification of the package. For a more practical
introduction, please consult the **Tutorial** vignette ; for information
on the structure and contents of the library, please consult the
reference manual.
Model parameters are estimated using Bayesian inference, where posterior
distributions are computed from the likelihood of observed data combined
with prior distributions on the parameters. These priors are specified
after each model description.
# Survival toxicity tests
In a survival toxicity test, subjects are exposed to a measured
concentration of a contaminant over a given period of time and the
number of surviving organisms is measured at certain time points during
exposure. In most standard toxicity tests, the concentration is held
constant throughout the whole experiment, which is assumed for
**Analysis of target time survival toxicity tests** in `morseDR` package,
but not for
**Toxicokinetic-Toxicodynamic modeling** which can handled time variable
exposure. In the case of constant exposure, an experiment is generally
replicated several times and also repeated for various levels of the
contaminant. For time-variable exposure, a profile of exposure is
usually unique, and the experiment is repeated with several profiles of
exposures.
## Toxicokinetic-Toxicodynamic modeling
For datasets featuring time series measurements, more complete models
can be used to estimate the effect of a contaminant on survival. We
assume the toxicity test consists in exposing an initial number $n_i^0$
of organisms to a concentration $c_i(t)$ of contaminant (constant or
time-variable), and following the number $n_i^k$ of survivors at time
$t_k$ (with $t_0 < t_1 < \dots < t_m$ and $t_0 = 0$), thus providing a
collection $D = {(c_i, t_k, n_i^k)}_{i,k}$ of experiments. In 'morse',
we propose two Toxicokinetic-Toxicodynamic (TKTD) models belonging to
the General Unified Threshold model for Survival (GUTS) [@jager2011;
@Jager2018GUTSbook]. One is known as the *reduced stochastic death*
model [@nyman2012] or GUTS-SD and the other is the *reduced organism
tolerance* model or GUTS-IT, which we describe now.
| Parameters | Symbols | Alternative symbols | Units | Models |
|:-----------------------|:-----------|------------|------------|------------|
| Background hazard rate | $h_b$ | $m_0$ | $\text{time}^{-1}$ | SD and IT |
| Dominant toxicokinetic rate constant | $k_d$ | $\mbox{NEC}$ | $\text{time}^{-1}$ | SD and IT |
| Threshold for effects | $z_w$ | $m_0$ | $[D]$ | SD |
| Killing rate constant | $b_w$ | $k_k$ | $[D]^{-1}$ | SD |
| Median of the threshold effect distribution | $m_w$ | $\alpha$ | $[D]$ | IT |
| Shape of the threshold effect distribution | $\beta$ | $-$ | $n.d.$ | IT |
: *Table: Parameters and symbols used for GUTS-SD and GUTS-IT models.
Alternative symbols are used within pubications (see for instance
[@jager2011; @delignette2017; @Jager2018GUTSbook]. The unit* $[D]$
refers to unit of actual damage, $n.d$ for non dimensional. For GUTS-IT
model, we assume a log-logistic distributions, but other distributions
are occasionally used [@albert2016].
#### GUTS Modelling
The number of survivors at time $t_k$ given the number of survivors at
time $t_{k-1}$ is assumed to follow a binomial distribution: $$
N_i^k \sim \mathcal{B}(n_i^{k-1}, f_i(t_{k-1}, t_k))
$$ where $f_i$ is the conditional probability of survival at time $t_k$
given survival at time $t_{k-1}$ under concentration $c_i(t)$. Denoting
$S_i(t)$ the probability of survival at time $t$, we have: $$
f_i(t_{k-1}, t_k) = \frac{S_i(t_k)}{S_i(t_{k-1})}
$$
The formulation of the survival probability $S_i(t)$ in GUTS
[@jager2011] is given by integrating the *instantaneous mortality rate*
$h_i$: $$
S_i(t) = \exp \left( \int_0^t - h_i(u)\mbox{d}u \right)
\tag{2}
$$
In the model, function $h_i$ is expressed using the internal
concentration of contaminant (that is, the concentration inside an
organism) $C^{{\scriptsize INT}}_i(t)$. More precisely: $$
h_i(t) = b_w \max(C^{\mbox{${\tiny INT}$}}_i(t) - z_w, 0) + h_b
$$ where (see Table of parameters):
- $b_w$ is the \emph{killing rate} and expressed in
concentration$^{-1}$.time$^{-1}$ ;
- $z_w$ is the so-called \emph{no effect concentration} and represents
a concentration threshold under which the contaminant has no effect
on organisms ;
- $h_b$ is the \emph{background mortality} (mortality in absence of
contaminant), expressed in time$^{-1}$. \\end{itemize}
The internal concentration is assumed to be driven by the external
concentration, following:
$$
\frac{\mathop{\mathrm{d}\!}C^{\mbox{${\tiny INT}$}}_i}{\mathop{\mathrm{d}\!}t}(t) = k_d (c_i(t) - C^{\mbox{${\tiny INT}$}}_i(t))
\tag{1}
$$
We call parameter $k_d$ of Eq.(1) the *dominant rate constant*
(expressed in time$^{-1}$). It represents the speed at which the
internal concentration in contaminant converges to the external
concentration. The model could be equivalently written using an internal
damage instead of an internal concentration as a dose metric
[@jager2011].
If we denote $f_z(z_w)$ the probability distribution of the no effect
concentration threshold, $z_w$, then the survival function is given by:
$$
S(t) = \int_0^t S_i(t) f_z(z_w) \mbox{d} z_w= \int \exp \left( \int_0^t - h_i(u)\mbox{d} u \right) f_z(z_w) \mbox{d} z_w
$$
Then, the calculation of $S(t)$ depends on the model of survival,
GUTS-SD or GUTS-IT [@jager2011].
#### GUTS-SD
In GUTS-SD, all organisms are assumed to have the same internal
concentration threshold (denoted $z_w$), and, once exceeded, the
instantaneous probability to die increases linearly with the internal
concentration. In this situation, $f_z(z_w)$ is a Dirac delta
distribution, and the survival rate is given by Eq.(2).
#### GUTS-IT
In GUTS-IT, the threshold concentration is distributed among all the
organisms, and once exceeded for one organism, this organism dies
immediately. In other words, the killing rate is infinitely high (e.g.
$k_k = + \infty$), and the survival rate is given by: $$
S_i(t) = e^{-h_b t} \int_{\max\limits_{0<\tau z_w$, it takes time $t^z_i$ before the internal
concentration reaches $z_w$, where: $$
t^z_i = - \frac{1}{k_d} \log \left(1 - \frac{z_w}{c_i} \right).
$$ Before that happens, Eq.(3) applies, while for $t > t^z_i$,
integrating Eq.(2) results in: $$
S_i(t) = \exp \left(- h_b t - b_w(c_i - z_w) (t - t^z_i) - \frac{b_w c_i}{k_d} \left(e^{- k_d t} - e^{-k_d t^z_i} \right) \right)
$$
In brief, given values for the four parameters $h_b$, $b_w$, $k_d$ and
$z_w$, we can simulate trajectories by using $S_i(t)$ to compute
conditional survival probabilities. In 'morse', those parameters are
estimated using Bayesian inference. The choice of priors is defined
hereafter.
#### GUTS-IT
With constant concentration, Eq.(4) provides that
$C^{\mbox{${\tiny INT}$}}_i(t)$ is an increasing function, meaning that:
$$
\max\limits_{0 < \tau < t} (C^{\mbox{${\tiny INT}$}}_i(\tau)) = c_i(1 - e^{-k_d t})
$$
Therefore, assuming a log-logistic distribution for $f_z$ yields:
$$
S_i(t) = \exp(- h_b t) \left( 1 - \frac{1}{1+ \left( \frac{c_i(1-\exp(-k_d t ))}{m_w} \right)^{- \beta}} \right)
$$
where $m_w>0$ is the scale parameter (and also the median) and $\beta>0$
is the shape parameter of the log-logistic distribution.
### Inference
Posterior distributions for all parameters $h_b$, $b_w$, $k_d$, $z_w$,
$m_w$ and $\beta$ are computed with JAGS [@rjags2016]. We set prior
distributions on those parameters based on the actual experimental
design used in a toxicity test. For instance, we assume $z_w$ has a high
probability to lie within the range of tested concentrations. For each
parameter $\theta$, we derive in a similar manner a minimum
($\theta^{\min}$) and a maximum ($\theta^{\max}$) value and state that
the prior on $\theta$ is a log-normal distribution [@delignette2017].
More precisely: $$
\log_{10} \theta \sim \mathcal{N}\left(\frac{\log_{10} \theta^{\min} + \log_{10} \theta^{\max}}{2} \, , \,
\frac{\log_{10} \theta^{\max} - \log_{10} \theta^{\min}}{4} \right)
$$ With this choice, $\theta^{\min}$ and $\theta^{\max}$ correspond to
the 2.5 and 97.5 percentiles of the prior distribution on $\theta$. For
each parameter, this gives:
- $z_w^{\min} = \min_{i, c_i \neq 0} c_i$ and
$z_w^{\max} = \max_i c_i$, which amounts to say that $z_w$ is most
probably contained in the range of experimentally tested
concentrations ;
- similarly, $m_w^{\min} = \min_{i, c_i \neq 0} c_i$ and
$m_w^{\max} = \max_i c_i$ ;
- for background mortality rate $h_b$, we assume a maximum value
corresponding to situations where half the indivuals are lost at the
first observation time in the control (time $t_1$), that is: $$
e^{- h_b^{\max} t_1} = 0.5 \Leftrightarrow h_b^{\max} = - \frac{1}{t_1} \log 0.5
$$ To derive a minimum value for $h_b$, we set the maximal survival
probability at the end of the toxicity test in control condition to
0.999, which corresponds to saying that the average lifetime of the
considered species is at most a thousand times longer than the
duration of the experiment. This gives: $$
e^{- h_b^{\min} t_m} = 0.999 \Leftrightarrow h_b^{\min} = - \frac{1}{t_m} \log 0.999
$$
- $k_d$ is the parameter describing the speed at which the internal
concentration of contaminant equilibrates with the external
concentration. We suppose its value is such that the internal
concentration can at most reach 99.9% of the external concentration
before the first time point, implying the maximum value for $k_d$
is: $$
1 - e^{- k_d^{\max} t_1} = 0.999 \Leftrightarrow k_d^{\max} = - \frac{1}{t_1} \log 0.001
$$ For the minimum value, we assume the internal concentration
should at least have risen to 0.1% of the external concentration at
the end of the experiment, which gives: $$
1 - e^{- k_d^{\min} t_m} = 0.001 \Leftrightarrow k_d^{\min} = - \frac{1}{t_m} \log 0.999
$$
- $b_w$ is the parameter relating the internal concentration of
contaminant to the instantaneous mortality. To fix a maximum value,
we state that between the closest two tested concentrations, the
survival probability at the first time point should not be divided
by more than one thousand, assuming (infinitely) fast equilibration
of internal and external concentrations. This last assumption means
we take the limit $k_d \rightarrow + \infty$ and approximate
$S_i(t)$ with $\exp(- (h_b + b_w(c_i - z_w))t)$. Denoting
$\Delta^{\min}$ the minimum difference between two tested
concentrations, we obtain: $$
e^{- b_w^{\max} \Delta^{\min} t_1} = 0.001 \Leftrightarrow b_w^{\max} = - \frac{1}{\Delta^{\min} t_1} \log 0.001
$$ Analogously we set a minimum value for $b_w$ saying that the
survival probability at the last time point for the maximum
concentration should not be higher than 99.9% of what it is for the
minimal tested concentration. For this we assume again
$k_d \rightarrow + \infty$. Denoting $\Delta^{\max}$ the maximum
difference between two tested concentrations, this leads to: $$
e^{- b_w^{\min} \Delta^{\max} t_m} = 0.001 \Leftrightarrow b_w^{\min} = - \frac{1}{\Delta^{\max} t_m} \log 0.999
$$
- for the shape parameter $\beta$, we used a quasi non-informative
log-uniform distribution: $$\log_{10} \beta \sim \mathcal{U}(-2,2)$$