# Bayes Factors

Bayes factors are central quantities in both Bayesian hypothesis testing and Bayesian model comparison. A Bayes factor allows one to assess which of two hypotheses, or which of two competing models, is favoured by the data.

## Bayes Factors in Bayesian Hypothesis Testing

In the situation of classical hypothesis testing, there is one unknown parameter \(\theta\in\Theta\) lying in a given set \(\Theta\). The set \(\Theta\) is decomposed as \(\Theta=\Theta_0\cup\Theta_1\) with \(\Theta_0\cap\Theta_1=\emptyset\). The null hypothesis \(H_0\) states that \(H_0:~\theta\in\Theta_0\), and the alternative hypothesis is given by \(H_1:~\theta\in\Theta_1\). In a Bayesian setting, unknown parameters are assumed to follow a distribution. Therefore, probabilities are assigned to the hypotheses, given by the prior probabilities

\[\pi_0=\mathbb{P}(\theta\in\Theta_0),\qquad \pi_1=\mathbb{P}(\theta\in\Theta_1),\]

such that \(\pi_0+\pi_1=1\). Taking into account a data set \(x\), the corresponding posterior probabilities are defined by

\[p_0=\mathbb{P}(\theta\in\Theta_0\,|\,x),\qquad p_1=\mathbb{P}(\theta\in\Theta_1\,|\,x),\]

where again \(p_0+p_1=1\). The posterior probabilities follow from Bayes’ theorem. The Bayes factor \(B\) in favour of \(H_0\) against \(H_1\) is then defined by

\[B=\frac{p_0/p_1}{\pi_0/\pi_1}.\tag{11}\]

Thus, a Bayes factor is the ratio of the posterior odds \(p_0/p_1\) to the prior odds \(\pi_0/\pi_1\); it quantifies the evidence in the data in favour of \(H_0\) as opposed to \(H_1\). Substituting the above definitions into Equation (11) and applying Bayes’ theorem yields

\[B=\frac{\pi(x\,|\,\theta\in\Theta_0)}{\pi(x\,|\,\theta\in\Theta_1)},\tag{12}\]

where \(\pi(x|\theta\in\Theta_i)=\mathbb{P}(x|\theta\in\Theta_i)\), for \(i\in\{0,1\}\), denotes the likelihood of alternative \(i\).
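As a minimal numeric sketch, Equation (11) can be evaluated directly; the prior and posterior probabilities below are illustrative values, not taken from the text:

```python
# Illustrative (assumed) prior probabilities of H0 and H1.
pi0, pi1 = 0.5, 0.5

# Illustrative (assumed) posterior probabilities after observing data x.
p0, p1 = 0.8, 0.2

# Equation (11): Bayes factor in favour of H0, i.e. posterior odds
# divided by prior odds.
B = (p0 / p1) / (pi0 / pi1)
print(B)  # → 4.0
```

With equal prior probabilities the prior odds are 1, so the Bayes factor coincides with the posterior odds.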

The informative value of \(B\) clearly depends on its magnitude. The following table provides established thresholds to interpret a concrete value of \(B\) [Kass and Raftery, 1995]:

| \(B\) | \(2\ln(B)\) | Evidence against \(H_1\) |
|---|---|---|
| \([1,3]\) | \([0,2]\) | not worth more than a bare mention |
| \((3,20]\) | \((2,6]\) | positive |
| \((20,150]\) | \((6,10]\) | strong |
| \(>150\) | \(>10\) | very strong |
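The thresholds above can be wrapped in a small lookup helper; the function name and the handling of \(B<1\) (where the evidence points towards \(H_1\)) are illustrative additions, not part of [Kass and Raftery, 1995]:

```python
import math

def evidence_against_h1(b: float) -> str:
    """Map a Bayes factor B in favour of H0 onto the verbal scale above.

    The thresholds follow [Kass and Raftery, 1995] on the 2*ln(B) scale;
    the b < 1 branch is an illustrative addition.
    """
    if b < 1.0:
        return "evidence favours H1"
    two_ln_b = 2.0 * math.log(b)
    if two_ln_b <= 2.0:
        return "not worth more than a bare mention"
    if two_ln_b <= 6.0:
        return "positive"
    if two_ln_b <= 10.0:
        return "strong"
    return "very strong"

print(evidence_against_h1(4.0))    # → positive  (2 ln 4 ≈ 2.77)
print(evidence_against_h1(200.0))  # → very strong
```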

The application of Bayes factors has several advantages over classical hypothesis testing. Instead of considering point estimates, parameter values are assumed to follow a distribution, and a Bayes factor accounts for the whole distribution over the parameter space. Moreover, a Bayes factor can provide evidence in favour of \(H_0\) as well as against \(H_0\). However, by definition, Bayes factors only serve as a quantity for pairwise comparisons of alternatives. Besides, as for Bayesian methods in general, Bayes factors can be sensitive to the choice of prior distributions [Morey *et al.*, 2016].

The concrete form of \(\Theta_0\) and \(\Theta_1\) affects the corresponding Bayes factor and the evaluation of a hypothesis test. A detailed account of this topic can be found in [Lee, 2012].

## Bayes Factors for Model Selection

Bayes factors can be used not only to assess the plausibility of parameter values for a given model (or, e.g., for a distribution), but also to compare inherently different models [Hug, 2014]. The focus is now on selecting, among competing models \(M_1,\dots,M_m\) parameterized by \(\theta_i\in\Theta_i\), for \(i=1,\dots,m\), the one that best explains the data. Analogous to Equation (12), the Bayes factor in favour of \(M_k\) over \(M_l\) is defined by

\[B_{kl}=\frac{\pi(x\,|\,M_k)}{\pi(x\,|\,M_l)}.\]

The quantity \(\pi(x|M_j)\), for \(j\in\{1,\dots,m\}\), is called the marginal likelihood of model \(M_j\) and can be written as

\[\pi(x\,|\,M_j)=\int_{\Theta_j}\pi(x\,|\,\theta_j,M_j)\,\pi(\theta_j\,|\,M_j)\,\mathrm{d}\theta_j.\tag{13}\]

Equation (13) reflects that the marginal likelihood \(\pi(x|M_j)\) defines the likelihood of \(x\) for model \(M_j\) where the impact of the parameter \(\theta_j\) is marginalized (integrated) out. More formally, the marginal likelihood of model \(M_j\) is the integral, over the parameter space \(\Theta_j\), of the likelihood under \(M_j\) times the prior distribution of \(\theta_j\).
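As a hedged sketch of how marginal likelihoods enter a Bayes factor, the following compares two models for \(k\) successes in \(n\) Bernoulli trials — a fixed fair coin versus a uniform prior on the success probability — with the one-dimensional integral of Equation (13) approximated by the midpoint rule; the data values are assumed for illustration:

```python
import math

# Illustrative data (assumed): k = 7 successes in n = 10 Bernoulli trials.
n, k = 10, 7
binom_coeff = math.comb(n, k)

def likelihood(theta: float) -> float:
    # Binomial likelihood pi(x | theta) of the observed data.
    return binom_coeff * theta**k * (1.0 - theta)**(n - k)

# Model M1: theta fixed at 0.5 (point-mass prior), so no integration is needed.
marginal_m1 = likelihood(0.5)

# Model M2: theta ~ Uniform(0, 1). The marginal likelihood is a
# one-dimensional integral, approximated by the midpoint rule on a fine grid.
grid = 10_000
marginal_m2 = sum(likelihood((i + 0.5) / grid) for i in range(grid)) / grid

# Bayes factor in favour of M1 over M2.
B_12 = marginal_m1 / marginal_m2
print(B_12)
```

For a uniform prior the integral has the closed form \(1/(n+1)\), which makes this toy example a convenient correctness check for the quadrature.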

In order to determine Bayes factors, the integrals from Equation (13) need to be computed for the models of interest. Since the integral for model \(M_j\) runs over the whole parameter space \(\Theta_j\), an analytical solution is often infeasible in practice. However, various estimation techniques have been developed to address this issue; different procedures are reviewed in [Friel and Wyse, 2012]. Over the last decade, thermodynamic integration has become established as a state-of-the-art method for approximating marginal likelihoods. Its application goes back to [Friel and Pettitt, 2008, Lartillot and Philippe, 2006].
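A minimal sketch of the thermodynamic integration idea: the log marginal likelihood equals the integral over a temperature \(t\in[0,1]\) of the expected log likelihood under the power posterior \(\pi_t(\theta\,|\,x)\propto\pi(x\,|\,\theta)^t\,\pi(\theta)\). The conjugate Bernoulli example with a uniform prior is reused here so the exact answer is available for comparison; the grid sizes and the temperature ladder are arbitrary illustrative choices, and in realistic models the per-temperature expectations are estimated by MCMC rather than quadrature:

```python
import math

# Illustrative data (assumed): k = 7 successes in n = 10 trials,
# uniform prior on theta, so the exact marginal likelihood is 1/(n+1).
n, k = 10, 7
coeff = math.comb(n, k)

def log_likelihood(theta: float) -> float:
    return math.log(coeff) + k * math.log(theta) + (n - k) * math.log(1.0 - theta)

def expected_log_like(t: float, grid: int = 2000) -> float:
    # E_t[log L] under the power posterior pi_t ∝ L(theta)^t * prior,
    # evaluated by midpoint quadrature over theta.
    num = den = 0.0
    for i in range(grid):
        theta = (i + 0.5) / grid
        ll = log_likelihood(theta)
        w = math.exp(t * ll)
        num += ll * w
        den += w
    return num / den

# Thermodynamic integration: log pi(x) = integral over t in [0, 1] of
# E_t[log L], approximated by the trapezoidal rule on a temperature ladder.
temps = [j / 50 for j in range(51)]
vals = [expected_log_like(t) for t in temps]
log_marginal = sum((vals[j] + vals[j + 1]) / 2 * (temps[j + 1] - temps[j])
                   for j in range(50))

print(log_marginal, math.log(1 / 11))  # estimate vs. exact value
```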

While this article presents a Bayesian model selection method, the article on model selection provides an introduction to model selection tools for frequentist statistics.

## References

- FP08: Nial Friel and Anthony N. Pettitt. Marginal likelihood estimation via power posteriors. *Journal of the Royal Statistical Society. Series B (Statistical Methodology)*, 70:589–607, 2008.
- FW12: Nial Friel and Jason Wyse. Estimating the evidence – a review. *Statistica Neerlandica*, 66:288–308, 2012.
- Hug14: Sabine C. Hug. *From low-dimensional model selection to high-dimensional inference: tailoring Bayesian methods to biological dynamical systems*. PhD thesis, TU München, 2014.
- KR95: Robert E. Kass and Adrian E. Raftery. Bayes factors. *Journal of the American Statistical Association*, 90:773–795, 1995.
- LP06: Nicolas Lartillot and Hervé Philippe. Computing Bayes factors using thermodynamic integration. *Systematic Biology*, 55:195–207, 2006.
- Lee12: Peter M. Lee. *Bayesian Statistics: An Introduction*. Wiley, 4th edition, 2012.
- MRR16: Richard D. Morey, Jan Willem Romeijn, and Jeffrey N. Rouder. The philosophy of Bayes factors and the quantification of statistical evidence. *Journal of Mathematical Psychology*, 72:6–18, 2016.

## Contributors

Tamadur Albaraghtheh