# Bayesian Methods

```{warning}
This article is under construction.
```

The fundamental difference between [frequentist](genindex) and [Bayesian](genindex) statistics is the interpretation of the parameters $\theta$. In [frequentist](genindex) statistics, parameters are unknown but fixed values which can be estimated. The term "[frequentist](genindex)" refers to the fact that the [aleatory uncertainty](types_of_uncertainty.md) of such an estimate decreases with repetition due to the [law of large numbers](central_limit_theorem.md). Here, a probability represents a relative frequency. In [Bayesian](genindex) statistics, on the other hand, parameters are random variables, i.e. $\theta \in \Theta$, with their own distributions (here denoted by $\pi(\theta)$). Hence, there is no single value to be found, but rather regions of high probability to be explored. Here, a probability reflects the strength of belief in the corresponding event. The key concept of [Bayesian](genindex) thinking is that such beliefs are updated via Bayes' theorem whenever new information in the form of data, $x$, becomes available:

$$
\pi(\theta|x) = \frac{\pi(x|\theta)\cdot \pi(\theta)}{\underbrace{\int_{\Theta}\pi(x|\theta')\cdot \pi(\theta') \ d\theta'}_{=\pi(x)}} \propto \pi(x|\theta)\cdot \pi(\theta)
$$(eqn:bayes)

The name [Bayesian](genindex) statistics reflects the central role of Bayes' theorem. Every component of Equation {eq}`eqn:bayes` is important in [Bayesian](genindex) statistics and therefore has its own name in the literature:

- $\pi(\theta)$ is the [prior distribution](choice_of_priors) of $\theta$ and should reflect all the available knowledge before seeing the new data $x$.
- $\pi(x|\theta)$ is the likelihood of the data for a particular value of the parameter.
- $\pi(\theta|x)$ is the posterior distribution of $\theta$ which incorporates the new information from $x$. Studying the posterior is often the heart of a [Bayesian](Bayesian) analysis. 
- $\pi(x)$ is the marginal distribution of the data (also called the evidence), obtained by integrating over the whole parameter space $\Theta$. It is often computationally infeasible to evaluate.
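
To make the four components concrete, the following is a minimal sketch of Bayes' theorem evaluated on a grid. The setting is an assumption chosen for illustration and is not taken from the article: $\theta$ is the success probability of a coin, the prior is a Beta(2, 2) kernel, and the data $x$ are `k` successes in `n` trials. The function name `grid_posterior` and all parameter values are hypothetical.

```python
def grid_posterior(k, n, a=2.0, b=2.0, m=1000):
    """Approximate pi(theta | x) on a grid for k successes in n trials.

    Assumed model (illustrative only): Beta(a, b) prior kernel and a
    binomial likelihood kernel. Constant factors are dropped because
    they cancel when normalizing.
    """
    step = 1.0 / m
    # Midpoints of m equal-width cells covering the parameter space (0, 1).
    thetas = [(i + 0.5) * step for i in range(m)]

    # pi(theta): prior, up to a constant factor.
    prior = [t ** (a - 1) * (1 - t) ** (b - 1) for t in thetas]
    # pi(x | theta): likelihood, up to a constant factor.
    likelihood = [t ** k * (1 - t) ** (n - k) for t in thetas]

    # Numerator of Bayes' theorem at every grid point.
    unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
    # Midpoint-rule approximation of the integral in the denominator.
    # Because constants were dropped, this is only proportional to pi(x).
    normalizer = sum(unnormalized) * step

    # pi(theta | x): posterior density values on the grid.
    posterior = [u / normalizer for u in unnormalized]
    return thetas, posterior, step


thetas, posterior, step = grid_posterior(k=7, n=10)
posterior_mean = sum(t * p for t, p in zip(thetas, posterior)) * step
print("posterior mean:", posterior_mean)
```

For this conjugate example the posterior is known in closed form as Beta(9, 5), whose mean is $9/14 \approx 0.643$, so the grid result can be checked against it. Grid evaluation of the normalizing integral only works in very low dimensions, which is why the integral is often infeasible in practice.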

If the marginal data distribution cannot be calculated, the posterior must be approximated using the proportionality in Equation {eq}`eqn:bayes`. Two widely used approaches are variational Bayes (i.e. minimizing a divergence between a simple, tractable distribution and the posterior) and Markov chain Monte Carlo (MCMC; i.e. drawing samples from a Markov chain whose limiting distribution equals the posterior). The UQ dictionary covers two MCMC variants:

- [Hamiltonian Monte Carlo](hamiltonian_mc.md) which makes use of Hamiltonian dynamics to accelerate the convergence of the sample distribution.
- [Transdimensional MCMC](transdimensional_MCMC.md) which uses reversible jumps for model selection.
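
As a point of reference for these variants, here is a minimal sketch of a plain random-walk Metropolis sampler, which is not one of the two variants above but shows how MCMC needs the posterior only up to proportionality. The target is an assumption chosen for illustration: the unnormalized posterior of a success probability $\theta$ under a Beta(2, 2) prior after observing 7 successes in 10 trials. The function names, step size, and seed are hypothetical choices.

```python
import math
import random


def log_unnormalized_posterior(theta, k=7, n=10, a=2.0, b=2.0):
    """log of pi(x | theta) * pi(theta), up to an additive constant.

    Assumed model (illustrative only): binomial likelihood, Beta(a, b) prior.
    """
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero density outside the parameter space
    return (k + a - 1) * math.log(theta) + (n - k + b - 1) * math.log(1 - theta)


def metropolis(log_target, start, n_samples, step_size=0.2, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        # The unknown normalizing constant pi(x) cancels in this ratio.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


samples = metropolis(log_unnormalized_posterior, start=0.5, n_samples=20000)
kept = samples[2000:]  # discard an initial burn-in segment
print("estimated posterior mean:", sum(kept) / len(kept))
```

The estimated mean should lie close to the exact value $9/14 \approx 0.643$ of the Beta(9, 5) posterior. The burn-in length and step size here are untuned guesses; in practice both affect how quickly the sample distribution approaches the posterior.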

Many more MCMC variants exist because applicability and convergence time depend on the problem at hand as well as on the model formulation (e.g. whether the model is [parametric](Bayesian_Parametric__inference.md) or [non-parametric](Nonparametric_Bayesian_density_estimation_draft.md)). [Bayesian mixture models](bayesian_mixture_models.md), for instance, offer great flexibility when the data consist of heterogeneous sub-populations, but come at the price of more challenging posterior sampling. Besides the sampling algorithms themselves, other aspects of MCMC have been researched to further improve sampling (e.g. [inclusion of temperature](tempering)). If you are [uncertain about the model](method_induced_uncertainty.md), [Bayes factors](bayes_factors.md) may come in handy to test hypotheses.


## Author

Jonas Bauer

## Contributors

Hendrik Paasche