Types of Uncertainty#

This article is part of a series on data-induced uncertainties.

Uncertainty quantification is the field that aims to quantitatively characterize, propagate, and estimate uncertainties in complex systems. Uncertainties are commonly classified into two major types: aleatory and epistemic.

A General Framework for Uncertainty#

Uncertainty is sometimes assigned to three broad categories: aleatory, epistemic and ontological uncertainty.

Epistemic Uncertainty#

Epistemic uncertainty arises from a lack of knowledge about the system or phenomenon of interest. It includes:

  • Data-induced uncertainty - uncertainty introduced by the decisions made in the selection of data as well as the definition, cleaning, and transformation of the input and output variables.

  • Parameter uncertainty - arises when the model parameter values are specified under imprecise knowledge or in the absence of direct measurements.

  • Method Uncertainty - arises due to the choice of the implementation and computational method used to estimate parameters and/or generate predictions.

  • Structural uncertainty - This can be regarded as a model’s discrepancy or bias due to the fact that the model lacks exact knowledge about the underlying physics. It depends on the model’s ability to represent the real-world process(es). For example, early climate/weather models omitted many real-world phenomena because of computational limitations; as computational resources improved, models included more complex processes and better represented real-world weather and climate.

  • Algorithmic uncertainty - Numerical and/or statistical models become more complex as they become more realistic. To reduce computational cost and complexity, tradeoffs are made between cost and error, and less expensive algorithms are used. This introduces some error into the modelling process. An example is the use of the finite difference method instead of the finite element method to solve partial differential equations. Another example is the use of Monte-Carlo integration instead of standard quadrature methods for high-dimensional problems (see the sketch after this list).

  • Interpolation uncertainty - Uncertainty that comes from missing data in a model simulation or experiment. The missing data are interpolated using some algorithm, which can introduce error or noise into the data.
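As a rough illustration of the cost/error tradeoff behind algorithmic uncertainty, the following sketch compares a Monte-Carlo estimate of a simple integral over the unit hypercube with a tensor-product midpoint rule. The integrand, dimension, and sample sizes are arbitrary choices for illustration: the Monte-Carlo error shrinks like \(1/\sqrt{N}\) regardless of dimension, while the grid-based rule needs \(n^{d}\) evaluations, a cost that grows exponentially with \(d\).

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # separable test integrand on the unit hypercube [0, 1]^d
    return np.exp(-np.sum(x**2, axis=-1))

d = 6
# exact value: (integral of exp(-x^2) over [0, 1]) ** d
true_value = (math.sqrt(math.pi) / 2 * math.erf(1.0)) ** d

# Monte-Carlo estimate: error shrinks like 1/sqrt(N) regardless of d
N = 100_000
mc_estimate = f(rng.random((N, d))).mean()

# Tensor-product midpoint rule: n points per axis -> n**d evaluations,
# so the cost grows exponentially with the dimension d
n = 5
grid_1d = (np.arange(n) + 0.5) / n
mesh = np.stack(np.meshgrid(*[grid_1d] * d, indexing="ij"), axis=-1)
quad_estimate = f(mesh.reshape(-1, d)).mean()

print(f"true value    : {true_value:.6f}")
print(f"Monte-Carlo   : {mc_estimate:.6f}  ({N} evaluations)")
print(f"midpoint rule : {quad_estimate:.6f}  ({n ** d} evaluations)")
```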

Aleatory Uncertainty#

Aleatory uncertainty is generally thought of as uncertainty that arises from the inherent randomness of natural phenomena. For natural data it is independent of epistemic uncertainty, whereas for model parameters it is interdependent with epistemic uncertainty. For natural data it is governed by the precision and accuracy of the data. Aleatory uncertainty can be quantified in the form of probability distributions. It includes:

  • Measurement uncertainty - input and output variables cannot be determined with absolute precision and accuracy. All measurements are prone to some imprecision.

  • Sampling uncertainty - introduced when analyzing a random sample from a large population of interest. Such a sample may capture effects that are spatially/temporally transient, and it may overemphasize some effects or miss others. This variance is generally subsumed in an error term (see the sketch below).
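A minimal sketch of sampling uncertainty, using a hypothetical population of synthetic daily temperatures with a seasonal cycle: repeatedly drawing small random samples yields sample means that scatter around the population mean, and this scatter is what ends up in the error term.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical population: daily temperatures with a seasonal cycle plus noise
days = np.arange(365)
population = 10 + 8 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 2, days.size)

# repeatedly draw small random samples and record each sample mean
sample_size = 20
sample_means = [
    rng.choice(population, size=sample_size, replace=False).mean()
    for _ in range(1000)
]

print(f"population mean              : {population.mean():.2f}")
print(f"spread of sample means (std) : {np.std(sample_means):.2f}")
```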

Mathematically, this can be further divided into homoscedastic and heteroscedastic uncertainty. In homoscedastic uncertainty, the variance \(\sigma^{2}\) of the noise in the given data is constant. An example is a linear regression model \(y = f(x) + \epsilon\), where \(\epsilon \sim \mathcal{N}(\mu,\sigma^{2})\) is independent of the explanatory variable \(x\). Heteroscedastic uncertainty, on the other hand, is inherent randomness where the noise level \(\sigma(x)\) depends on the explanatory variable. The variance can be estimated either with Monte-Carlo methods or with predictive machine learning algorithms. A real application of heteroscedastic uncertainty is turbulence in fluid simulations, while for homoscedastic uncertainty one can roll a fair die and estimate statistics of the outcomes.
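A minimal synthetic sketch of this distinction, using an assumed linear mean function and hand-picked noise levels: in the homoscedastic case the residual spread is the same everywhere, while in the heteroscedastic case it grows with \(x\). The binned standard deviations act as a crude Monte-Carlo estimate of the local noise level \(\sigma(x)\).

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 500)

def f(x):
    return 2.0 * x + 1.0          # assumed "true" regression function

# homoscedastic: constant noise variance, independent of x
y_homo = f(x) + rng.normal(0.0, 1.0, x.size)

# heteroscedastic: noise standard deviation sigma(x) grows with x
sigma_x = 0.2 + 0.3 * x
y_hetero = f(x) + rng.normal(0.0, sigma_x)

# crude Monte-Carlo check of the local noise level in three bins of x
for lo, hi in [(0, 3), (3, 7), (7, 10)]:
    mask = (x >= lo) & (x < hi)
    print(f"x in [{lo},{hi}):  homo std = {np.std(y_homo[mask] - f(x[mask])):.2f}, "
          f"hetero std = {np.std(y_hetero[mask] - f(x[mask])):.2f}")
```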

In mathematics, uncertainties are communicated in terms of probability distributions. For epistemic uncertainty one could argue that the underlying probability distribution itself is not known with certainty, whereas for aleatoric uncertainty the uncertainty lies in a random sample drawn from a known distribution.
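One common way to make this distinction concrete, under a simple Gaussian assumption, is to look at how the two kinds of uncertainty behave as more data arrive: the epistemic part (here, uncertainty about the mean of the distribution) shrinks with the sample size, while the aleatoric part (the spread of individual outcomes) does not.

```python
import numpy as np

rng = np.random.default_rng(3)

true_mu, true_sigma = 5.0, 2.0   # "unknown" distribution generating the data

for n in [10, 100, 10_000]:
    data = rng.normal(true_mu, true_sigma, n)
    # epistemic part: uncertainty about the mean, shrinks like 1/sqrt(n)
    std_err_of_mean = data.std(ddof=1) / np.sqrt(n)
    # aleatoric part: spread of individual outcomes, does not shrink with n
    spread = data.std(ddof=1)
    print(f"n={n:6d}  epistemic (std. error of mean) = {std_err_of_mean:.3f}  "
          f"aleatoric (sample std) = {spread:.3f}")
```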

Ontological Uncertainty#

A third category, ontological uncertainty, results from the unconscious use of inappropriate methodology or belief systems. Ontological uncertainty is unrecognized and unquantifiable. It can refer to data, models, or methodology and includes:

  • Semantic uncertainty - arises if different meanings are ascribed to the same terms, phrases, or actions by participants in the same activity. It occurs when methodological definitions lack clarity or are inappropriate for the full understanding of all participants, e.g., due to different levels of expertise.

  • Interpretational uncertainty - arises if the information encoded in data or models is extracted by interpreters following inconsistent decoding methodologies. It occurs when the applicable decoding methodology is not clearly defined.

UQ Cycle#

In the UQ cycle, uncertainty is organized into four types: natural uncertainty, measurement uncertainty, parameterization uncertainty, and description uncertainty.

Natural Uncertainty#

Natural uncertainty, or natural variability, is an inherent property of the underlying system and results from both spatial and temporal heterogeneity of the system. It cannot be reduced by making more measurements at finer resolutions or by using better equipment; better quality and quantity of measurements can only allow a better understanding of it. Any numerical or statistical model that represents real-world systems or processes is subject to natural variability. In climate/weather prediction, many phenomena, such as ENSO and sudden stratospheric warming, cannot be predicted exactly by mathematical models. Another example is the impact of weather on the production of renewable energy, where the exact production cannot be estimated in advance from weather data.

Measurement Uncertainty#

Observed data typically have measurement errors and detection uncertainties, and perhaps also documentation errors. An example is a measurement device that cannot capture the exact value of the variables of interest. When dealing with issues of precision and accuracy, this is aleatory uncertainty; documentation errors are ontological uncertainty (semantic or interpretational).

Parameterization Uncertainty#

The extraction of specific probability distributions from data can be challenging. Parameterization uncertainty arises in predictive models when the exact values of the input parameters are not known. It can also include model uncertainty if the underlying form of the parameter of interest is misspecified (e.g., the data are assumed to be normally distributed but are not).
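As a small illustration of such misspecification (synthetic data, arbitrary threshold), the sketch below generates skewed log-normal data, fits a normal distribution to it, and compares the probability of exceeding a threshold under the empirical data and under the fitted normal; the misspecified parameterization strongly underestimates the tail.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(4)

# data actually generated from a skewed (log-normal) distribution
data = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

# an analyst assumes a normal distribution and fits its parameters to the data
mu_hat, sigma_hat = data.mean(), data.std()

# probability of exceeding a threshold: empirical vs. under the fitted normal
threshold = 10.0
empirical = np.mean(data > threshold)
fitted_normal = 0.5 * erfc((threshold - mu_hat) / (sigma_hat * sqrt(2)))

print(f"P(X > {threshold})  empirical     : {empirical:.5f}")
print(f"P(X > {threshold})  fitted normal : {fitted_normal:.5f}")
```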

Description Uncertainty#

The form of the model, or even the fundamental science behind certain data, might be unknown, hence linking back to nature and closing the cycle. This can encompass model uncertainty and possibly ontological uncertainty. Examples: the causality of effects is unknown, or the mechanistic model for a phenomenon is unknown.


Contributor#

Tamadur Albaraghtheh