Description (DRAFT)
Part of a series: Data-induced Uncertainties.
Uncertainty is described differently in each system, depending on the perspective and the ultimate goal of the model. Data collection, analysis, and measurement methodologies can all be affected by the uncertainties associated with the formulation of the model and with the fundamental phenomenon it describes; these include ontological uncertainty as well as model uncertainty. It is therefore imperative to describe the system goals clearly and to understand the sources and roles of uncertainty within the system, thereby linking back to nature and closing the UQ cycle.
The usual goal when one has data at hand is to turn them into actionable knowledge, either by incorporating a mechanistic model that leads to meaningful interpretations, or by looking for the causal effects that produced the data and investigating them further. But sometimes data, and particularly experimental data, are collected with no model in mind and may also be incomplete. When this happens and the data set is large, we may try data-driven approaches {cite}`geris2016uncertainty` such as singular value decomposition, principal component analysis, or partial least squares regression. These methods ignore the mechanistic aspect of the data but try to identify behaviours observed experimentally, such as a biomarker in a biological dataset. The other tempting case is when we know the cause: we then tend also to look for the mechanism that links the cause to the effect. But when the cause is unknown, it is impossible to build a mechanistic model for the data.
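As an illustration of the data-driven route, the following is a minimal sketch of principal component analysis computed through a singular value decomposition with NumPy. The function name `pca_via_svd`, the synthetic data, and the choice of two components are assumptions made for this example only; they do not come from the cited reference.

```python
import numpy as np


def pca_via_svd(data, n_components):
    """Return (scores, components, explained_ratio) for a data matrix.

    data: array of shape (n_samples, n_features).
    """
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)  # remove the per-feature mean
    # Thin SVD: centered = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]  # principal directions, one per row
    scores = u[:, :n_components] * s[:n_components]  # sample projections
    variance = s**2 / (data.shape[0] - 1)  # variance along each direction
    explained_ratio = variance[:n_components] / variance.sum()
    return scores, components, explained_ratio


# Synthetic data (assumed for illustration): one hidden factor drives
# three measured features, plus a small amount of measurement noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
loadings = np.array([[2.0, -1.0, 0.5]])
data = latent @ loadings + 0.05 * rng.normal(size=(200, 3))

scores, components, explained_ratio = pca_via_svd(data, n_components=2)
print("explained variance ratio:", explained_ratio)
print("first principal direction:", components[0])
```

In this synthetic setting the first component should capture almost all of the variance and align, up to sign, with the hidden loading vector. The method recovers a dominant pattern without any mechanistic model of how the data arose.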
The nicer case is where we have several candidate mechanistic generating processes but are not entirely sure which one is the right one. For example, we may have candidate distributions from which the data could have been sampled. The question is how to make use of such uncertainty. We can, for instance, use Bayesian updating, where the prior is placed over the different possible distributions. This idea is closely related to model comparison analysis. Once one of the candidate models is favoured by the data, we can move on to the other steps of our data analysis.
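The Bayesian updating idea can be sketched as follows, under simplifying assumptions: the candidate distributions have fixed, known parameters, the observations are independent, and the prior over candidates is uniform. The three candidates, the helper names, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def normal_logpdf(x, mean, sd):
    """Log-density of a normal distribution."""
    return -0.5 * np.log(2 * np.pi * sd**2) - (x - mean) ** 2 / (2 * sd**2)


def laplace_logpdf(x, loc, scale):
    """Log-density of a Laplace distribution."""
    return -np.log(2 * scale) - np.abs(x - loc) / scale


def posterior_over_candidates(data, log_pdfs, prior):
    """Posterior probability of each candidate given i.i.d. data.

    posterior_k is proportional to prior_k * prod_i p_k(x_i),
    evaluated in log space for numerical stability.
    """
    log_prior = np.log(np.asarray(prior, dtype=float))
    log_lik = np.array([np.sum(log_pdf(data)) for log_pdf in log_pdfs])
    log_post = log_prior + log_lik
    log_post -= log_post.max()  # avoid underflow before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


# Hypothetical candidate generating distributions with fixed parameters.
candidates = {
    "normal(0, 1)": lambda x: normal_logpdf(x, 0.0, 1.0),
    "normal(2, 1)": lambda x: normal_logpdf(x, 2.0, 1.0),
    "laplace(0, 1)": lambda x: laplace_logpdf(x, 0.0, 1.0),
}

# Synthetic observations (assumed): drawn from the second candidate.
rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=1.0, size=50)

prior = [1 / 3, 1 / 3, 1 / 3]  # no candidate preferred a priori
posterior = posterior_over_candidates(data, list(candidates.values()), prior)

for name, prob in zip(candidates, posterior):
    print(f"{name}: {prob:.3g}")
```

With unknown parameters, each likelihood would instead be a marginal likelihood integrated over a parameter prior, which is the quantity compared in Bayesian model comparison.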
The data can also be corrupted, in the sense that something is missing or has been modified, which can likewise make it difficult to build a mechanistic or causal model.