Markov Processes#
Part of a series: Stochastic fundamentals.
A Markov process is a process without long-range memory. To formalize this concept we use the following definition:
Definition#
A stochastic process \((X_t)_{t \in T}\) has the Markov property if for each \(A \subset \mathbb{R}\) and each sequence of times \(t\geq \tau_1\geq \tau_2\geq \tau_3\geq ...\) we have \(\begin{aligned} P(X_t \in A | X_{\tau_1}=x_1, X_{\tau_2}=x_2, ...)=P(X_t \in A | X_{\tau_1}=x_1) \end{aligned}\)
Interpretation#
The random variable \(X_t\) is determined by its value at time \(\tau_1\). We do not gain further knowledge about \(X_t\) from earlier times \(\tau_2, \tau_3, ...\) . In this sense the process has no long-range memory as the “older” values \(X_{\tau_2},X_{\tau_3},...\) are “forgotten”.
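As a concrete illustration (not part of the original notes), a simple random walk \(X_{t+1}=X_t+\xi_t\) with i.i.d. steps \(\xi_t=\pm 1\) has the Markov property, because the update rule only ever reads the current value:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simple random walk: X_{t+1} = X_t + xi_t with independent steps xi_t = +/-1.
# The update rule reads only the current value X_t, so the older values
# X_{t-1}, X_{t-2}, ... carry no extra information about X_{t+1}.
steps = rng.choice([-1, 1], size=1000)
X = np.concatenate(([0], np.cumsum(steps)))
print(X[:10])
```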
Terminology#
A Markov process is a stochastic process which possesses the Markov property. If it is time-discrete, one generally uses the term Markov
chain.
For time-discrete stochastic processes we can allow for some finite
memory effects by a simple trick:
Let \((X_t)_{t \in \mathbb{N}_0}\) be the original stochastic process and
define the multivariate stochastic process
\(\begin{aligned} Y_t &=\left(X_{t},X_{t-1},X_{t-2},\dots,X_{t-s}\right)^{\top}\\ Y& \colon \Omega\times T \rightarrow \mathbb{R}^{s+1} \end{aligned}\)
Now even if \(X_{t+1}\) depends on \(X_{t},X_{t-1},X_{t-2},\dots,X_{t-s}\) (i.e. has a finite memory of \(s\) steps into the past), \((Y_t)_{t \in \mathbb{N}_0}\) has the Markov property. But although this trick appears to be simple in principle, it can be horribly complicated in practical applications when \(s\) gets large.
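A minimal sketch of this embedding for \(s=1\) (the recursion and its coefficients below are hypothetical, chosen only to illustrate the trick):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical recursion with memory s = 1:
#   X_{t+1} = 0.5*X_t - 0.3*X_{t-1} + noise,
# so X alone is not Markov.  The stacked state Y_t = (X_t, X_{t-1})
# is Markov: the next state Y_{t+1} depends only on the current Y_t.
def step(Y):
    x_t, x_tm1 = Y
    x_next = 0.5 * x_t - 0.3 * x_tm1 + rng.normal()
    return np.array([x_next, x_t])      # Y_{t+1} = (X_{t+1}, X_t)

Y = np.array([0.0, 0.0])
for _ in range(5):
    Y = step(Y)
    print(Y)
```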
The Chapman-Kolmogorov equation#
When discussing Bayes’ theorem we found the following corollary: Let \(B\) be an event and \(A_1,A_2,...,A_s\) be a decomposition of the space of outcomes \(\Omega\) (i.e. we have \(A_j \cap A_k = \emptyset\) for all pairs \((j,k)\) with \(j \neq k\) and \(\cup_{j=1}^s A_j=\Omega\)). Then
\(\begin{aligned} P(B)=\sum_{j=1}^s P(B|A_j)P(A_j) \end{aligned}\)
Now consider a stochastic process and times \(t_1\geq t_2\geq t_3\). For simplicity assume that the process is value-discrete (i.e. assumes only values in \(\mathbb{Z}\)). We then have for the PMFs:
\(\begin{aligned} P(X_{t_1}=x_1)=\sum_{x_2}P(X_{t_1}=x_1|X_{t_2}=x_2)P(X_{t_2}=x_2) \end{aligned}\)
and
\(\begin{aligned} P(X_{t_1}=x_1,X_{t_3}=x_3)&=\sum_{x_2}P(X_{t_1}=x_1,X_{t_3}=x_3|X_{t_2}=x_2)P(X_{t_2}=x_2). \end{aligned}\)
Dividing by \(P(X_{t_3}=x_3)\) and using \(P(X_{t_1}=x_1,X_{t_3}=x_3|X_{t_2}=x_2)=P(X_{t_1}=x_1|X_{t_2}=x_2,X_{t_3}=x_3)\,P(X_{t_3}=x_3|X_{t_2}=x_2)\) gives
\(\begin{aligned} P(X_{t_1}=x_1|X_{t_3}=x_3)&= \sum_{x_2}P(X_{t_1}=x_1|X_{t_2}=x_2,X_{t_3}=x_3)P(X_{t_2}=x_2|X_{t_3}=x_3) \end{aligned}\)
If the stochastic process has the Markov property, we have \(\begin{aligned} P(X_{t_1}=x_1|X_{t_2}=x_2,X_{t_3}=x_3)&=P(X_{t_1}=x_1|X_{t_2}=x_2) \end{aligned}\)
such that
\(\begin{aligned} P(X_{t_1}=x_1|X_{t_3}=x_3)&=\sum_{x_2}P(X_{t_1}=x_1|X_{t_2}=x_2)P(X_{t_2}=x_2|X_{t_3}=x_3) \end{aligned}\) This is the (discrete) Chapman-Kolmogorov equation, which provides the starting point for our further analysis of Markov processes. (Always mind the ordering \(t_1\geq t_2\geq t_3\).)
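For instance, for a hypothetical time-homogeneous two-state chain with one-step probabilities \(P(1|1)=0.9\), \(P(2|1)=0.1\), \(P(1|2)=0.2\), \(P(2|2)=0.8\) (not from the lecture, just to illustrate the bookkeeping), the two-step probability with \(t_1=t_2+1=t_3+2\) follows from the Chapman-Kolmogorov sum as \(\begin{aligned} P(X_{t_1}=1|X_{t_3}=1)&=\sum_{x_2}P(X_{t_1}=1|X_{t_2}=x_2)P(X_{t_2}=x_2|X_{t_3}=1)\\ &=0.9\cdot 0.9 + 0.2\cdot 0.1 = 0.83. \end{aligned}\)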
We always deal with the conditional probabilities in order to study a stochastic process independently of the initial conditions. We clearly have \(\begin{aligned} P(X_{t}=x_t)&=\sum_{x_0}P(X_t=x_t|X_0=x_0)P(X_0=x_0) \end{aligned}\)
Things are most easily understood for a Markov chain (say \(t=0,1,2,...\)) with finite state space. Then
\(\begin{aligned} P(X_t=x_t|X_0=x_0)&=\sum_{x_1,x_2,\dots,x_{t-1}}P(X_t=x_t|X_{t-1}=x_{t-1})\,P(X_{t-1}=x_{t-1}|X_{t-2}=x_{t-2})\cdots P(X_{1}=x_{1}|X_{0}=x_{0}) \end{aligned}\)
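A minimal numerical sketch of this telescoping product, for a hypothetical two-state chain (the matrix below is just an example, not from the lecture):

```python
import numpy as np

# Hypothetical two-state chain; entry T[i, j] = P(X_{t+1} = i | X_t = j),
# so every column sums to one and the distribution evolves as p_{t+1} = T @ p_t.
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])

p = np.array([1.0, 0.0])                 # start with certainty in state 0
for t in range(1, 6):
    p = T @ p                            # one transition-matrix multiplication per step
    print(t, p)

# Equivalently, the t-step transition matrix is the matrix power T^t.
print(np.linalg.matrix_power(T, 5) @ np.array([1.0, 0.0]))
```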
In each time step we thus multiply “transition matrices”. In the rest of the lecture we will mostly deal with value-continuous stochastic processes. For the sake of simplicity we assume that PDFs exist for all quantities of interest and we will extensively use conditional PDFs.
The Markov property then reads \(\begin{aligned} f_{X_t}(x|X_{\tau_1}=x_{\tau_1},X_{\tau_2}=x_{\tau_2},...)=f_{X_t}(x|X_{\tau_1}=x_{\tau_1}) \end{aligned}\) for all sequences of time values with \(t\geq \tau_1\geq \tau_2\geq \tau_3\geq ...\). The Chapman-Kolmogorov equation becomes \(\begin{aligned} f_{X_{t_1}}(x_1|X_{t_3}=x_3)&=\int \textrm{d}x_2 \,\, f_{X_{t_1}}(x_1|X_{t_2}=x_2)\,f_{X_{t_2}}(x_2|X_{t_3}=x_3) \end{aligned}\) This is the starting point of the Master equation, the Fokker-Planck equation, …
Note: Both Gardiner in his book “Stochastic Methods” [Gardiner, 2009] and van Kampen in “Stochastic Processes in Physics and Chemistry” [Van Kampen, 1992] use the simpler notation \(\begin{aligned} p(x_1, t_1| x_3, t_3) &= \int dx_2\, p(x_1, t_1 | x_2, t_2)\, p(x_2, t_2| x_3, t_3) \end{aligned}\) for the Chapman-Kolmogorov equation (\(t_1 \geq t_2 \geq t_3\)).
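As a quick numerical consistency check (not part of the notes), the continuous Chapman-Kolmogorov equation can be verified for the Gaussian transition density of a standard Wiener process, for which both sides are known in closed form:

```python
import numpy as np

# Transition PDF of a standard Wiener process:
# p(x, t | x0, t0) is Gaussian with mean x0 and variance t - t0.
def p(x, t, x0, t0):
    var = t - t0
    return np.exp(-(x - x0) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

t1, t2, t3 = 3.0, 2.0, 1.0               # mind the ordering t1 >= t2 >= t3
x1, x3 = 0.7, -0.2

# Left-hand side: direct transition density from (x3, t3) to (x1, t1).
lhs = p(x1, t1, x3, t3)

# Right-hand side: integrate over the intermediate value x2 at time t2.
x2, dx = np.linspace(-20.0, 20.0, 20001, retstep=True)
rhs = np.sum(p(x1, t1, x2, t2) * p(x2, t2, x3, t3)) * dx

print(lhs, rhs)                          # the two numbers agree up to quadrature error
```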