Markov Processes#

Part of a series: Stochastic fundamentals.

Follow reading here

A Markov process is a process without long-range memory. To formalize this concept we use the following definition:


A stochastic process \((X_t)_{t \in T}\) has the Markov property it for each \(A \subset \mathbb{R}\) and each sequence of times \(t\geq \tau_1\geq \tau_2\geq \tau_3\geq ...\) we have \(\begin{aligned} P(X_t \in A | X_{\tau_1}=x_1, X_{\tau_2}=x_2, ...)=P(X_t \in A | X_{\tau_1}=x_1) \end{aligned}\)


The random variable \(X_t\) is determined by its value at time \(\tau_1\). We do not gain further knowledge about \(X_t\) from earlier times \(\tau_2, \tau_3, ...\) . In this sense the process has no long long-range memory as the “older” values \(X_{\tau_2},X_{\tau_3},...\) are “forgotten”.


A Markov process is a stochastic process which possesses the Markov property. If it is time-discrete, one generally uses the term Markov chain.
For time-discrete stochastic processes we can allow for some finite memory effects by a simple-trick:
Let \((X_t)_{t \in \mathbb{N}_0}\) be the original stochastic process and define the multivariate stochastic process
\(\begin{aligned} Y_t &=\left(X_{t},X_{t-1},X_{t-2},...,X_{t-s}\right)^{\top}\\ Y& \colon \Omega\times T \rightarrow \mathbb{R}^s \end{aligned}\)

Now even if \(X_t\) depends on \(X_{t},X_{t-1},X_{t-2},...,X_{t-s}\) (i.e. has finite memory of steps) \((Y_t)_{t \in \mathbb{N}_0}\) has the Markov property. But although this trick appears to be simple in principle, it can be horribly complicated in practical applications when \(s\) gets larger.

The Chapman-Kolmogorov equation#

When discussing Baye’s theorem we found the following corollary: Let \(B\) be an event and \(A_1,A_2,...,A_s\) be a decomposition of the space of outcomes \(\Omega\) (i.e. we have \(A_j \cap A_u = \emptyset\) for all pairs \((j,k)\) and \(\cup_{j=1}^s A_j=\Omega\)) Then

\(\begin{aligned} P(B)=\sum_{j=1}^s P(B|A_j)P(A_j) \end{aligned}\)

Now consider a stochastic process and times \(t_1\geq t_2\geq t_3\). For simplicity assume that the process is value-discrete (i.e. assumes only values in \(\mathbb{Z}\)). We then have for the PMFs:

\(\begin{aligned} P(X_{t_1}=x_1)=\sum_{x_2}P(X_{t_1}=x_1|X_{t_2}=x_2)P(X_{t_2}=x_2) \end{aligned}\)


\(\begin{aligned} P(X_{t_1}=x_1,X_{t_3}=x_3)&=\sum_{x_2}P(X_{t_1}=x_1,X_{t_3}=x_3|X_{t_2}=x_2)P(X_{t_2}=x_2). \end{aligned}\)

Dividing by \(P(,X_{t_3}=x_3)\)

\(\begin{aligned} P(X_{t_1}=x_1|X_{t_3}=x_3)&= \sum_{x_2}P(X_{t_1}=x_1|X_{t_2}=x_2,X_{t_3}=x_3)P(X_{t_2}=x_2|X_{t_3}=x_) \end{aligned}\)

If the stochastic process has the Markov property, we have \(\begin{aligned} P(X_{t_1}=x_1|X_{t_2}=x_2,X_{t_3}=x_3)&=P(X_{t_1}=x_1|X_{t_2}=x_2) \end{aligned}\)

such that

\(\begin{aligned} P(X_{t_1}=x_1|X_{t_3}=x_3)&=\sum_{x_2}P(X_{t_1}=x_1|X_{t_2}=x_2)P(X_{t_2}=x_2|X_{t_3}=x_3) \end{aligned}\) This is the (discrete) Chapman-Kolmogorov equation, which provides the starting point for our further analysis of the Markov processes. (Always mind the ordering \(t_1\geq t_2\geq t_3\)).

  • We always deal with the conditional probabilities to study a stochastic process independent of the initial conditions. We clearly have \begin{aligned} P(X_{t}=x_t)&=\sum_{x_0}P(X_t=x_t|X_0=x_0)P(X_0=x_0) \end{aligned}

  • Things are most easily understood for a Markov chain (say \(t=0,1,2,...\)) with finite state space. Then

    \(\begin{aligned} P(X_t=x_t|X_0=x_0)&=\sum_{x_1,x_2,...,x_{t-1}}P(X_t=x_t|X_{t-1}=x_{t-1})P(X_{t-1}=x_{t-1}|X_{t-2}=x_{t-2})...\\ & ...\cdot P(X_{1}=x_{1}|X_{0}=x_{0}) \end{aligned}\)

    In each time step we thus multiply “transition matrices”. In the rest of the lecture we will mostly deall with value-continous stachastic processes. For the sake of simplicity we assume that PDFs exist for ass equalities of interest and we will extensively use conditional PDFs.
    The Markov property then reads

    \(\begin{aligned} t_{X_t}(x|X_{\tau_1}=x_{\tau_1},X_{\tau_2}=x_{\tau_2},...)=t_{X_t}(x|X_{\tau_1}=x_{\tau_1}) \end{aligned}\) for all sequences of time values with \(t\geq \tau_1\geq \tau_2\geq \tau_3\geq ...\). The Chapman-Kolmogorov equation becomes \(\begin{aligned} t_{X_{t_1}}(x_1|X_{t_3}=x_3)&=\int \textrm{d}x_2 \,\, t_{X_{t_1}}(x_1|X_{t_2}=x_2)t_{X_{t_2}}(x_2|X_{t_3}=x_3) \end{aligned}\) This is the starting point of the Master equation, the Fokker-Planck equation,…

Note Both Gardiner in his book “Stochastic Methods”[Gardiner, 2009] and von Kampen in “Stochastic Processes in Physics and Chemistry”[Van Kampen, 1992] use the simpler Notation \(\begin{aligned} p(x_1, t_1| x_3, t_3) &= \int dx_2 p(x_1, t_1 | x_2, t_2) p(x_2, t_2| x_3, t_3) \end{aligned}\) for the Chapman-Komogorov equations (\(t_1 \geq t_2 \geq t_3\)).



Crispin W Gardiner. Stochastic Methods: A Handbook for the Natrual and Social Sciences. Springer-Verlag, 2009. ISBN 978-3642089626.


Nicolaas Godfried Van Kampen. Stochastic processes in physics and chemistry. Elsevier, 1992. ISBN 978-0444529657.


Philipp Böttcher, Dirk Witthaut