Chapter 3. Hitting times and Hitting Probabilities
Definitions
Definition 1 (First Hitting Time) Let \(\mathbf{X}=(X_t)_{t\in \mathbb{N}}\) be a time-homogeneous Markov chain on a state space \(S\). For any subset \(A \subseteq S\), the first hitting time (or first passage time) to \(A\) is the random variable \(\tau_A \colon \Omega \to \mathbb{N} \cup \{\infty\}\) defined as \[ \tau_A := \inf \{ t \ge 0 \st X_t \in A \} \] where \(\inf \emptyset = \infty\). If \(A = \{x\}\) for a single state \(x\), we often write \(\tau_x\) instead of \(\tau_{\{x\}}\).
Definition 2 (Hitting Probability) For \(A \subset S\) and \(x \in S\), the hitting probability of \(A\) from \(x\) is defined as \[ h_A(x) := \mathbb{P}_x(\tau_A < \infty) \] This is the probability that the Markov chain, starting from state \(x\), ever visits any state in the set \(A\).
Main results
Theorem 1 The hitting probability \(h_A(x) = \mathbb{P}_x(\tau_A < \infty)\) satisfies the following system of equations in the unknown \(h\) \[ \begin{cases} h(x)=1 & \text{if $ x\in A $} \\ ((I-P)h)(x)=0 & \text{if $x\in A^c$} \end{cases} \tag{1}\]
Furthermore, \(h_A(x)\) is the smallest non-negative solution to these equations. Namely if \(g\ge 0\) solves the system Equation 1, then \(g\ge h_A\).
Finally, \(h_A\) is the unique solution \(g\ge 0\) to Equation 1 such that \[ \lim_{n\to \infty} \mathbb{E}_x[g(X_n) \ind{\tau_A > n}] = 0 \]
Proof. We proceed step by step.
(Boundary condition) If \(x\in A\), then \(\tau_A=0\) \(\mathbb{P}_x\)-a.s., and therefore \(h_A(x)=1\).
(Harmonicity) As \(x\in A^c\), we have \(\{\tau_A<\infty\}=\{ \exists t\ge 0 \st X_t\in A\}= \{ \exists t\ge 1 \st X_t\in A\}\) with \(\mathbb{P}_x\)-probability \(1\). Thus conditioning on the first step of the chain and using the Markov property for the function \(\ind{\{\tau_A<\infty\}}\) \[ \begin{aligned} h_A(x) &= \mathbb{P}_x(\tau_A < \infty) = \sum_{y \in S} \mathbb{P}_x(\tau_A < \infty \mid X_1 = y) \mathbb{P}_x(X_1 = y) \\ & = \sum_{y \in S} \mathbb{P}_y(\tau_A < \infty) p_{x,y} = \sum_{y \in S} p_{x,y} h_A(y) \end{aligned} \]
-
(Minimality) Let \(g \colon S \to [0, \infty)\) be any non-negative function satisfying Equation 1. Let us prove by induction that \[ g(x)=\mathbb{P}_x(\tau_A \le n) + \mathbb{E}_x[g(X_n) \ind{\tau_A > n}] \tag{2}\]
- Take \(n=1\). If \(x\in A\), then the claim reduces to \(1=1+0\). If \(x\not \in A\), then from Equation 1 \[ g(x) = \mathbb{E}_x[g(X_1)]= \mathbb{E}_x[\ind{A}(X_1)]+ \mathbb{E}_x[g(X_1)\ind{A^c}(X_1)] = \mathbb{E}_x[\ind{A^c}(X_0) \ind{A}(X_1)]+ \mathbb{E}_x[g(X_1) \ind{A^c}(X_0) \ind{A^c}(X_1)] \] which is equivalent to the claim by the definition of \(\tau_A\).
- Assume the claim for a given \(n\). Since \(\{\tau_A > n\} \subset \{X_n \not \in A\}\), \(g(X_n)= \mathbb{E}_{X_n}[g(X_1)]\), and reasoning as in the case \(n=1\) \[ \begin{aligned} \mathbb{E}_x[g(X_n) \ind{\tau_A > n}] & = \mathbb{E}_x[\mathbb{E}_{X_n}[g(X_1)] \ind{\tau_A > n}] = \mathbb{E}_x[g(X_{n+1}) \ind{\tau_A > n}] \\ & = \mathbb{E}_x[\ind{X_{n+1}\in A} \ind{\tau_A > n}] + \mathbb{E}_x[g(X_{n+1}) \ind{X_{n+1}\not \in A} \ind{\tau_A > n}] = \mathbb{E}_x[\ind{\tau_A = n+1}] + \mathbb{E}_x[g(X_{n+1})\ind{\tau_A > n+1}] \end{aligned} \] From the inductive hypothesis and the last equality \[ g(x)= \mathbb{P}_x(\tau_A \le n) + \mathbb{E}_x[\ind{\tau_A = n+1}] + \mathbb{E}_x[g(X_{n+1})\ind{\tau_A > n+1}] = \mathbb{P}_x(\tau_A \le n+1 )+ \mathbb{E}_x[g(X_{n+1})\ind{\tau_A > n+1}] \] Equation 2 is thus established. By monotone convergence, \(\lim_n \mathbb{P}_x(\tau_A \le n)= \mathbb{P}_x(\tau_A<\infty )=h_A(x)\). Therefore \[ g(x)= h_A(x)+ \lim_n \mathbb{E}_x[g(X_n) \ind{\tau_A > n}] \ge h_A(x) \]
(Uniqueness) From the last equation, we see that the limit in that equation vanishes iff \(g=h_A\).
Examples
Example 1
In this graph, the arrows represent strictly positive transition probabilities. Consider \(h_{b}(x)\) the probability of hitting \(b\) when starting in \(x\). On \(\{a,b,c\}\) the function \(h_{b}(x)\) is uniquely determined by the linear system Equation 1. On the other hand, on \(\{d,e,f,g\}\) (which is disconnected from \(b\)), \(h_{b}(x)=0\); but \(h(x)=c\) for \(x\in \{d,e,f,g\}\) solves the harmonic equation Equation 1 for any constant \(c\). We see indeed that \(h_b\) is the minimal non-negative solution to the system Equation 1.
Exercise 1 Let \(A,B \subset S\) with \(A \cap B=\emptyset\). Find the equation similar to Equation 1 satisfied by \[ h_{A,B}(x):= \mathbb{P}_x(\tau_A < \tau_B, \tau_{A\cup B}<\infty) \] using two different methods
- Conditioning on the first step, as in the proof of Theorem 1.
- Considering a modified Markov chain with transition probabilities \[ q_{x,y}:= \begin{cases} p_{x,y} & \text{if $x\not \in B $} \\ \ind{x}(y) & \text{if $x\in B$} \end{cases} \]
Exercise 2 We want to give an analytical characterization of the transition probabilities \(q_{x,y}\) of the trace of a recurrent Markov chain. We proved that the trace of \(\mathbf{X}\) on \(A\) has transition probabilities, for \(x,y\in A\) \[ q_{x,y}=\mathbb{P}_x(X_{\tau^+_A}=y) \] Define the operator \(P^{(A^c)}\) with entries \[ p^{(A^c)}_{x,y}= \begin{cases} p_{x,y} & \text{if $y\in A^c$} \\ 0 & \text{if $y\in A$} \end{cases} \] Prove that, for each fixed \(y\in A\), \(q_{x,y}\) is the minimal solution to the problem in the unknown \(h \ge 0\) \[ ((I-P^{(A^c)}) h)(x) = p_{x,y}, \qquad x\in S \tag{3}\] meaning \(q_{x,y} - \sum_{z \in A^c} p_{x,z} q_{z,y} = p_{x,y}\) for all \(x\in S\).
Fix \(y \in A\), and denote \(h(x) \equiv q_{x,y}= \mathbb{P}_x(X_{\tau_A^+} = y)\) for \(x\in S\). By conditioning on the first step: \[ h(x) = \sum_{z \in S} p_{x,z} \mathbb{P}_x(X_{\tau_A^+}=y | X_1=z) = \sum_{z\in A} p_{x,z} \ind{y=z}+\sum_{z\in A^c} p_{x,z} h(z)= p_{x,y}+\sum_{z\in A^c} p_{x,z} h(z) \] To prove minimality, let now \(g \ge 0\) be any non-negative solution to the system Equation 3 for a fixed \(y\in A\) (which we omit from the notation). Define \(h^{(n)}(x):= \mathbb{P}_x(X_{\tau_A^+} = y, \tau_A^+\le n)\). Then \(h^{(n)}(x) \le q_{x,y}\), and by the continuity of probabilities on monotone sequences, \(h^{(n)}(x) \uparrow h(x)\). Therefore it is enough to check that \(g\ge h^{(n)}\) for all \(n\ge 0\). Now we proceed by induction.
- Trivially \(g\ge h^{(0)}\equiv 0\).
- \(h^{(n)}\) satisfies (reasoning as for \(q_{\cdot,y}\) above) \[ h^{(n+1)}(x)=p_{x,y}+\sum_{z\in A^c} p_{x,z} h^{(n)}(z) \] Therefore by the induction hypothesis \[ g(x)= p_{x,y}+(P^{(A^c)}g)(x) \ge p_{x,y}+(P^{(A^c)}h^{(n)})(x) = h^{(n+1)}(x) \]
Exercise 3 (Gambler’s ruin) A person goes to the casino with an initial capital of \(x\) dollars. At each game, they win a dollar with probability \(p\) and lose a dollar with probability \(1-p\). Their strategy is not to leave until they have a capital of \(N\) dollars, or they are broke. Find the probability that the person will be ruined in the game.
Exercise 4 (Birth and Death) We observe a population of bacteria and note each time one of them reproduces (the total population increases by \(1\)) or dies (the total population decreases by \(1\)). The probability of increase/decrease of the population depends on the population size. Say that if the population at a given moment is \(n\ge 1\), the probability that a reproduction happens before a death is \(r_n\) (and a death happens before a reproduction with probability \(1-r_n\)). Find the probability that the bacteria will go extinct, as a function of the sequence \((r_n)_{n\ge 1}\).
Exercise 5 You are walking in the desert, without water and with \(2000\) dollars. You find a magic lamp, and as you take it an evil magic Italian merchant appears, selling bottled water. You think you are safe, but there is a catch. One bottle costs \(10000\) dollars.
The Italian thus offers you the following game: you can bet any amount of money \(b\) on a six-face die roll. If you get 5 or 6 (thus with probability \(p=1/3\)), you win \(b\). Otherwise your money decreases by \(b\). You can play as many rounds as you want, until you run out of money or decide to stop. Your goal is to maximize the probability of buying a bottle of water for the cheap price offered. Compute such an optimal probability.
Hitting probabilities
In the context of generic Markov chains, we can define a stopping time \(\tau \colon \Omega \to \Theta\) as a map such that \(\{\tau \le t\}\) is \(\mathcal{F}_t\)-measurable for all \(t\in \Theta\). One can then define the \(\sigma\)-algebra of the events measurable up to the time \(\tau\): \[ \mathcal{F}_{\tau}:= \left\{ A \in \mathcal{F} \st A \cap \{\tau \le t\} \in \mathcal{F}_t \text{ for all } t \in \Theta \right\} \]
If we take \(\Theta=\mathbb{N}\cup \{\infty\}\) or \(\Theta=[0,\infty]\), one says that the strong Markov Property holds if \[ \mathbb{E}_\mu[F(\theta_\tau \mathbf{X})\mid X_\tau=x, \tau<\infty]= \mathbb{E}_x[F(\mathbf{X})] \] for all bounded measurable maps \(F\colon D(E)\to \mathbb{R}\) and for all stopping times such that \(\mathbb{P}_\mu(\tau<\infty)>0\). Notice that in this general framework, one needs some further assumptions to guarantee that this holds, it is not just a consequence of the Markov property. From which the name strong Markov. In this sense, we proved that a discrete-time (\(\Theta=\mathbb{N}\)) Markov chain on a countable state space (or indeed on Polish spaces with minor adaptations) has the strong Markov property.