Chapter 7.2. Selected Topics: Galton-Watson Trees

\[\newcommand{\ind}[1]{\mathbf{1}_{#1}} \newcommand{\dd}{\mathrm{d}}\]

The Galton-Watson process is a stochastic model that describes the evolution of a population where each individual reproduces independently according to the same probability distribution. The model was first introduced by Irénée-Jules Bienaymé in 1845, and later independently developed by Francis Galton and Henry William Watson in 1874 to study the extinction of family surnames.

The original motivation came from Galton’s interest in genealogical trees, and whether aristocratic families would persist over time. Watson and Galton derived the probability of surname extinction, finding that family names would eventually disappear. The model has since found applications beyond genealogy, including Biology (population dynamics, cell division), Physics (branching processes in nuclear reactions), Computer Science (analysis of tree data structures), and Economics (branching of firms and organizations).

The population size in the next generation depends only on the current population size and the reproduction law, thus the sequence of population sizes forms a Markov chain.

Definitions and Results

Definition 1 (Galton-Watson Process) Let $X_0, X_1, X_2, \ldots$ be a sequence of random variables (representing the population sizes in each generation). We say that $(X_t)_{t \ge 0}$ forms a Galton-Watson process if:

$X_0 = 1$ (we start with a single ancestor).
For each $t \ge 1$, $X_{t+1} = \sum_{n=1}^{X_t} Z_{t,n}$, where $Z_{t,n}$ represents the number of offspring of the $n$-th individual in generation $t$.
The random variables $(Z_{t,n})_{t \ge 0, n \ge 1}$ are i.i.d. on $\mathbb{N}$: $\mathbb{P}(Z_{t,n} = k)=p_k$ with $\sum_{k\ge 0} p_k=1$.

The probability distribution $(p_k)_{k \ge 0}$ is called the offspring distribution of the process.

Remark. The Galton-Watson process is a Markov chain on the state space $S = \mathbb{N}$, where state $0$ is absorbing (once the population goes extinct, it vanishes forever). If $X_t = 0$, then $X_{t+1} = X_{t+2} = \cdots = 0$. Given $X_t=n$, $X_{t+1}$ is a sum of $n$ i.i.d. random variables with distribution $(p_k)$. Therefore, defining $(p^{(n)}_k)$ as the $n$-th convolution of $(p_k)$, we have the transition probabilities for $X_t$: \[ \mathbb{P}(X_{t+1}=k\mid X_t=n)=p^{(n)}_k \] Notice that $(p^{(n)})$ is determined by \[ \begin{aligned} & p^{(1)}_k=p_k \\ & p^{(n+1)}_k= \sum_{h} p^{(n)}_{h}p_{k-h} \end{aligned} \]

In particular it is important to characterize the probability¹ $\zeta=\mathbb{P}(\tau_0<\infty)$, the probability that the process will go extinct. While not particularly useful for the computations, we can remark that for the process to go extinct at time $t$, all the individuals at time $t-1$ must have $0$ offspring: \[ \zeta= \sum_{t=1}^\infty \mathbb{P}(\tau_0=t)=\sum_{t=1}^\infty \sum_{k=0}^\infty \mathbb{P}(\tau_0=t\mid X_{t-1}=k) \mathbb{P}(X_{t-1}=k)= \sum_{t,k=1}^\infty \mathbb{P}(X_{t-1}=k) p_0^k \] The value of $\zeta$ is determined by the values of the offspring distribution $(p_k)$, so we want to understand in particular when $\zeta=1$ and when $\zeta<1$. Define the function \[ \phi(z) := \sum_{k=0}^\infty p_k z^k, \qquad |z|\le 1 \] which is the generating function of the offspring distribution. In particular \[ \phi(0)=p_0, \phi(1)=1, \phi'(1)= \mathbb{E}[Z] = \sum_{k=0}^\infty k p_k \]

Lemma 1 Let $\phi_t(z):=\sum_{k=0}^\infty \mathbb{P}(X_t=k) z^k=\mathbb{E}\left[z^{X_t}\right]$, for $z\in [0,1]$. Then it holds \[ \begin{cases} & \phi_0(z)=z \\ & \phi_1(z)=\phi(z) \\ & \phi_{t+1}(z)= \phi(\phi_t(z)) \end{cases} \] Moreover $\phi_t$ is non-negative, non-decreasing, convex, analytic in $0\le z<1$.

Proof. The identities for $\phi_0$ and $\phi_1$ follow from the definition. Conditioning on $Z_{1}$ we have \[ \begin{aligned} \phi_{t+1}(z) & =\sum_{k=0}^\infty \mathbb{P}(X_{t+1}=k) z^k= \sum_{k=0}^\infty \sum_{j=0}^\infty \mathbb{P}(X_{t+1}=k\mid X_{1}=j) \mathbb{P}(X_1=j) z^k \\ & =\sum_{j=0}^\infty p_j \left(\sum_{k=0}^\infty \mathbb{P}(X_t=k) z^k\right)^j=\phi(\phi_t(z)) \end{aligned} \] Alternatively, using conditional expectation, $\mathbb{E}[z^{X_{t+1}}]=\mathbb{E}[\mathbb{E}[z^{X_{t+1}} \mid X_1]]=\mathbb{E}[\mathbb{E}[z^{X_{t}}]^{X_1}]=\phi(\phi_{t}(z))$ since, conditional on $X_1$, $X_{t+1}$ is the sum of $X_1$ independent Galton-Watson processes.

Being a power-series with non-negative coefficients, $\phi$ is non-negative, non-decreasing, convex, and analytic in $|z|< 1$. Thus its powers $\phi\circ\phi\circ \ldots \circ \phi$ and also non-negative, non-decreasing, convex, analytic.

Theorem 1 Let $(X_t)_{t \ge 0}$ be a Galton-Watson process with offspring distribution having probability generating function $\phi$ and mean $\mu = \mathbb{E}[Z] = \phi'(1)\in [0,\infty]$. Then $\zeta$ is the smallest fixed point of $\phi$, i.e. the smallest solution to the equation \[ z = \phi(z), \qquad z\in [0,1] \] In particular

If $p_0=0$, trivially $\zeta=0$.
If $p_0>0$ and $\mu \le 1$, then the population eventually goes extinct a.s., that is $\zeta=1$.
If $\mu > 1$, then $\zeta<1$ and \[ \mathbb{P}(\lim_{t\to \infty} X_t=\infty) =1-\zeta \]

Proof. If $X_1=k$, there will be extinction iff the $k$ (independent) trees descending from each of the $k$ individuals at time $1$ will go extinct, thus with probability $\zeta^k$. Thus, conditioning on $X_1$ \[ \zeta=\sum_{k=0}^\infty \mathbb{P}(\tau_0<\infty\mid X_1=k) \mathbb{P}(X_1=k)= \sum_{k=0}^\infty p_k \zeta^k=\phi(\zeta) \] Namely $\zeta$ is a fixed point of $\phi$. We need to check that it is the minimal one.

Let $\zeta_t = \mathbb{P}(X_t = 0)=\phi_t(0)$ be the probability of extinction by generation $t$. Then $\zeta_t$ is non-decreasing and $\zeta = \lim_{t \to \infty} \zeta_t$, by the continuity of probability measures on monotone sequences. Let $\xi$ be any other fixed point $\xi=\phi(\xi)$, then since $\phi_t(z)$ is non-decreasing in $z$ \[ \zeta_t=\phi_t(0)\le \phi_t(\xi)=\xi \] Thus taking the limit, $\zeta\le \xi$.

If $p_0=0$ then $\phi(0)=0$ and thus $\zeta=0$ is the minimal fixed point, proving point 1..

Notice that $\xi=1$ is always a fixed point, as $1=\phi(1)$. However, since $\phi$ is nondecreasing and convex, assuming $p_0>0$ implies that there exists at most one other fixed point. Indeed, by elementary calculus:

if $\phi'(1)>1$, there exists exactly one other fixed point $\zeta<1$.
if $\phi'(1)\le 1$, $1$ is the unique fixed point, thus $\zeta=\xi=1$.

We next check that $\mathbb{P}(\lim_{t\to \infty} X_t=\infty)=1-\zeta$. For each $t$ such that $X_t\le k$, there is a probability at least $p_0^k$ of being extinct at the next step. So \[ \mathbb{P}(\tau_0<\infty \mid |\{t\mid X_t\le k\}|=\infty)=1 \] Therefore, on $\{\tau_0=\infty\}$, we have that $X_t\le k$ only for finitely many times a.s., for any $k$, and thus $X_t$ diverges a.s. on $\{\tau_0=\infty\}$.

Extinction probability is the smallest fixed point of the generating function of the offspring. — Figure 2: The extinction probability is the smallest fixed point of $\phi(z)$. Except in trivial cases ($p_1=1$, which gives $\phi(z)\equiv z$ and $\zeta=0$), there are only two possibilities. If $\phi'(1)\le 1$, there is only one such fixed point, $\zeta=1$. If $\phi'(1)>1$, there exist two fixed points: $1$ and $\zeta<1$.

Basic Examples

Example 1 (Binary splitting) Consider a bacterium having probability $p_0\in (0,1)$ to die before replication, probability $p_2\in (0,1)$ to replicate into two exact copies (mitosis), and probability $p_1=1-p_0-p_2$ to undergo a mitosis where one of the two copies dies immediately (so the net offspring has size $1$). The offspring distribution has $p_k=0$ for $k\ge 3$ and $\mu=1-p_0+p_2$. To find the probability $\zeta$ of extinction, we need to solve \[ z = \phi(z) \equiv p_0 + p_1 z + p_2 z^2 \equiv p_0+(1-p_0-p_2) z +p_2 z^2 \] The solutions are $z=1$ and $z=p_0/p_2$. Therefore \[ \zeta= \begin{cases} p_0/p_2 & \text{if $p_0<p_2$} \\ 1 & \text{if $p_0 \ge p_2$} \end{cases} \] and indeed $\zeta=1$ iff $\mu \le 1$.

Example 2 (Poisson offspring distribution) Suppose the number of offspring follows a Poisson distribution with parameter $\lambda > 0$: \[ p_k = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k \in \mathbb{N} \] The probability generating function is: \[ \phi(z) = \sum_{k=0}^\infty z^k \frac{e^{-\lambda} \lambda^k}{k!} = e^{\lambda(z-1)} \]

The mean is $\mu = \lambda$, so:

If $\lambda \le 1$, extinction occurs with probability 1.
If $\lambda > 1$, the extinction probability $\zeta$ is the smallest solution to $e^{\lambda(z-1)}=z$.

Exercise 1 Consider a Galton-Watson process where each individual has a random number of offspring according to the geometric distribution with parameter $p \in (0,1)$, i.e., $p_k = p(1-p)^k$ for $k \ge 0$. Calculate the extinction probability and the expected value of the offspring distribution, in terms of $p$.

Solution

We have $\phi(z) = \sum_{k=0}^\infty p(1-p)^k z^k = \frac{p}{1 - (1-p)z}$ and $\mu = \phi'(1) = (1-p)/p$. Solving $\phi(z)=z$ leads to the roots are $1$ and $\frac{p}{1-p}$. Thus \[ \zeta = \begin{cases} 1 & \text{if $ p \ge 1/2$} \\ \frac{p}{1-p} & \text{if $p < 1/2$} \end{cases} \] consistently with the value of $\mu$.

Exercise 2 Consider a random walk $Y_t$ on $\mathbb{Z}$ with $p_{x,x+1}=1-p_{x,x-1}=p \in (0,1/2]$, conditioned to start positive, that is we assume $Y_0=0$, $Y_1=+1$ for the sake of simplicity. Let $\tau_0^+$ be, as usual, the time of the first return to $0$, so it is finite a.s. since $p\le 1/2$, and define \[ X_n:=| \{0\le t < \tau_0^+ \mid Y_{t}=n, Y_{t+1}=n+1 \}| \] as the number of times that $Y_t$ crosses the edge $(n,n+1)$, before returning to $0$.

Prove that $(X_n)$ is a Galton-Watson process, find the offspring distribution and the extinction probability $\zeta$.

Solution

Fix $n\ge 1$ and let $\sigma_l$ be the time of the $l$-th crossing of $(n,n+1)$, thus $Y_{\sigma_l-1}=n$ and $Y_{\sigma_l}=n+1$. The stopping time \[ \xi_l:=\inf \{t>\sigma_l : Y_t=n\} \in \mathbb{N} \] is the first return to level $n$ after the $k$-th crossing. The number of crossings of $(n+1,n+2)$ that occur in the open interval $(\sigma_k,\xi_k)$ is \[ Z_{n,l}:=|\{\sigma_l <t<\xi_l : Y_t=n+1, Y_{t+1}=n+2\}| \] By the strong Markov property, the $Z_{n,l}$ are independent and thus the conditions of Definition 1 hold.

We interpret the crossing of $(n, n+1)$ as a “parent”. Starting at $n+1$, the random walk must either step to $n+2$ (creating an offspring) or step to $n$ (dying). Since $p_{x,x+1}=p$, the probability of stepping to $n+2$ before $n$ is simply $p$ (a single step right), and the probability of stepping to $n$ before $n+2$ is $1-p$ (a single step left). If the walk hits $n+2$, it returns to $n+1$ a.s., allowing for further “offspring” attempts. Thus, $Z_{n,l}$ follows a geometric distribution with success parameter $p$: \[ \mathbb{P}(Z_{n,l} = k) = p^k (1-p), \quad k \ge 0 \] We thus have $\phi(z)=(1-p)/(1-pz)$, $\mu=\phi'(1)=p/(1-p)\le 1$ since $p\le 1/2$. Therefore $\zeta=1$ for all $p\le 1/2$ (if $p>1/2$ then the argument above is not correct, since the return to $0$ from $Y_1=1$ is not a.s.).

Moments and Asymptotic Behavior

We next examine the long-time behavior, in particular in the supercritical case $\mu>1$.

Proposition 1 Let $(X_t)_{t \ge 0}$ be a Galton-Watson process with offspring distribution $(p_k)$.

For $\mu = \mathbb{E}[Z]=\sum_k k p_k$, it holds $\mathbb{E}[X_t] = \mu^t$ for $t \ge 0$.
Assuming $\sigma^2 := \operatorname{Var}(Z) = \sum_k (k-\mu)^2 p_k <\infty$, it holds for $t \ge 0$ \[ \operatorname{Var}(X_t)= \begin{cases} \sigma^2 \mu^{t-1} \frac{\mu^t - 1}{\mu - 1} & \text{if $\mu \neq 1$} \\ t\sigma^2 & \text{if $\mu = 1$} \end{cases} \]

Proof.

$\mathbb{E}[X_t]=\phi_t'(1)$. So by induction, using Lemma 1, $\phi_1'(1)=\phi'(1)=\mu$. $\phi_{t+1}'(1)= \phi'(\phi_{t}(1)) \phi_{t}'(1)$. But $\phi_{t}(1)=1$ and $\phi_{t}'(1)=\mu^t$ by the induction hypothesis.
For the variance, we reason similarly using $\operatorname{Var}(X_t)= \mathbb{E}[X_t^2]-\mathbb{E}[X_t]^2 = \phi_t''(1)+\phi_t'(1)-\phi_t'(1)^2$, and reasoning by induction.

Theorem 2 There exists a random variable $M\ge 0$ with $\mathbb{E}[M]<\infty$ and $\mathbb{P}(M=0)\ge \zeta$, such that the limit \[ \lim_{t\to \infty} X_t \mu^{-t}=M \] holds a.s.

Proof. By the definition of $X_t$ \[ \mathbb{E}[X_t \mid (X_{t-1},X_{t-2},\ldots,X_0)]= \mu X_{t-1} \] Therefore $M_t:=X_t \mu^{-t}$ is a non-negative martingale. By Doob’s convergence theorem, the limit $M$ of $M_t$ exists a.s. and is integrable. Whenever $\tau_0<\infty$, $X_t$ vanishes eventually, so $M=0$ on $\{\tau_0<\infty\}$.

Remark. The previous theorem establishes that $X_t \mu^{-t}$ stabilizes a.s. (possibly at $0$) as $t\to \infty$: \[ X_t= \mu^{-t}(M+o_t(1)) \] as $t\to \infty$. This asymptotic behavior can be visualized through numerical simulatio: each time we sample a Galton-Watson process, we get a possibly (if $\mu>1$) different limit $M$ (i.e. the limit $M$ is random). So on each simulation $\omega$, we build the random variable $M(\omega)$, just taking the limit. If we want to estimate the distribution of $M$, we can just estimate numerically $\mathbb{P}(M\in A)$ using a Monte-Carlo method: we count the fraction of times (for $T$ large enough), $X_T/\mu^T$ fell into the set $A$.

The numerical validity of Theorem 2 is immediate to grasp².

#| '!! shinylive warning !!': |
#|   shinylive does not work in self-contained HTML documents.
#|   Please set `embed-resources: false` in your metadata.
#| standalone: true
#| components: [viewer, editor]
#| viewerHeight: 800
#| editorHeight: 120
#| layout: vertical
#| id: galtonwatson

from shiny import App, render, ui, reactive, req
import matplotlib.pyplot as plt
import numpy as np
import theme

# ========= 1. UI DEFINITION =========
app_ui = theme.layout(
    # 1. Sidebar definition (First argument)
    ui.sidebar(
        ui.h4("Simulation Controls"),
        ui.input_slider(
            "n_sims",
            "Number of Samples:",
            min=10, max=100, value=50, step=10
        ),
        ui.input_slider(
            "n_gen",
            "Generations (N):",
            min=5, max=80, value=40, step=5
        ),
        ui.input_select(
            "dist",
            "Offspring Distribution:",
            choices=["Poisson", "Binomial", "Geometric"]
        ),
        ui.panel_conditional(
            "input.dist === 'Poisson'",
            ui.input_slider(
                "poisson_lambda",
                "Expectation λ:",
                min=0.1, max=3.0, value=1.5, step=0.1
            )
        ),
        ui.panel_conditional(
            "input.dist === 'Binomial'",
            ui.input_slider(
                "binom_n",
                "Number of trials n:",
                min=1, max=10, value=2, step=1
            ),
            ui.input_slider(
                "binom_p",
                "Success probability p:",
                min=0.0, max=1.0, value=0.5, step=0.01
            )
        ),
        ui.panel_conditional(
            "input.dist === 'Geometric'",
            ui.input_slider(
                "geom_p",
                "Success probability p:",
                min=0.3, max=1.0, value=0.5, step=0.02
            )
        ),
        ui.input_action_button("run", "Run Simulation", class_="btn-success"),
        title="Galton-Watson Process"
    ),

    # 2. Main Content
    ui.output_ui("stats_display"),
    ui.output_plot("gw_plot", height="500px"),
    
    title="Galton-Watson Process Convergence"
)

# ========= 2. SERVER LOGIC =========
def server(input, output, session):
    theme.setup_plot_style()
    sim_results = reactive.Value(None)
    
    def get_dist_params():
        """Extract parameters and compute the expectation."""
        dist = input.dist()
        if dist == "Poisson":
            lam = input.poisson_lambda()
            return {"type": "Poisson", "lam": lam, "mu": lam}
        elif dist == "Binomial":
            n = input.binom_n()
            p = input.binom_p()
            return {"type": "Binomial", "n": n, "p": p, "mu": n * p}
        else:  # Geometric
            p = input.geom_p()
            # Mean of Geometric defined on {0, 1, ...} is (1-p)/p
            mu = (1.0 - p) / p
            return {"type": "Geometric", "p": p, "mu": mu}

    def calc_theoretical_extinction(params):
        """
        Calculate theoretical extinction probability by solving s = G(s).
        Uses Newton-Raphson for Poisson/Binomial and exact formula for Geometric.
        """
        mu = params['mu']
        if mu <= 1.0:
            return 1.0
        
        # Analytic solution for Geometric distribution
        if params['type'] == "Geometric":
            return 1.0 / mu
        
        # Newton-Raphson for other distributions: f(s) = G(s) - s = 0
        s = 0.5 # Initial guess
        for _ in range(20):
            if params['type'] == "Poisson":
                lam = params['lam']
                gs = np.exp(lam * (s - 1))
                g_prime = lam * gs
            elif params['type'] == "Binomial":
                n, p = params['n'], params['p']
                base = 1 - p + p * s
                gs = base ** n
                g_prime = n * p * (base ** (n - 1))
            else:
                return 1.0 
            
            f_val = gs - s
            f_prime = g_prime - 1
            
            if abs(f_val) < 1e-7:
                break
                
            s_new = s - f_val / f_prime
            s = s_new
            
        return max(0.0, min(1.0, s))

    @reactive.Effect
    @reactive.event(input.run, ignore_init=True)
    def run_simulation():
        n_sims = input.n_sims()
        n_gen = input.n_gen()
        params = get_dist_params()
        
        # Initialize Population Matrix (using float64)
        X = np.zeros((n_sims, n_gen + 1), dtype=np.float64)
        X[:, 0] = 1.0
        
        current_pop = np.ones(n_sims, dtype=np.float64)
        
        # Thresholds to switch from discrete sampling to Normal approximation
        BINOM_THRESHOLD = 2000000000  # ~2e9
        GEOM_THRESHOLD = 1e14
        
        # Safety limit for float precision to stop drawing lines
        SAFE_FLOAT_LIMIT = 1e250 
        
        try:
            for g in range(1, n_gen + 1):
                # Mask: only simulate for active populations (positive and finite)
                active_mask = (current_pop > 0) & np.isfinite(current_pop)
                if not np.any(active_mask):
                    break
                
                active_pop = current_pop[active_mask]
                new_counts = np.zeros_like(active_pop)
                
                if params["type"] == "Poisson":
                    lam_vec = active_pop * params["lam"]
                    new_counts = np.random.poisson(lam_vec)
                    
                elif params["type"] == "Binomial":
                    n_total = active_pop * params["n"]
                    p = params["p"]
                    
                    large_mask = n_total > BINOM_THRESHOLD
                    small_mask = ~large_mask
                    
                    if np.any(small_mask):
                        n_small = n_total[small_mask].astype(np.int32)
                        new_counts[small_mask] = np.random.binomial(n_small, p)
                        
                    if np.any(large_mask):
                        n_large = n_total[large_mask]
                        mu_large = n_large * p
                        std_large = np.sqrt(n_large * p * (1 - p))
                        sample = np.random.normal(mu_large, std_large)
                        # Ensure positivity after Normal approximation
                        new_counts[large_mask] = np.maximum(0.0, np.round(sample))
                    
                else: # Geometric
                    large_mask = active_pop > GEOM_THRESHOLD
                    small_mask = ~large_mask
                    
                    if np.any(small_mask):
                        n_small = active_pop[small_mask]
                        new_counts[small_mask] = np.random.negative_binomial(n_small, params["p"])
                    
                    if np.any(large_mask):
                        k = active_pop[large_mask]
                        p = params["p"]
                        mu_geom = k * (1.0 - p) / p
                        var_geom = k * (1.0 - p) / (p * p)
                        std_geom = np.sqrt(var_geom)
                        sample = np.random.normal(mu_geom, std_geom)
                        # Ensure positivity after Normal approximation
                        new_counts[large_mask] = np.maximum(0.0, np.round(sample))

                # Global positivity enforcement for safety
                new_counts = np.maximum(0.0, new_counts)
                
                # Check for numerical danger zone
                overflow_mask = new_counts > SAFE_FLOAT_LIMIT
                if np.any(overflow_mask):
                    new_counts[overflow_mask] = np.nan
                
                current_pop[active_mask] = new_counts
                X[:, g] = current_pop
            
            # Compute Martingale M_n = X_n / mu^n
            mu = params["mu"]
            if mu == 0:
                M_n = np.zeros_like(X)
                M_n[:, 0] = 1
            else:
                scalings = mu ** np.arange(n_gen + 1)
                M_n = X / scalings
            
            # Compute Statistics (ignoring NaNs from overflowed lines)
            # Extinction: check last valid value or check if 0 present
            # We assume NaN means "exploded", so not extinct.
            final_vals = X[:, -1]
            extinct_sim = np.sum(final_vals == 0) / n_sims
            extinct_theo = calc_theoretical_extinction(params)
            
            sim_results.set({
                "M_n": M_n,
                "extinction_prob": extinct_sim,
                "extinction_theo": extinct_theo,
                "mean_final_M": np.nanmean(M_n[:, -1]),
                "std_final_M": np.nanstd(M_n[:, -1]),
                "params": params,
                "n_gen": n_gen,
                "n_sims": n_sims
            })
            
        except Exception as e:
            sim_results.set({"error": str(e)})

    @output
    @render.ui
    def stats_display():
        results = sim_results.get()
        if not results:
            return ui.p("Click 'Run Simulation' to see results.", class_="text-center")
        
        if "error" in results:
            return ui.div(ui.h4("Error"), ui.p(str(results['error'])), class_="alert alert-danger")
        
        p = results["params"]
        
        return ui.card(
            ui.card_header(ui.h4("Simulation Results", class_="m-0")),
            ui.row(
                ui.column(4,
                    ui.h5("Configuration", class_="border-bottom pb-2"),
                    ui.tags.table(
                        ui.tags.tr(ui.tags.td(ui.strong("Type:")), ui.tags.td(p['type'])),
                        ui.tags.tr(ui.tags.td(ui.strong("Expectation (μ):")), ui.tags.td(f"{p['mu']:.3f}")),
                        class_="table table-sm table-borderless mb-0"
                    )
                ),
                ui.column(4,
                    ui.h5("Extinction Prob.", class_="border-bottom pb-2"),
                    ui.tags.table(
                        ui.tags.tr(ui.tags.td(ui.strong("Theoretical:")), ui.tags.td(f"{results['extinction_theo']:.4f}")),
                        ui.tags.tr(ui.tags.td(ui.strong("Simulated:")), ui.tags.td(f"{results['extinction_prob']:.3f}")),
                        class_="table table-sm table-borderless mb-0"
                    )
                ),
                ui.column(4,
                    ui.h5(f"Martingale (M_{results['n_gen']})", class_="border-bottom pb-2"),
                    ui.tags.table(
                        ui.tags.tr(ui.tags.td(ui.strong("Empirical mean:")), ui.tags.td(f"{results['mean_final_M']:.3f}")),
                        ui.tags.tr(ui.tags.td(ui.strong("Empirical std:")), ui.tags.td(f"{results['std_final_M']:.3f}")),
                        class_="table table-sm table-borderless mb-0"
                    )
                ),
                class_="align-items-start"
            )
        )
    
    @output
    @render.plot
    def gw_plot():
        results = sim_results.get()
        req(results)
        if "error" in results: return

        M_n = results["M_n"]
        n_gen = results["n_gen"]
        n_sims = results["n_sims"]
        params = results["params"]
        
        fig, ax = plt.subplots(figsize=(10, 6))
        
        line_width = 1.0 if n_sims > 50 else 1.5
        line_alpha = 0.5 if n_sims > 50 else 0.7
        
        x_axis = np.arange(n_gen + 1)
        # Matplotlib automatically stops drawing line segments containing NaN
        ax.plot(x_axis, M_n.T, alpha=line_alpha, linewidth=line_width)
        
        ax.axhline(y=1.0, color='#e74c3c', linestyle='--', linewidth=1.0, label='E[M_n] = 1')
        
        ax.set_xlabel("Generation (n)", fontsize=11)
        ax.set_ylabel(r"$M_n = X_n / \mu^n$", fontsize=11)
        
        crit_type = "Supercritical" if params['mu'] > 1 else "Critical" if params['mu'] == 1 else "Subcritical"
        ax.set_title(f"Convergence of {params['type']} Process ({crit_type}, μ={params['mu']:.2f})", fontsize=13)
        
        ax.legend(loc='upper right', frameon=True)
        ax.grid(True, linestyle=':', alpha=0.6)
        
        if params['mu'] > 1:
            # Robust scaling using nanpercentile to ignore overflowed paths
            final_vals = M_n[:, -1]
            if np.any(np.isfinite(final_vals)):
                y_max = np.nanpercentile(final_vals, 98) * 1.2
                if y_max > 0:
                    ax.set_ylim(0, y_max)
                
        return fig

app = App(app_ui, server)


## file: theme.py
from shiny import ui
import matplotlib.pyplot as plt

def layout(sidebar, *args, **kwargs):
    """
    Standard layout wrapper for all Shinylive apps.
    
    Args:
        sidebar: A ui.sidebar() object.
        *args: Main content elements.
        **kwargs: Additional arguments for ui.page_sidebar (e.g., title, fillable).
    """
    return ui.page_sidebar(
        sidebar,
        # Standard CSS across all apps
        ui.tags.style("""
            :root {
                --bs-body-font-family: "Inter", sans-serif;
                --bs-font-monospace: "Source Code Pro", monospace;
            }
            body { 
                font-family: var(--bs-body-font-family) !important; 
            }
            .bslib-sidebar-layout .sidebar-content { 
                font-size: 14px; 
            }
            /* Ensure plots blend in if needed */
            .shiny-plot-output {
                background-color: transparent;
            }
        """),
        *args,
        **kwargs
    )

def setup_plot_style():
    """
    Configures Matplotlib to match the web theme (Source Sans 3).
    """
    # Attempt to use Inter if available, else fallback to sans-serif
    plt.rcParams.update({
        "font.family": "sans-serif",
        "font.sans-serif": ["Inter", "Source Sans 3", "DejaVu Sans", "sans-serif"],
        "axes.titlesize": 13,
        "axes.labelsize": 11,
        "xtick.labelsize": 10,
        "ytick.labelsize": 10,
        "figure.figsize": (8, 5),
        "axes.grid": True,
        "grid.alpha": 0.3,
        "grid.linestyle": ":",
        # Transparent backgrounds to match web theme
        "figure.facecolor": "none",
        "axes.facecolor": "none",
        "savefig.facecolor": "none"
    })

When $\mu = 1$ (the critical case) and extinction occurs with probability $1$, one can study the behavior of the process conditioned on non-extinction. This result can be proved analytically using Lemma 1.

Theorem 3 (Conditional Limit Theorem) Consider a critical Galton-Watson process ($\mu= 1$, $\sigma^2 < \infty$). Then: \[ \mathbb{P}(X_t > 0) \sim \frac{2}{\sigma^2 t} \quad \text{as $t \to \infty$} \]

Furthermore, conditional on $\{\tau_0 >t\}$, the normalized population size $X_t/t$ converges in distribution to an exponential distribution: \[ \left(X_t/t \mid \tau_0>t\right) \xrightarrow{d} \mathrm{exp}\left(\frac{2}{\sigma^2}\right) \]

Extensions and Modern Applications

Generalizations

Galton-Watson processes serve as fundamental models with numerous generalizations, e.g.:

Multi-type Galton-Watson processes: Each individual can be one of several types, with type-dependent offspring distributions. The process is now a vector-valued Markov chain.
Continuous-time branching processes: The reproduction occurs at random times according to a Poisson process, leading to continuous-time Markov chains.
Spatial processes: Each individual is assigned a location on space, and the offspring distribution depends on the location.

While the historical roots of the Galton-Watson process are in genealogy, the model describes effectively any system where “items” produce “copies” of themselves independently.

Epidemiology and Viral Spreading

In modern epidemiology, the offspring distribution represents the number of secondary infections caused by a single infected individual. The mean $\mu$ corresponds to the basic reproduction number, $R_0$. Unlike standard SIR models which assume deterministic mixing, branching processes capture the stochasticity of early outbreaks. They are particularly useful for modeling “super-spreading” events, where the offspring distribution $(p_k)$ is heavy-tailed (e.g., $p_0$ is high, but $p_{k}$ for large $k$ is non-negligible). This helps calculate the probability that a virus introduced into a new population will die out naturally or cause an epidemic.

Polymerase Chain Reaction (PCR)

PCR is a molecular biology technique used to amplify a single copy of a DNA segment into millions of copies. In an ideal scenario, this is a binary Galton-Watson process with $p_2=1$ (deterministic doubling). In reality, the replication efficiency is $p < 1$. The process is modeled as a branching process where each DNA strand replicates with probability $p$ or fails with probability $1-p$. Branching process theory is used to estimate the variance of the final copy number, which is crucial for quantitative PCR (qPCR) accuracy. The basic Galton-Watson model must be refined to account for two physical realities:

Saturation: The branching cannot grow exponentially forever. As the population $X_t$ approaches the carrying capacity $K$ (depletion of reagents), the reproduction mean $\mu(X_t)$ drops below $1$.
Parameter Uncertainty: The replication efficiency $p$ is not known exactly, and may depend on the equipment conditions.

We model this by placing a prior distribution on the efficiency, for example $p \sim \mathrm{Beta}(\alpha, \beta)$, and assuming a density-dependent offspring distribution. For a population size $X_t$, the effective efficiency becomes $p_{eff}(X_t) = p \cdot (1 - X_t/K)$.

If we run $T$ cycles, for a successful detection, we need that the final amount of strands $X_T$ passes a certain value $\bar{x}$. An important question is: How many cycles $T$ must we run to ensure detection? In other words, we are interested in the hitting time $\tau := \inf \{t : X_t \ge \bar{x}\}$. Even if at larger values of $X_t$ a more deterministic behavior is in place (due to the law of large numbers), the early branching phase and the uncertainty in $p$ contribute to the variance of the random variable $\tau$.

If $\pi=\mathrm{Beta}(\alpha, \beta)$ is the prior on $p$, the probability of failure (false negative) at cycle $T$ is: \[ \mathbb{P}(\text{False Negative}) = \int_0^1 \mathbb{P}(\tau > T \mid p) \pi(p) dp \] As an example, we can we estimate the Cumulative Distribution Function (CDF) of $\tau$. We sample $p$ according to $\pi$, and for each sampled value we run a simulation of the Galton-Watson process with offspring distribution depending on $X_t$. For a fixed accepted risk of false negative $\varepsilon>0$, we then select the run-time $T^*$ such that $\mathbb{P}(\tau \le T^*) \ge 1 - \epsilon$ (e.g., 99%).

Note that the variance in the “exponential phase” (seen as the horizontal spread of the curves) directly translates to the uncertainty in the quantitative estimate. This horizontal shift corresponds to the random variable $M$ in the martingale limit $X_t \approx M \mu^t$. In the saturation regime, determining $M$ allows us to “back-calculate” the initial viral load.

It is meaningful here to plot make a logaritmic plot:

As long as $X_t$ is small compared to the carrier capacity $X_t \ll K$, we are in a classical Galton-Watson regime (offspring distribution constant at each generation). Thus we have \[ \log X_t \simeq \log M + t \log \mu \] We see this clearly in the log plot below, where we see over different simulations the “width” of $\log M$, while (after the very first iterations), a linear (for $\log X_t$, exponential for $X_t$) growth takes place with a coefficient independent of the sample.
As $X_t$ grows, we see the growth plateauing since we get closer to saturation and the offspring distribution loses efficiency.

Ideally, the prior distribution (the value of the parameters $\alpha$ and $\beta$) is provided by the manufacturer of the replication equipment. We can see the expected behavior emerge quite clearly in the Galton-Watson simulation. Actually we get values very similar to those used in laboratories for detection of low viral charges (as we are starting with just one)³.

#| '!! shinylive warning !!': |
#|   shinylive does not work in self-contained HTML documents.
#|   Please set `embed-resources: false` in your metadata.
#| standalone: true
#| components: [viewer]
#| viewerHeight: 1080
#| editorHeight: 120
#| layout: vertical
#| id: pcr-simulation

from shiny import App, render, ui, reactive, req
import matplotlib.pyplot as plt
import numpy as np
import theme

# ========= 1. UI DEFINITION =========
app_ui = theme.layout(
    ui.sidebar(
        ui.h4("PCR Parameters"),
        ui.input_slider("n_sims", "Number of Runs:", 50, 500, 100, step=50),
        
        ui.h5("Efficiency Prior", class_="mt-3"),
        ui.input_slider("alpha", "Alpha (Successes):", 1, 100, 40),
        ui.input_slider("beta", "Beta (Failures):", 1, 50, 7),
        ui.output_plot("prior_plot", height="150px"),
        
        ui.h5("Reaction Limits", class_="mt-3"),
        ui.input_slider("log_capacity", "Carrying Capacity (10^K):", 8, 12, 9, step=0.5),
        ui.input_slider("log_threshold", "Detection Threshold (10^x):", 4, 8, 7, step=0.5),
        
        ui.input_action_button("run", "Simulate PCR", class_="btn-primary w-100 mt-3"),
        title="PCR Cycle Optimizer"
    ),

    ui.card(
        ui.card_header(ui.h4("Simulation Results")),
        ui.output_plot("trace_plot", height="400px"),
        ui.output_plot("cdf_plot", height="250px"),
        ui.output_ui("recommendation_ui")
    )
)

# ========= 2. SERVER LOGIC =========
def server(input, output, session):
    theme.setup_plot_style()
    
    # Store simulation results
    results = reactive.Value(None)
    
    @output
    @render.plot
    def prior_plot():
        """Small helper plot to visualize the Beta prior in sidebar"""
        a, b = input.alpha(), input.beta()
        x = np.linspace(0, 1, 200)
        
        # The Beta PDF is: x^(a-1) * (1-x)^(b-1) / B(a,b)
        # It is not needed to compute B(a,b) since the box is small
        # showing scaling is not important
        # B(a,b) = gamma(a)*gamma(b) / gamma(a+b)        
        # beta_constant = (np.gamma(a) * np.gamma(b)) / np.gamma(a + b)

        # Calculate PDF manually
        # Add epsilon to x to avoid 0^negative errors, if sliders allowed <1
        y = (x**(a - 1) * (1 - x)**(b - 1))
        # / beta_constant
        
        fig, ax = plt.subplots(figsize=(4, 1.5))
        ax.plot(x, y, color='blue', lw=2)
        ax.fill_between(x, y, alpha=0.2, color='blue')
        ax.set_title(f"Prior E[p] = {a/(a+b):.2f}", fontsize=10)
        ax.set_yticks([])
        ax.set_xlim(0, 1)
        # Transparent background for sidebar integration
        fig.patch.set_alpha(0)
        ax.patch.set_alpha(0)
        plt.tight_layout()
        return fig

    @reactive.Effect
    @reactive.event(input.run, ignore_init=False) # Run once on startup
    def run_simulation():
        # 1. Setup Parameters
        N = input.n_sims()
        a, b = input.alpha(), input.beta()
        K = 10 ** input.log_capacity()
        threshold = 10 ** input.log_threshold()
        T_max = 45
        
        # 2. Vectorized Simulation
        # Sample unique efficiency 'p' for each run from Prior
        p_base = np.random.beta(a, b, N)
        
        # Initialize population matrix (Sims x Time)
        X = np.zeros((N, T_max))
        X[:, 0] = 1.0
        
        current_pop = np.ones(N)
        
        # Thresholds for approximations
        BINOM_CUTOFF = 1000
        
        for t in range(T_max - 1):
            # Calculate density-dependent efficiency
            # Clip at 0 to prevent negative efficiency if overshot
            p_eff = p_base * (1.0 - current_pop / K)
            p_eff = np.maximum(0.0, p_eff)
            
            # Update step
            new_copies = np.zeros(N)
            
            # A. Small numbers: Use Binomial (exact branching)
            mask_small = current_pop < BINOM_CUTOFF
            if np.any(mask_small):
                n_small = current_pop[mask_small].astype(int)
                p_small = p_eff[mask_small]
                new_copies[mask_small] = np.random.binomial(n_small, p_small)
                
            # B. Large numbers: Use Normal Approximation (faster)
            mask_large = ~mask_small
            if np.any(mask_large):
                n_large = current_pop[mask_large]
                p_large = p_eff[mask_large]
                mu = n_large * p_large
                std = np.sqrt(n_large * p_large * (1.0 - p_large))
                new_copies[mask_large] = np.random.normal(mu, std)
            
            new_copies = np.maximum(0.0, new_copies)
            current_pop += new_copies
            X[:, t+1] = current_pop
            
        # 3. Analyze Hitting Times
        has_crossed = X >= threshold
        
        # Handle cases that never crossed
        did_cross = np.any(has_crossed, axis=1)
        # argmax returns first True index
        crossing_times = np.argmax(has_crossed, axis=1).astype(float)
        crossing_times[~did_cross] = np.nan
        
        results.set({
            "X": X,
            "times": crossing_times,
            "threshold": threshold,
            "K": K,
            "did_cross": did_cross
        })

    @output
    @render.plot
    def trace_plot():
        res = results.get()
        req(res)
        
        X = res["X"]
        threshold = res["threshold"]
        K = res["K"]
        
        fig, ax = plt.subplots(figsize=(8, 5))
        
        # Plot a subset of traces (max 100 to avoid clutter)
        n_plot = min(100, X.shape[0])
        ax.plot(X[:n_plot].T, color='blue', alpha=0.15, lw=1)
        
        # Plot Threshold and Capacity
        ax.axhline(threshold, color='red', linestyle='--', linewidth=2, label='Detection Threshold')
        ax.axhline(K, color='green', linestyle=':', linewidth=2, label='Saturation (K)')
        
        ax.set_yscale('log')
        ax.set_ylim(0.8, K * 2) 
        ax.set_ylabel("DNA Copies (Log Scale)")
        ax.set_xlabel("Cycle Number")
        ax.set_title("Stochastic PCR Amplification Traces")
        ax.legend(loc='upper left')
        ax.grid(True, which="both", ls="-", alpha=0.2)
        
        return fig

    @output
    @render.plot
    def cdf_plot():
        res = results.get()
        req(res)
        
        times = res["times"]
        did_cross = res["did_cross"]
        valid_times = times[did_cross]
        
        fig, ax = plt.subplots(figsize=(8, 3))
        
        if len(valid_times) == 0:
            ax.text(0.5, 0.5, "No detection occurred", ha='center')
            return fig
            
        # Empirical CDF
        sorted_times = np.sort(valid_times)
        y_vals = np.arange(1, len(sorted_times) + 1) / len(times) # Normalize by TOTAL simulations
        
        ax.step(sorted_times, y_vals, where='post', color='darkorange', lw=2)
        
        # Add 99% line
        target_prob = 0.99
        idx = np.searchsorted(y_vals, target_prob)
        
        if idx < len(sorted_times):
            t_safe = sorted_times[idx]
            ax.axvline(t_safe, color='black', linestyle='--')
            ax.axhline(target_prob, color='black', linestyle=':')
            ax.text(t_safe + 0.5, 0.5, f"T* = {int(t_safe)} cycles\n(99% Confidence)", 
                   verticalalignment='center')
        
        ax.set_ylabel("Detection Prob.")
        ax.set_xlabel("Cycle Number")
        ax.set_ylim(0, 1.05)
        ax.set_xlim(left=0)
        ax.grid(True, alpha=0.3)
        ax.set_title("CDF of Detection Time τ")
        
        plt.tight_layout()
        return fig

    @output
    @render.ui
    def recommendation_ui():
        res = results.get()
        req(res)
        
        times = res["times"]
        did_cross = res["did_cross"]
        
        if np.sum(did_cross) == 0:
            return ui.div("Signal never reached threshold.", class_="alert alert-warning")
            
        # Calculate T* for 99%
        sorted_times = np.sort(times[did_cross])
        n_total = len(times)
        
        # We need the 99th percentile of the WHOLE distribution
        idx_99 = int(n_total * 0.99)
        
        if idx_99 >= len(sorted_times):
            msg = f"Only {len(sorted_times)/n_total:.1%} of runs succeeded. Cannot guarantee 99% detection."
            cls = "alert alert-danger"
        else:
            t_star = int(sorted_times[idx_99])
            msg = ui.markdown(f"**Recommendation:** Run protocol for **{t_star} cycles** to ensure < 1% false negative rate.")
            cls = "alert alert-success"
            
        return ui.div(msg, class_=cls)

app = App(app_ui, server)


## file: theme.py
from shiny import ui
import matplotlib.pyplot as plt

def layout(sidebar, *args, **kwargs):
    """
    Standard layout wrapper for all Shinylive apps.
    
    Args:
        sidebar: A ui.sidebar() object.
        *args: Main content elements.
        **kwargs: Additional arguments for ui.page_sidebar (e.g., title, fillable).
    """
    return ui.page_sidebar(
        sidebar,
        # Standard CSS across all apps
        ui.tags.style("""
            :root {
                --bs-body-font-family: "Inter", sans-serif;
                --bs-font-monospace: "Source Code Pro", monospace;
            }
            body { 
                font-family: var(--bs-body-font-family) !important; 
            }
            .bslib-sidebar-layout .sidebar-content { 
                font-size: 14px; 
            }
            /* Ensure plots blend in if needed */
            .shiny-plot-output {
                background-color: transparent;
            }
        """),
        *args,
        **kwargs
    )

def setup_plot_style():
    """
    Configures Matplotlib to match the web theme (Source Sans 3).
    """
    # Attempt to use Inter if available, else fallback to sans-serif
    plt.rcParams.update({
        "font.family": "sans-serif",
        "font.sans-serif": ["Inter", "Source Sans 3", "DejaVu Sans", "sans-serif"],
        "axes.titlesize": 13,
        "axes.labelsize": 11,
        "xtick.labelsize": 10,
        "ytick.labelsize": 10,
        "figure.figsize": (8, 5),
        "axes.grid": True,
        "grid.alpha": 0.3,
        "grid.linestyle": ":",
        # Transparent backgrounds to match web theme
        "figure.facecolor": "none",
        "axes.facecolor": "none",
        "savefig.facecolor": "none"
    })

Algorithmic Analysis

In computer science, Galton-Watson processes appear in the probabilistic analysis of algorithms and data structures. For instance, the height of random binary search trees or the recursion depth of the Quicksort algorithm can be analyzed by embedding these discrete structures into continuous-time branching processes. The “offspring” here correspond to the recursive sub-problems generated by a partition step.

Neutron Transport Theory

Although an older application, it remains vital for nuclear safety. In a nuclear reactor, a neutron colliding with a nucleus may induce fission, releasing more neutrons. This is a Galton-Watson process where $\mu$ is the criticality factor.

$\mu < 1$: Subcritical (reaction dies out).
$\mu = 1$: Critical (steady power output).
$\mu > 1$: Supercritical (power increases exponentially, potentially dangerous). Safety calculations rely on computing the probability of extinction (fission chain termination) versus exponential runaway.

Footnotes

As usual, hereafter $\tau_0=\inf\{t \ge 0: X_t = 0\}$ is the hitting time of $0$.↩︎
Feel free to edit the code to include different sampling methods, resizing the editor.↩︎
Feel free to edit the code resizing the editor box below.↩︎