Cramér–Rao bound and Fisher information

Wiki card — concept node for exp-families-stability.

(i) Formal statement

For a parametric family \(\{p_\theta : \theta \in \Theta\}\) with \(p_\theta(x) = Z_\theta^{-1} e^{-\theta^\top\varphi(x)}\), the Fisher information matrix (parameter Fisher info) is

\[ I(\theta) \;:=\; -\mathbb{E}_\theta[\nabla^2_\theta \log p_\theta(X)] \;=\; \nabla^2_\theta \log Z_\theta \;=\; \mathrm{Cov}_\theta(\varphi(X)), \tag{FIM} \]

the log-partition Hessian / covariance of the sufficient statistic.
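
A minimal numerical sketch of (FIM), assuming a hypothetical toy family on a finite state space with scalar statistic \(\varphi(x) = x^2/2\) (the state set, `theta`, and the step size `h` are illustrative choices, not part of the card). It compares a finite-difference second derivative of \(\log Z_\theta\) with the variance of \(\varphi\) under \(p_\theta\), using the card's sign convention \(p_\theta \propto e^{-\theta\varphi}\).

```python
import math

# Hypothetical toy state space; any finite set would do.
STATES = [0.0, 1.0, 2.0, 3.0]


def phi(x):
    """Scalar sufficient statistic, matching the Gaussian example's phi(x) = x^2/2."""
    return x * x / 2.0


def log_Z(theta):
    """Log-partition function for p_theta(x) proportional to exp(-theta * phi(x))."""
    return math.log(sum(math.exp(-theta * phi(x)) for x in STATES))


def var_phi(theta):
    """Variance of phi(X) under p_theta, computed by direct summation."""
    weights = [math.exp(-theta * phi(x)) for x in STATES]
    Z = sum(weights)
    mean = sum(w * phi(x) for w, x in zip(weights, STATES)) / Z
    second = sum(w * phi(x) ** 2 for w, x in zip(weights, STATES)) / Z
    return second - mean ** 2


def hessian_log_Z(theta, h=1e-4):
    """Central second difference of log Z (scalar parameter, so Hessian = second derivative)."""
    return (log_Z(theta + h) - 2.0 * log_Z(theta) + log_Z(theta - h)) / (h * h)


theta = 0.7
fim_from_hessian = hessian_log_Z(theta)
fim_from_cov = var_phi(theta)
print(fim_from_hessian, fim_from_cov)  # the two values should agree closely
```

The two printed numbers should agree up to finite-difference error; the sign of \(\theta\) in the exponent does not matter for the second derivative.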

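Cramér–Rao bound. Under the usual regularity conditions (differentiation under the integral sign, \(I(\theta)\) invertible), any unbiased estimator \(\hat\theta(X)\) satisfies \(\mathrm{Cov}(\hat\theta) \succeq I(\theta)^{-1}\). Exponential families saturate it: \(\varphi(X)\) attains the (reparametrized) bound exactly as an unbiased estimator of the mean parameter \(\mathbb{E}_\theta[\varphi(X)]\), and the MLE of \(\theta\) attains it asymptotically.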

Location Fisher information. For a density \(p\) on \(\mathbb{R}^d\), \[ J(X) \;:=\; \int p(x)\, \|\nabla \log p(x)\|^2\, dx, \tag{J} \] the Fisher information with respect to a translation parameter. Distinct from (FIM) in general, but intimately linked on exponential families: if \(\varphi\) is affine equivariant (e.g. \(\varphi(x) = x\) or \((x, xx^\top)\)), location Fisher factors through parameter Fisher.

(ii) Role in Q1 / Q2 / Q3

(iii) References

(iv) Worked miniature — FIM and location Fisher info on Gaussians

Take \(p_\theta = \mathcal{N}(0, 1/\theta)\) on \(\mathbb{R}\), with \(\theta > 0\). So \(p_\theta(x) \propto e^{-\theta x^2/2}\), \(\varphi(x) = x^2/2\), \(\log Z_\theta = -\tfrac12 \log\theta + \tfrac12\log(2\pi)\).

Parameter Fisher. \(\nabla_\theta^2 \log Z_\theta = 1/(2\theta^2)\), so \(I(\theta) = 1/(2\theta^2)\). Equivalently, \(\mathrm{Var}_\theta(x^2/2) = \tfrac14 \mathrm{Var}_\theta(x^2) = \tfrac14 \cdot 2 \sigma^4 = \tfrac14 \cdot 2/\theta^2 = 1/(2\theta^2)\). ✓

Location Fisher. \(\nabla_x \log p = -\theta x\), so \(J = \mathbb{E}_\theta[\theta^2 X^2] = \theta^2 \cdot 1/\theta = \theta\).
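
Both closed forms for this Gaussian can be checked by quadrature. The sketch below is only illustrative: the midpoint rule, the integration window of 12 standard deviations, and the value `theta = 2.5` are assumptions chosen for convenience.

```python
import math


def gaussian_pdf(x, theta):
    """Density of N(0, 1/theta): precision theta, variance 1/theta."""
    return math.sqrt(theta / (2.0 * math.pi)) * math.exp(-theta * x * x / 2.0)


def expect(f, theta, half_width=12.0, n=20_000):
    """Midpoint-rule approximation of E_theta[f(X)] over +/- half_width standard deviations."""
    L = half_width / math.sqrt(theta)
    dx = 2.0 * L / n
    total = 0.0
    for k in range(n):
        x = -L + (k + 0.5) * dx
        total += f(x) * gaussian_pdf(x, theta)
    return total * dx


theta = 2.5

# Parameter Fisher: variance of the sufficient statistic phi(x) = x^2/2.
m1 = expect(lambda x: x * x / 2.0, theta)
m2 = expect(lambda x: (x * x / 2.0) ** 2, theta)
param_fisher = m2 - m1 ** 2

# Location Fisher: E[(d/dx log p)^2] with score -theta * x.
location_fisher = expect(lambda x: (theta * x) ** 2, theta)

print(param_fisher, 1.0 / (2.0 * theta ** 2))  # compare with 1/(2 theta^2)
print(location_fisher, theta)                  # compare with theta
```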

Heat flow on \(J\). Under \(X_t = X + \sqrt{t}\, Z\), \(X_t \sim \mathcal{N}(0, 1/\theta + t)\), so \(J(X_t) = 1/(1/\theta + t) = \theta/(1 + t\theta)\).

De Bruijn check: \(\tfrac{d}{dt} h(X_t) = \tfrac12 J(X_t)\). Here \(h(X_t) = \tfrac12 \log(2\pi e (1/\theta + t))\), so \(\tfrac{d}{dt} h = \tfrac{1}{2(1/\theta + t)} = \tfrac12 J(X_t)\). ✓
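
The de Bruijn identity for this Gaussian heat flow can also be verified numerically without using the closed-form entropy. In the sketch below, the entropy and the location Fisher information of \(X_t \sim \mathcal{N}(0, 1/\theta + t)\) are computed by quadrature, and the time derivative is a central difference; the values of `theta`, `t`, `dt` and the quadrature settings are assumptions for illustration.

```python
import math


def density(x, var):
    """Density of N(0, var)."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def integrate(f, var, half_width=12.0, n=20_000):
    """Midpoint rule over +/- half_width standard deviations."""
    L = half_width * math.sqrt(var)
    dx = 2.0 * L / n
    return sum(f(-L + (k + 0.5) * dx) for k in range(n)) * dx


def entropy(var):
    """Differential entropy h = -integral of p log p, by quadrature."""
    def integrand(x):
        p = density(x, var)
        return -p * math.log(p) if p > 0.0 else 0.0
    return integrate(integrand, var)


def location_fisher(var):
    """J = integral of p * (score)^2, with score -x/var for N(0, var)."""
    return integrate(lambda x: density(x, var) * (x / var) ** 2, var)


theta, t, dt = 2.0, 0.3, 1e-4
var_t = 1.0 / theta + t  # variance of X_t = X + sqrt(t) Z

# Central difference in t of the entropy along the heat flow.
dh_dt = (entropy(var_t + dt) - entropy(var_t - dt)) / (2.0 * dt)
half_J = 0.5 * location_fisher(var_t)

print(dh_dt, half_J)  # both should be close to 1 / (2 (1/theta + t))
```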

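Cramér–Rao in action. The MLE of \(\theta\) from \(n\) iid samples is \(\hat\theta = n/\sum X_i^2\). It is biased at finite \(n\) (\(\mathbb{E}[\hat\theta] = n\theta/(n-2)\) for \(n > 2\)), but asymptotically \(n\,\mathrm{Var}(\hat\theta) \to 1/I(\theta) = 2\theta^2\), the classic result: the bound is saturated as \(n \to \infty\).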
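
A small Monte Carlo sketch of this comparison, under assumed settings (`theta = 2.0`, `n = 200`, 4000 trials, fixed seed). Since the MLE is biased at finite \(n\), the ratio of its empirical variance to \(2\theta^2/n\) is expected to sit slightly above 1 and approach 1 only as \(n\) grows.

```python
import random
import statistics


def mle_precision(samples):
    """MLE of the precision theta for N(0, 1/theta): n / sum of x_i^2."""
    return len(samples) / sum(x * x for x in samples)


def simulate(theta, n, trials, seed=0):
    """Return the empirical mean and variance of the MLE over repeated datasets."""
    rng = random.Random(seed)
    sigma = theta ** -0.5  # standard deviation of N(0, 1/theta)
    estimates = [
        mle_precision([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.mean(estimates), statistics.variance(estimates)


theta, n, trials = 2.0, 200, 4000
mean_hat, var_hat = simulate(theta, n, trials)

cr_bound = 2.0 * theta ** 2 / n  # 1 / (n I(theta)) with I(theta) = 1/(2 theta^2)
ratio = var_hat / cr_bound

print(mean_hat, var_hat, cr_bound, ratio)
```

Increasing `n` (at the cost of run time) should pull `ratio` toward 1, which is the asymptotic saturation the text refers to.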

Link to Q3 / stability. Along the heat flow with \(\varphi(x) = x^2/2\), the parameter flows as \(\theta_t = \theta_0 / (1 + t\theta_0)\), i.e. \(1/\theta_t = 1/\theta_0 + t\). In the mean coordinate \(\eta = \mathbb{E}_\theta[\varphi(X)] = 1/(2\theta)\) this is exactly the linear flow \(\eta_t = \eta_0 + t/2\) (FIM Riemannian structure pushed to \(\eta\)-affine); in the dual coordinate \(\theta\) it is the Riccati equation \(\dot\theta = -\theta^2\). FIM is the hinge between the dynamical (HJ, Riccati) and informational (entropy, Fisher) pictures.
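
The Riccati claim can be sanity-checked by integrating \(\dot\theta = -\theta^2\) numerically and comparing with the closed form. The sketch uses a hand-written classical RK4 stepper; the initial value, horizon, and step count are illustrative assumptions.

```python
def riccati_rhs(theta):
    """Right-hand side of the scalar Riccati equation d(theta)/dt = -theta^2."""
    return -theta * theta


def rk4(theta0, t_end, steps):
    """Integrate the Riccati equation from 0 to t_end with classical RK4."""
    h = t_end / steps
    theta = theta0
    for _ in range(steps):
        k1 = riccati_rhs(theta)
        k2 = riccati_rhs(theta + 0.5 * h * k1)
        k3 = riccati_rhs(theta + 0.5 * h * k2)
        k4 = riccati_rhs(theta + h * k3)
        theta += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return theta


theta0, t_end = 3.0, 2.0
theta_num = rk4(theta0, t_end, steps=2000)
theta_closed = theta0 / (1.0 + t_end * theta0)  # heat-flow formula from the text

# In the reciprocal coordinate the flow is linear: 1/theta_t = 1/theta_0 + t.
inv_num = 1.0 / theta_num
inv_linear = 1.0 / theta0 + t_end

print(theta_num, theta_closed)
print(inv_num, inv_linear)
```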