Cramér–Rao bound and Fisher information

Wiki card — concept node for exp-families-stability.

(i) Formal statement

For a parametric family $\{p_\theta : \theta \in \Theta\}$ with $p_\theta(x) = Z_\theta^{-1} e^{-\theta^\top\varphi(x)}$, the Fisher information matrix (parameter Fisher info) is

\[ I(\theta) \;:=\; -\mathbb{E}_\theta[\nabla^2_\theta \log p_\theta(X)] \;=\; \nabla^2_\theta \log Z_\theta \;=\; \mathrm{Cov}_\theta(\varphi(X)), \tag{FIM} \]

the log-partition Hessian / covariance of the sufficient statistic.
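
The chain of equalities in (FIM) follows in two steps from the sign convention $p_\theta \propto e^{-\theta^\top\varphi}$. A sketch, assuming enough regularity to differentiate $Z_\theta = \int e^{-\theta^\top\varphi(x)}\,dx$ under the integral sign:

\[ \begin{aligned}
\log p_\theta(x) &= -\theta^\top \varphi(x) - \log Z_\theta
  &&\Longrightarrow\quad -\nabla^2_\theta \log p_\theta(x) = \nabla^2_\theta \log Z_\theta \quad \text{(constant in } x\text{)}, \\
\nabla_\theta \log Z_\theta &= -\mathbb{E}_\theta[\varphi(X)]
  &&\Longrightarrow\quad \nabla^2_\theta \log Z_\theta
     = \mathbb{E}_\theta[\varphi\varphi^\top] - \mathbb{E}_\theta[\varphi]\,\mathbb{E}_\theta[\varphi]^\top
     = \mathrm{Cov}_\theta(\varphi(X)).
\end{aligned} \]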

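Cramér–Rao bound. For any unbiased estimator $\hat\theta(X)$, $\mathrm{Cov}_\theta(\hat\theta) \succeq I(\theta)^{-1}$; with $n$ iid samples the bound becomes $(n I(\theta))^{-1}$. In exponential families the MLE saturates the bound asymptotically as $n \to \infty$.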

Location Fisher information. For a density $p$ on $\mathbb{R}^d$, \[ J(X) \;:=\; \int p(x)\, \|\nabla \log p(x)\|^2\, dx, \tag{J} \] the Fisher information with respect to a translation parameter. Distinct from (FIM) in general, but intimately linked on exponential families: if $\varphi$ is affine equivariant (e.g. $\varphi(x) = x$ or $(x, xx^\top)$), location Fisher factors through parameter Fisher.

(ii) Role in Q1 / Q2 / Q3

Q1 / Q3 (analytic identity). De Bruijn's identity (dB), $\tfrac{d}{dt}h(X_t) = \tfrac12 J(X_t)$, gives $J$ a dynamical meaning along the heat flow. Stability requires $J(X_t)$ to remain expressible in $\theta_t$; Cramér–Rao / FIM relates this to the log-partition curvature $\nabla^2 \log Z_{\theta_t}$.
Q2. For Gaussian $\mathcal{N}(0, \Sigma)$ with precision $K = \Sigma^{-1}$, $I(K) = \tfrac12 (K^{-1}\otimes K^{-1})$ (with respect to the vectorised precision), and $J(X) = \mathrm{tr}(K) = \mathrm{tr}(\Sigma^{-1})$. Along the flow $\Sigma_t = \Sigma_0 + t I$: $J(X_t) = \mathrm{tr}((\Sigma_0 + t I)^{-1})$, which is expressible in $\theta_t$, a direct verification of Q2 stability via (dB).
Information-geometric link. FIM is the Riemannian metric on the statistical manifold $\mathcal{E}_\varphi$ (Amari–Nagaoka §3). Stability of $\mathcal{E}_\varphi$ under heat flow corresponds to the heat-flow vector field being tangential in this Fisher metric.
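
The Q2 claim can be checked numerically in $d = 2$. The following is a minimal sketch, not a definitive implementation: the helper names (`inv2`, `det2`, `sigma_t`, `location_fisher`, `entropy`, `de_bruijn_gap`) and the matrix `SIGMA0` are illustrative choices that do not come from the card. It uses the Gaussian entropy $h = \tfrac12 \log((2\pi e)^d \det\Sigma)$ and compares a finite-difference derivative of $h(X_t)$ against $\tfrac12\,\mathrm{tr}((\Sigma_0 + tI)^{-1})$.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

def sigma_t(sigma0, t):
    """Heat-flow covariance Sigma_t = Sigma_0 + t I."""
    (a, b), (c, d) = sigma0
    return ((a + t, b), (c, d + t))

def location_fisher(sigma):
    """J(X) = tr(Sigma^{-1}) for X ~ N(0, Sigma)."""
    k = inv2(sigma)
    return k[0][0] + k[1][1]

def entropy(sigma):
    """Differential entropy of N(0, Sigma) for d = 2."""
    return 0.5 * math.log((2 * math.pi * math.e) ** 2 * det2(sigma))

def de_bruijn_gap(sigma0, t, eps=1e-5):
    """|d/dt h(X_t) - J(X_t)/2|, with d/dt taken by central differences."""
    h_plus = entropy(sigma_t(sigma0, t + eps))
    h_minus = entropy(sigma_t(sigma0, t - eps))
    dh_dt = (h_plus - h_minus) / (2 * eps)
    return abs(dh_dt - 0.5 * location_fisher(sigma_t(sigma0, t)))

# Arbitrary positive-definite example (an assumption, not from the card).
SIGMA0 = ((2.0, 0.5), (0.5, 1.0))
gaps = [de_bruijn_gap(SIGMA0, t) for t in (0.1, 1.0, 5.0)]
```

Each entry of `gaps` should be at the level of finite-difference error, which is what (dB) predicts when $J(X_t)$ is computed from $\Sigma_t$ alone.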

(iii) References

Brown (1986) — Fundamentals of Statistical Exponential Families. Canonical treatment of FIM for exp-families, convex duality with the Legendre transform of $\log Z$. doi:10.1214/lnms/1215466757.
Barndorff-Nielsen (1978) — Information and Exponential Families in Statistical Theory. Steepness, convex duality. doi:10.1002/9781118857281.
Amari, Nagaoka (2000) — Methods of Information Geometry. FIM as Riemannian metric.
Dembo, Cover, Thomas (1991) — Information-Theoretic Inequalities. FIM, Cramér–Rao, Stam, EPI. doi:10.1109/18.104312.

(iv) Worked miniature — FIM and location Fisher info on Gaussians

Take $p_\theta = \mathcal{N}(0, 1/\theta)$ on $\mathbb{R}$, with $\theta > 0$. So $p_\theta(x) \propto e^{-\theta x^2/2}$, $\varphi(x) = x^2/2$, $\log Z_\theta = -\tfrac12 \log\theta + \tfrac12\log(2\pi)$.

Parameter Fisher. $\nabla_\theta^2 \log Z_\theta = 1/(2\theta^2)$, so $I(\theta) = 1/(2\theta^2)$. Equivalently, writing $\sigma^2 = 1/\theta$, $\mathrm{Var}_\theta(X^2/2) = \tfrac14 \mathrm{Var}_\theta(X^2) = \tfrac14 \cdot 2 \sigma^4 = \tfrac14 \cdot 2/\theta^2 = 1/(2\theta^2)$. ✓
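
Both routes to $I(\theta)$ can be reproduced numerically. This is a hedged sketch: `log_z`, `fisher_from_log_z`, `var_phi`, and the test value `THETA` are names and values introduced here for illustration. It takes a second central difference of the closed-form $\log Z_\theta$ and, separately, integrates $\mathrm{Var}_\theta(X^2/2)$ by trapezoidal quadrature on a truncated grid.

```python
import math

def log_z(theta):
    """log Z_theta = -0.5 log(theta) + 0.5 log(2 pi) for p proportional to exp(-theta x^2 / 2)."""
    return -0.5 * math.log(theta) + 0.5 * math.log(2 * math.pi)

def fisher_from_log_z(theta, eps=1e-4):
    """Second central difference of log Z, approximating its second derivative."""
    return (log_z(theta + eps) - 2 * log_z(theta) + log_z(theta - eps)) / eps**2

def var_phi(theta, half_width=12.0, n=200_001):
    """Var(X^2 / 2) under N(0, 1/theta), by trapezoidal quadrature on [-12 sd, 12 sd]."""
    sd = 1 / math.sqrt(theta)
    lo, hi = -half_width * sd, half_width * sd
    h = (hi - lo) / (n - 1)
    m1 = m2 = 0.0
    for i in range(n):
        x = lo + i * h
        w = h * (0.5 if i in (0, n - 1) else 1.0)   # trapezoid end-point weights
        p = math.sqrt(theta / (2 * math.pi)) * math.exp(-theta * x * x / 2)
        phi = x * x / 2
        m1 += w * p * phi
        m2 += w * p * phi * phi
    return m2 - m1 * m1

THETA = 1.7                        # arbitrary test value (assumption)
closed_form = 1 / (2 * THETA**2)   # I(theta) = 1 / (2 theta^2)
```

Comparing `fisher_from_log_z(THETA)` and `var_phi(THETA)` with `closed_form` exercises the Hessian and covariance forms of (FIM) independently.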

Location Fisher. $\nabla_x \log p = -\theta x$, so $J = \mathbb{E}_\theta[\theta^2 X^2] = \theta^2 \cdot 1/\theta = \theta$.

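Heat flow on $J$. Under $X_t = X + \sqrt{t}\, Z$ with $Z \sim \mathcal{N}(0, 1)$ independent of $X$ (a standard Gaussian, not the partition function $Z_\theta$), $X_t \sim \mathcal{N}(0, 1/\theta + t)$, so $J(X_t) = 1/(1/\theta + t) = \theta/(1 + t\theta)$.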

De Bruijn check: $\tfrac{d}{dt} h(X_t) = \tfrac12 J(X_t)$. Here $h(X_t) = \tfrac12 \log(2\pi e (1/\theta + t))$, so $\tfrac{d}{dt} h = \tfrac{1}{2(1/\theta + t)} = \tfrac12 J(X_t)$. ✓
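
The same check can be run from the density alone, without using the closed forms for $J$ and $h$. A minimal sketch under stated assumptions: the function names, the quadrature grid, and the values `THETA`, `T`, `EPS` are illustrative. $J$ is integrated from a finite-difference score, $h$ from $-\int p \log p$, and $\tfrac{d}{dt}h$ is taken by central differences in the variance $1/\theta + t$.

```python
import math

def density(x, var):
    """N(0, var) density."""
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def grid(var, half_width=10.0, n=20_001):
    """Uniform grid on [-10 sd, 10 sd]; returns the points and the step."""
    sd = math.sqrt(var)
    h = 2 * half_width * sd / (n - 1)
    return [-half_width * sd + i * h for i in range(n)], h

def location_fisher(var, dx=1e-5):
    """J = integral of p * (d/dx log p)^2, score by central differences."""
    xs, h = grid(var)
    total = 0.0
    for x in xs:
        score = (math.log(density(x + dx, var))
                 - math.log(density(x - dx, var))) / (2 * dx)
        total += h * density(x, var) * score * score
    return total

def entropy(var):
    """h = -integral of p log p, by the same Riemann-sum quadrature."""
    xs, h = grid(var)
    return -sum(h * density(x, var) * math.log(density(x, var)) for x in xs)

THETA, T, EPS = 2.0, 0.75, 1e-4   # arbitrary test values (assumptions)
var_t = 1 / THETA + T             # X_t ~ N(0, 1/theta + t)
j_t = location_fisher(var_t)
# Increasing t by EPS increases the variance by EPS, so this is d/dt h(X_t).
dh_dt = (entropy(var_t + EPS) - entropy(var_t - EPS)) / (2 * EPS)
```

With these values, `j_t` should match $\theta/(1 + t\theta)$ and `dh_dt` should match `0.5 * j_t`, up to quadrature and finite-difference error.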

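Cramér–Rao in action. The MLE of $\theta$ from $n$ iid samples is $\hat\theta = n/\sum X_i^2$, with $n\,\mathrm{Var}(\hat\theta) \to 1/I(\theta) = 2\theta^2$ as $n \to \infty$: the classic result, saturating the bound asymptotically (at finite $n$ the MLE is biased, $\mathbb{E}[\hat\theta] = n\theta/(n-2)$).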
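
A small Monte Carlo experiment illustrates the asymptotic statement. This is a sketch, not a benchmark: `mle_precision`, `simulate`, the seed, and the values of `THETA`, `N`, `REPS` are assumptions made for illustration. The finite-$n$ variance used for comparison follows from $\hat\theta = n\theta/\chi^2_n$ and the inverse chi-squared moments (valid for $n > 4$).

```python
import random
import statistics

def mle_precision(samples):
    """MLE of theta for N(0, 1/theta): n / sum(x_i^2)."""
    return len(samples) / sum(x * x for x in samples)

def simulate(theta, n, reps, seed=0):
    """Draw `reps` independent data sets of size n and return the MLEs."""
    rng = random.Random(seed)
    sd = 1 / theta ** 0.5
    return [mle_precision([rng.gauss(0.0, sd) for _ in range(n)])
            for _ in range(reps)]

THETA, N, REPS = 1.5, 200, 4000   # arbitrary settings (assumptions)
estimates = simulate(THETA, N, REPS)

# n * Var(theta_hat): should approach 1 / I(theta) = 2 theta^2 as n grows.
scaled_var = N * statistics.variance(estimates)
cr_limit = 2 * THETA ** 2

# Exact finite-n value of n * Var(theta_hat), from Var(1 / chi2_n)
# = 2 / ((n - 2)^2 (n - 4)); requires n > 4.
exact = 2 * N ** 3 * THETA ** 2 / ((N - 2) ** 2 * (N - 4))
```

For moderate $n$ the exact value sits slightly above $2\theta^2$, consistent with the bound being reached only in the limit.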

Link to Q3 / stability. Along the heat flow with $\varphi(x) = x^2/2$, the parameter flows as $\theta_t = \theta_0 / (1 + t\theta_0)$, i.e. $1/\theta_t = 1/\theta_0 + t$. This is exactly the linear flow in the $\eta$-coordinate (FIM Riemannian structure pushed to $\eta$-affine): the mean $\mathbb{E}_{\theta_t}[\varphi(X)] = 1/(2\theta_t)$ grows linearly in $t$. In the dual $\theta$-coordinate the same flow is the Riccati equation $\dot\theta = -\theta^2$. FIM is the hinge between the dynamical (HJ, Riccati) and informational (entropy, Fisher) pictures.
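
The Riccati picture can be checked by integrating $\dot\theta = -\theta^2$ directly. A minimal sketch, assuming a hand-rolled classical Runge-Kutta integrator (`rk4`) and arbitrary values `THETA0`, `T_END`; the convention $\eta = \mathbb{E}_\theta[X^2/2] = 1/(2\theta)$ used for the mean coordinate is also an assumption of this example.

```python
def rk4(f, y0, t_end, steps):
    """Classical 4th-order Runge-Kutta for the scalar autonomous ODE y' = f(y)."""
    y, h = y0, t_end / steps
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

THETA0, T_END = 3.0, 2.0   # arbitrary initial precision and horizon (assumptions)

# Integrate the Riccati equation theta' = -theta^2 from theta(0) = THETA0.
theta_numeric = rk4(lambda th: -th * th, THETA0, T_END, steps=2000)

# Closed form from the heat flow: theta_t = theta_0 / (1 + t theta_0).
theta_closed = THETA0 / (1 + T_END * THETA0)

# Mean coordinate eta = 1 / (2 theta): the flow should add exactly t / 2.
eta0, eta_t = 1 / (2 * THETA0), 1 / (2 * theta_numeric)
```

Agreement of `theta_numeric` with `theta_closed`, together with `eta_t - eta0` equalling `T_END / 2`, shows the nonlinear Riccati flow in $\theta$ and the linear flow in $\eta$ describing the same trajectory.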