Amari \(m\)-geodesic and dual flatness

Wiki card — concept node for exp-families-stability.

(i) Formal statement

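Let \(\mathcal{P}\) be a manifold of probability densities. Amari–Nagaoka define two dual affine connections on \(\mathcal{P}\): the exponential connection \(\nabla^{(e)}\) and the mixture connection \(\nabla^{(m)}\), dual to each other with respect to the Fisher metric. The \(m\)-geodesic between \(p_0\) and \(p_1\) is the mixture \((1-s)\,p_0 + s\,p_1\), \(s \in [0,1]\); the \(e\)-geodesic is the normalised log-linear interpolation \(\propto p_0^{1-s} p_1^{s}\).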

An exponential family \(\mathcal{E}_\varphi = \{p_\theta \propto e^{-\theta^\top \varphi}\}\) is \(e\)-flat: its canonical parameters \(\theta\) are \(e\)-affine coordinates. A family \(\mathcal{E} \subset \mathcal{P}\) is totally \(m\)-geodesic in \(\mathcal{P}\) if, for every pair \(p_0, p_1 \in \mathcal{E}\), the \(m\)-geodesic between them stays in \(\mathcal{E}\).

Pythagorean theorem. For \(p \in \mathcal{E}_\varphi\) (\(e\)-flat), \(q \in \mathcal{M}\) (\(m\)-flat), and \(r\) the foot of the \(\nabla^{(m)}\)-projection of \(q\) onto \(\mathcal{E}_\varphi\), that is, \(r = \arg\min_{p' \in \mathcal{E}_\varphi} \mathrm{KL}(q \| p')\):

\[ \mathrm{KL}(q \| p) \;=\; \mathrm{KL}(q \| r) \;+\; \mathrm{KL}(r \| p). \tag{Pyth} \]
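A minimal numerical sketch of (Pyth), under assumptions that are not in the card: the \(e\)-flat family is taken to be the independence model on \(\{0,1\}^2\) (product distributions, an exponential family with sufficient statistics \(x_1, x_2\)), and \(r\) is taken to be the minimiser of \(\mathrm{KL}(q \| \cdot)\) over that family, which is the product of the marginals of \(q\). The names `kl` and `product` and all numbers are illustrative.

```python
import numpy as np

def kl(a, b):
    """KL(a || b) for strictly positive discrete distributions of equal shape."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum(a * (np.log(a) - np.log(b))))

def product(u, v):
    """Joint table of two independent binary marginals u and v."""
    return np.outer(u, v)

# q: an arbitrary joint distribution on {0,1}^2 that is not a product.
q = np.array([[0.40, 0.10],
              [0.15, 0.35]])

# r: projection of q onto the independence family (product of q's marginals).
r = product(q.sum(axis=1), q.sum(axis=0))

# p: any other member of the independence family (illustrative values).
p = product(np.array([0.7, 0.3]), np.array([0.2, 0.8]))

lhs = kl(q, p)
rhs = kl(q, r) + kl(r, p)
print(lhs, rhs)  # the two numbers agree up to floating-point rounding
```

Here \(\mathrm{KL}(q \| r)\) is the mutual information of \(q\), so the identity says that the cost of approximating \(q\) by any product \(p\) splits into that mutual information plus the mismatch of the marginals.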

(ii) Role in Q1 / Q2 / Q3

(iii) References

(iv) Worked miniature — Gaussian family is \(m\)-geodesic-stable under convolution

Consider \(\mathcal{E} = \{\mathcal{N}(\mu, \Sigma) : \mu \in \mathbb{R}^d,\ \Sigma \succ 0\}\) on \(\mathbb{R}^d\), with sufficient statistic \(\varphi(x) = (\mathrm{vec}(xx^\top), x)\). Up to the sign and the factor \(\tfrac12\) implicit in the convention \(e^{-\theta^\top \varphi}\), the canonical \(\theta\)-parameter is \((K, h)\), with density \(\propto \exp(-\tfrac12 x^\top K x + h^\top x)\); the expectation \(\eta\)-parameter is \((\mathbb{E}[xx^\top], \mathbb{E}[x]) = (\Sigma + \mu\mu^\top, \mu)\).

Closure under convolution. If \(X \sim \mathcal{N}(\mu, \Sigma)\) and \(Y \sim \mathcal{N}(0, \sigma^2 I)\) are independent, then \(X + Y \sim \mathcal{N}(\mu, \Sigma + \sigma^2 I)\). In \(\theta\)-coordinates, where \((K, h) = (\Sigma^{-1}, \Sigma^{-1}\mu)\), this reads:

\[ (K, h) \;\longmapsto\; \bigl((K^{-1} + \sigma^2 I)^{-1},\; (K^{-1} + \sigma^2 I)^{-1} K^{-1} h\bigr). \]
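The closed-form map above can be checked against the moment description \((\mu, \Sigma) \mapsto (\mu, \Sigma + \sigma^2 I)\). The sketch below is illustrative only: the helper names `to_theta` and `convolve_theta`, the dimension, the seed, and the value of \(\sigma^2\) are all assumptions made for the example.

```python
import numpy as np

def to_theta(mu, Sigma):
    """Map moments (mu, Sigma) to (K, h) = (Sigma^{-1}, Sigma^{-1} mu)."""
    K = np.linalg.inv(Sigma)
    return K, K @ mu

def convolve_theta(K, h, sigma2):
    """Image of (K, h) under convolution with N(0, sigma2 * I), as in the formula above."""
    d = K.shape[0]
    K_inv = np.linalg.inv(K)
    A = np.linalg.inv(K_inv + sigma2 * np.eye(d))
    return A, A @ K_inv @ h

rng = np.random.default_rng(0)
d, sigma2 = 3, 0.7
B = rng.standard_normal((d, d))
Sigma = B @ B.T + np.eye(d)          # a well-conditioned positive definite covariance
mu = rng.standard_normal(d)

K, h = to_theta(mu, Sigma)
K_new, h_new = convolve_theta(K, h, sigma2)

# Reference: add sigma2 * I in moment coordinates, then convert to (K, h).
K_ref, h_ref = to_theta(mu, Sigma + sigma2 * np.eye(d))

map_error = max(np.abs(K_new - K_ref).max(), np.abs(h_new - h_ref).max())
print(map_error)  # expected to be at rounding level
```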

This is the Q2 reparameterisation \(\theta \mapsto \tilde\theta\) in closed form. In the time variable \(t = \sigma^2\) (the semigroup generated by \(\tfrac12\Delta\) adds covariance \(tI\) at time \(t\)), the infinitesimal form is \(\dot K = -K^2\) (Riccati) and \(\dot h = -K h\): the flow (3.5) of the invariant reformulation specialised to \((\mathbb{R}^d, \tfrac12\Delta)\).
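As a sanity check on the Riccati form, one can differentiate the closed-form map numerically with respect to the added variance \(s = \sigma^2\) and compare with \(-K^2\) and \(-Kh\). This is a finite-difference sketch, not a derivation; `convolve_theta`, the base point `s0`, and the step `eps` are hypothetical choices for illustration.

```python
import numpy as np

def convolve_theta(K, h, s):
    """Image of (K, h) after adding covariance s * I (closed-form map)."""
    d = K.shape[0]
    K_inv = np.linalg.inv(K)
    A = np.linalg.inv(K_inv + s * np.eye(d))
    return A, A @ K_inv @ h

rng = np.random.default_rng(1)
d = 3
B = rng.standard_normal((d, d))
K0 = B @ B.T + np.eye(d)             # positive definite precision matrix
h0 = rng.standard_normal(d)

s0, eps = 0.5, 1e-5                  # base point and step in the variable s = sigma^2
K_plus, h_plus = convolve_theta(K0, h0, s0 + eps)
K_minus, h_minus = convolve_theta(K0, h0, s0 - eps)
K_mid, h_mid = convolve_theta(K0, h0, s0)

# Central differences approximate dK/ds and dh/ds at s = s0.
dK = (K_plus - K_minus) / (2 * eps)
dh = (h_plus - h_minus) / (2 * eps)

riccati_residual = np.abs(dK + K_mid @ K_mid).max()   # dK/ds + K^2
linear_residual = np.abs(dh + K_mid @ h_mid).max()    # dh/ds + K h
print(riccati_residual, linear_residual)
```

Both residuals are small only when the derivative is taken with respect to \(\sigma^2\) itself; rescaling the time variable rescales the right-hand sides by the same constant.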

Information-geometric read. The Fokker–Planck semigroup \(P_t^*\) acts on all of \(\mathcal{P}\), but started on the \(e\)-flat submanifold \(\mathcal{E}_{\mathrm{Gauss}}\) it remains there; that is, the Gaussian family is stable (an invariant submanifold) under \(P_t^*\). The flow is not an \(m\)-geodesic of the ambient \(\mathcal{P}\): the mixture \((1-s)\,p_0 + s\,p_1\) of two distinct Gaussians is not Gaussian, so ambient \(m\)-geodesics leave \(\mathcal{E}_{\mathrm{Gauss}}\). Within the family the flow is nonlinear in \(\theta\) (the Riccati equation) but affine in \(\eta\), since convolution adds \(\sigma^2 I\) to \(\mathbb{E}[xx^\top]\) and leaves \(\mathbb{E}[x]\) fixed. This is the precise sense in which Q1 is a submanifold-invariance question, not a geodesic question.
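To make the contrast between the flow and the ambient \(m\)-geodesic concrete, the sketch below takes \(d = 1\), \(p_0 = \mathcal{N}(0, 1)\) and \(p_1 = \mathcal{N}(0, 1 + \sigma^2)\) (the heat-evolved image of \(p_0\)), and computes the excess kurtosis along the mixture \((1-s)\,p_0 + s\,p_1\). A Gaussian has excess kurtosis zero, so a nonzero value shows that the mixture has left the family. The function names and the choice \(\sigma^2 = 3\) are illustrative assumptions.

```python
def gaussian_moments(var):
    """Second and fourth moments of a centred 1-D Gaussian with variance var."""
    return var, 3.0 * var ** 2

def mixture_excess_kurtosis(var0, var1, s):
    """Excess kurtosis of (1 - s) * N(0, var0) + s * N(0, var1)."""
    m2_0, m4_0 = gaussian_moments(var0)
    m2_1, m4_1 = gaussian_moments(var1)
    m2 = (1 - s) * m2_0 + s * m2_1   # moments of a mixture are mixtures of moments
    m4 = (1 - s) * m4_0 + s * m4_1
    return m4 / m2 ** 2 - 3.0

var0, sigma2 = 1.0, 3.0
var1 = var0 + sigma2                  # variance after convolution with N(0, sigma2)

midpoint = mixture_excess_kurtosis(var0, var1, 0.5)
endpoints = (mixture_excess_kurtosis(var0, var1, 0.0),
             mixture_excess_kurtosis(var0, var1, 1.0))
print(midpoint, endpoints)            # midpoint is about 1.08; both endpoints are 0
```

Every point of the heat flow from \(p_0\) to \(p_1\) is Gaussian, whereas the midpoint of the mixture path between the same endpoints is not.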