Proof for the graph diffusion equation in GRAND: Graph Neural Diffusion
When I read GRAND: Graph Neural Diffusion (arXiv) by Chamberlain et al., I had trouble following Section 3.1 (Graph diffusion equation), because the derivation steps leading to the final equation are not spelled out:
\[\frac{\partial}{\partial t} \mathbf{x}(t) = (\mathbf{A}(\mathbf{x}(t)) - \mathbf{I})\mathbf{x}(t)\]Although there is a set of slides online about the diffusion equation on graphs, its formulation is not identical to the paper's, so I decided to write a detailed post deriving the equation as it appears in GRAND.
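To make the target equation concrete before deriving it, here is a minimal numpy sketch of its right-hand side, assuming a hand-picked row-stochastic matrix $\mathbf{A}$ in place of the learned attention $\mathbf{A}(\mathbf{x}(t))$ from the paper; the explicit Euler step at the end is only one possible way to discretize the ODE.

```python
import numpy as np

# Toy 3-node graph; A is a hand-picked row-stochastic attention matrix
# supported on the edges (in GRAND it is computed from the node features).
A = np.array([
    [0.0, 0.6, 0.4],
    [0.5, 0.0, 0.5],
    [0.3, 0.7, 0.0],
])
x = np.array([1.0, 2.0, 4.0])   # one scalar feature per node

# Right-hand side of the diffusion equation: (A - I) x
dxdt = (A - np.eye(3)) @ x

# One explicit Euler step of size tau (one possible discretization).
tau = 0.1
x_next = x + tau * dxdt
print(dxdt, x_next)
```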
The divergence for a node $i$ is defined as
\[(\mathrm{div}(\mathbf{X}))_{i} = \sum_{j:(i, j) \in E} \mathbf{X}_{ij} = \sum_{j=1}^{n} \mathbf{w}_{ij} \mathbf{X}_{ij}\]The gradient is defined as
\[{(\nabla \mathbf{x})}_{ij} = \mathbf{x}_{j} - \mathbf{x}_{i}\]We have the following diffusion equation on the graph.
\[\frac{\partial}{\partial t} \mathbf{x}(t) = \mathrm{div}\big(G(\mathbf{x}(t), t)\, \nabla \mathbf{x}(t)\big), \qquad \big(\mathrm{div}(G(\mathbf{x}(t), t)\, \nabla \mathbf{x}(t))\big)_{i} = \sum_{j=1}^{n} \mathbf{w}_{ij} \mathbf{A}_{ij} (\nabla \mathbf{x})_{ij}\]with
\[G = \mathrm{diag}\big(a(\mathbf{x}_{i}(t), \mathbf{x}_{j}(t), t)\big)\]which is an $e \times e$ diagonal matrix ($e$ being the number of edges), and $a$ is a function measuring the similarity between vertices $i$ and $j$; collecting these values into a vertex-indexed matrix gives $\mathbf{A}_{ij} = a(\mathbf{x}_{i}(t), \mathbf{x}_{j}(t), t)$.
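To keep the edge-indexed and vertex-indexed objects apart, here is a small numpy sketch of the gradient, the diagonal matrix $G$, and the divergence on a toy graph; the Gaussian kernel `a` below is a hypothetical placeholder for the attention function learned in GRAND.

```python
import numpy as np

# Toy graph: 3 nodes; the edge list E contains both directions explicitly.
E = [(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)]
n = 3
x = np.array([1.0, 2.0, 4.0])             # one scalar feature per node
W = np.zeros((n, n))                      # adjacency weights w_ij in {0, 1}
for i, j in E:
    W[i, j] = 1.0

# Gradient: one value per edge, (grad x)_ij = x_j - x_i
grad_x = np.array([x[j] - x[i] for i, j in E])        # shape (e,)

# Hypothetical similarity a(x_i, x_j): a Gaussian kernel standing in for
# the learned attention.
def a(xi, xj):
    return np.exp(-(xi - xj) ** 2)

# G = diag(a(x_i, x_j)) is an e x e diagonal matrix indexed by edges.
G = np.diag([a(x[i], x[j]) for i, j in E])            # shape (e, e)

# Divergence at node i: sum over edges (i, j) of w_ij * (G grad x)_ij
G_grad = G @ grad_x                                   # shape (e,)
div = np.zeros(n)
for u, (i, j) in enumerate(E):
    div[i] += W[i, j] * G_grad[u]
print(div)
```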
An additional step that needs to be made explicit is
\[(G(\mathbf{x}(t), t) \nabla \mathbf{x}(t))_u = \sum_{v} G_{uv} (\nabla \mathbf{x})_{v}\]Because $G$ is a diagonal matrix, we have that
\[\sum_{v} G_{uv} (\nabla \mathbf{x})_{v} = G_{uu} (\nabla \mathbf{x})_{u}\]Each index $u$ here represents an edge of the graph, so we can recover the two vertex indices $i, j$ corresponding to that edge, and the term becomes
\[G_{uu} (\nabla \mathbf{x})_{u} = \mathbf{A}_{ij} (\mathbf{x}_{j} - \mathbf{x}_{i})\]since $G_{uu} = a(\mathbf{x}_{i}(t), \mathbf{x}_{j}(t), t) = \mathbf{A}_{ij}$. Note that $G \in \mathbb{R}^{e \times e}$, $\nabla \mathbf{x} \in \mathbb{R}^{e \times 1}$, and $\mathbf{A} \in \mathbb{R}^{v \times v}$, where $e$ and $v$ are the numbers of edges and vertices.
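The jump from $G_{uu}$ to $\mathbf{A}_{ij}$ is just a re-indexing: both hold the same attention value $a(\mathbf{x}_{i}, \mathbf{x}_{j}, t)$, stored once per edge $u$ and once per vertex pair $(i, j)$. Here is a small check of this re-indexing and of the shapes above, again with a hypothetical Gaussian kernel in place of the learned attention.

```python
import numpy as np

E = [(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)]
n = 3
x = np.array([1.0, 2.0, 4.0])

def a(xi, xj):                    # hypothetical stand-in for the attention
    return np.exp(-(xi - xj) ** 2)

# Edge-indexed view: G is e x e, grad x is a vector of length e.
G = np.diag([a(x[i], x[j]) for i, j in E])
grad_x = np.array([x[j] - x[i] for i, j in E])

# Vertex-indexed view: A is v x v with A_ij = a(x_i, x_j) on the edges.
A = np.zeros((n, n))
for i, j in E:
    A[i, j] = a(x[i], x[j])

# For every edge u = (i, j): G_uu * (grad x)_u equals A_ij * (x_j - x_i).
for u, (i, j) in enumerate(E):
    assert np.isclose(G[u, u] * grad_x[u], A[i, j] * (x[j] - x[i]))
print(G.shape, grad_x.shape, A.shape)     # (6, 6) (6,) (3, 3)
```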
Substituting this expression for $(G \nabla \mathbf{x})_{u}$ into the definition of the divergence, we get
\[\big(\mathrm{div}(G \nabla \mathbf{x})\big)_{i} = \sum_{j=1}^{n} \mathbf{w}_{ij} \mathbf{A}_{ij} (\mathbf{x}_{j} - \mathbf{x}_{i})\]Rearranging the terms, the right-hand side becomes
\[\sum_{j=1}^{n} \mathbf{w}_{ij} \mathbf{A}_{ij} \mathbf{x}_{j} - \sum_{j=1}^{n} \mathbf{w}_{ij} \mathbf{A}_{ij} \mathbf{x}_{i}\]Writing the first term in matrix form, and noticing that $\mathbf{w} \odot \mathbf{A}$ is row stochastic (each of its rows sums to $1$, i.e. $\sum_{j=1}^{n} \mathbf{w}_{ij} \mathbf{A}_{ij} = 1$, which collapses the second term to $\mathbf{x}_{i}$), we get
\[[(\mathbf{w} \odot \mathbf{A}) \mathbf{x}]_{i} - \mathbf{x}_{i}\]Substituting this into the diffusion equation for a vertex $i$, we have
\[\frac{\partial}{\partial t}\mathbf{x}_{i}(t) = [(\mathbf{w} \odot \mathbf{A}) \mathbf{x}]_{i} - \mathbf{x}_{i}\]Vectorizing this over all vertices gives us
\[\frac{\partial}{\partial t}\mathbf{x}(t) = (\mathbf{w} \odot \mathbf{A}) \mathbf{x} - \mathbf{x}\]Since $\mathbf{A}_{ij} > 0$ only where $\mathbf{w}_{ij} = 1$ (the attention is supported on the edges of the graph), we have $\mathbf{w} \odot \mathbf{A} = \mathbf{A}$, and we arrive at the equation we wanted to prove:
\[\frac{\partial}{\partial t}\mathbf{x}(t) = \mathbf{A}\mathbf{x}(t) - \mathbf{x}(t) = (\mathbf{A}(\mathbf{x}(t)) - \mathbf{I})\mathbf{x}(t)\]
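As a sanity check on the whole derivation, here is a small numpy sketch that builds a row-stochastic attention matrix supported on the edges of a toy graph and verifies that the node-wise divergence of $G \nabla \mathbf{x}$ agrees with $(\mathbf{A} - \mathbf{I})\mathbf{x}$; the softmax attention below is a hypothetical placeholder for the attention learned in GRAND.

```python
import numpy as np

# Toy graph: 4 nodes; the edge list contains both directions of each edge.
E = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2), (0, 3), (3, 0)]
n = 4
x = np.array([1.0, 2.0, 4.0, -1.0])
W = np.zeros((n, n))
for i, j in E:
    W[i, j] = 1.0

# Hypothetical attention: a softmax of a similarity score over each node's
# neighbours, so A is supported on the edges and its rows sum to one.
scores = np.where(W > 0, -np.abs(x[:, None] - x[None, :]), -np.inf)
A = np.exp(scores - scores.max(axis=1, keepdims=True))
A /= A.sum(axis=1, keepdims=True)
assert np.allclose(A.sum(axis=1), 1.0)

# Edge-indexed objects: gradient and G = diag(A_ij) over the edge list.
grad_x = np.array([x[j] - x[i] for i, j in E])
G = np.diag([A[i, j] for i, j in E])

# Node-wise divergence of G grad x ...
G_grad = G @ grad_x
div = np.zeros(n)
for u, (i, j) in enumerate(E):
    div[i] += W[i, j] * G_grad[u]

# ... matches (A - I) x, the right-hand side of the GRAND diffusion equation.
assert np.allclose(div, (A - np.eye(n)) @ x)
print(div)
```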