In this work, we introduce a planning neural operator (PNO) for predicting the value function of a motion planning problem. We recast value function approximation as learning a single operator from the cost function space to the value function space, which is defined by an Eikonal partial differential equation (PDE). Therefore, our PNO model, despite being trained with a finite number of samples at coarse resolution, inherits the zero-shot super-resolution property of neural operators. We demonstrate accurate value function approximation at 16x the training resolution on the MovingAI lab's 2D city dataset, compare with state-of-the-art neural value function predictors on 3D scenes from the iGibson building dataset, and showcase optimal planning with 4-joint robotic manipulators. Lastly, we investigate employing the value function output of PNO as a heuristic function to accelerate motion planning. We show theoretically that the PNO heuristic is \(\epsilon\)-consistent by introducing an inductive bias layer that guarantees our value functions satisfy the triangle inequality. With our heuristic, we achieve a 30% decrease in nodes visited while obtaining near-optimal path lengths on the MovingAI lab's 2D city dataset, compared to classical planning methods (A*, RRT*).
As PNO produces an estimate of the value function, one can deploy PNO as a neural heuristic in A*, reducing the number of nodes explored compared to the Euclidean-norm heuristic.
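To make the role of a value-function heuristic concrete, here is a minimal sketch of A* with a pluggable heuristic on a 2D grid. Since we do not have a trained PNO here, the value heuristic is mocked by an exact cost-to-go computed with Dijkstra from the goal; this is purely a stand-in for the value function that PNO would predict in a single forward pass, not the paper's implementation.

```python
import heapq
import math

def astar(grid, start, goal, heuristic):
    """A* on a 4-connected grid. grid[r][c] is the cost of entering cell
    (r, c) (math.inf marks obstacles). Returns (path_cost, nodes_expanded)."""
    rows, cols = len(grid), len(grid[0])
    g = {start: 0.0}
    frontier = [(heuristic(start, goal), start)]
    closed = set()
    expanded = 0
    while frontier:
        _, node = heapq.heappop(frontier)
        if node in closed:
            continue
        closed.add(node)
        expanded += 1
        if node == goal:
            return g[node], expanded
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] < math.inf:
                cand = g[node] + grid[nr][nc]
                if cand < g.get(nxt, math.inf):
                    g[nxt] = cand
                    heapq.heappush(frontier, (cand + heuristic(nxt, goal), nxt))
    return math.inf, expanded

def value_function(grid, goal):
    """Exact cost-to-go via Dijkstra from the goal -- a stand-in for the
    value function a trained PNO would predict."""
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0.0}
    pq = [(0.0, goal)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if d > dist.get((r, c), math.inf):
            continue
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] < math.inf:
                nd = d + grid[r][c]  # reverse edge = entering (r, c) forward
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    heapq.heappush(pq, (nd, (nr, nc)))
    return dist

# Toy serpentine 5x5 map: unit costs, walls (inf) force a single corridor.
W = math.inf
grid = [[1, 1, 1, 1, 1],
        [W, W, W, W, 1],
        [1, 1, 1, 1, 1],
        [1, W, W, W, W],
        [1, 1, 1, 1, 1]]
start, goal = (0, 0), (4, 0)

euclidean = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
V = value_function(grid, goal)
value_h = lambda a, b: V.get(a, math.inf)

cost_e, exp_e = astar(grid, start, goal, euclidean)
cost_v, exp_v = astar(grid, start, goal, value_h)
```

Both heuristics recover the optimal cost; the value-function heuristic never expands more nodes than the Euclidean norm, which is the effect PNO exploits at scale.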
As PNO is a neural operator, we guarantee the existence of a neural operator approximation to the optimal value function within any tolerance \(\epsilon > 0\). Namely, we introduce two main results (along with a formal restatement of one background result):
Consider the motion planning problem: $$ \begin{align} \min_{\pi} \;\;& c(x(\tau)) + \int_0^{\tau} c(x(t)) dt \,, \nonumber \\ \text{s.t. }& \tau = \inf\{t \in \mathbb{R}_{\geq 0} \mid x(t) \in \mathcal{G}\}\,, \nonumber \\ & \dot{x}(t) = \pi(x(t))\,, \; x(0) = x_0\,, \nonumber \\ & x(t) \in \mathcal{S}, \quad \|\pi(x(t))\| = 1 \quad \forall t \geq 0\,. \nonumber \end{align} $$ where \(c\) is the cost function, \(x\) is the location or state taking values in a set \(\mathcal{S}\), \(\mathcal{G}\) is a set of goal states, and \(\pi\) is the control policy governing the dynamics of \(x\). The value function of this motion planning problem is the solution of the following Eikonal PDE: $$ \begin{align} \|\nabla V(x)\| &= c(x) \,, \quad \forall x \in \mathcal{S} \backslash \mathcal{G}\,, \nonumber \\ V(x) &= c(x) \,, \quad \forall x \in \mathcal{G} \nonumber\,. \end{align} $$
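The Eikonal PDE above can be solved numerically on a grid, which is how ground-truth value functions are typically produced for training. Below is a minimal first-order Godunov fast-sweeping sketch for \(\|\nabla V\| = c\); it is a generic classical solver under simplifying assumptions (a single goal node with \(V = 0\) there, rather than the paper's boundary condition \(V = c\) on \(\mathcal{G}\)), not the paper's data pipeline.

```python
import math

def eikonal_fast_sweeping(cost, goal, h=1.0, sweeps=8):
    """First-order Godunov fast-sweeping solver for ||grad V|| = c(x) on a
    uniform grid with spacing h. cost[i][j] is c at node (i, j); V = 0 at
    the goal node (simplified boundary condition)."""
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    V = [[INF] * m for _ in range(n)]
    V[goal[0]][goal[1]] = 0.0
    # Alternate the four sweep directions so every characteristic is covered.
    orders = [(range(n), range(m)),
              (range(n - 1, -1, -1), range(m)),
              (range(n), range(m - 1, -1, -1)),
              (range(n - 1, -1, -1), range(m - 1, -1, -1))]
    for s in range(sweeps):
        rows, cols = orders[s % 4]
        for i in rows:
            for j in cols:
                if (i, j) == goal:
                    continue
                a = min(V[i - 1][j] if i > 0 else INF,
                        V[i + 1][j] if i < n - 1 else INF)
                b = min(V[i][j - 1] if j > 0 else INF,
                        V[i][j + 1] if j < m - 1 else INF)
                a, b = min(a, b), max(a, b)
                if a == INF:        # no informed neighbor yet
                    continue
                ch = cost[i][j] * h
                if b - a >= ch:     # only one upwind direction is active
                    vnew = a + ch
                else:               # two-sided quadratic (Godunov) update
                    vnew = 0.5 * (a + b + math.sqrt(2 * ch * ch - (a - b) ** 2))
                if vnew < V[i][j]:
                    V[i][j] = vnew
    return V
```

For constant unit cost this recovers the distance to the goal: exact along grid axes, and within the expected first-order discretization error along diagonals.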
For any \(\epsilon > 0\), there exists a PNO approximation \(\hat{V}\) of the value function such that $$ \begin{align} \sup_{x \in \mathcal{S}} |V(x) - \hat{V}(x)| \leq \epsilon \,. \nonumber \end{align} $$
Let \(V(x, g)\) now be the value function solving the Eikonal PDE when the goal is a single point. An \(\epsilon\)-consistent heuristic is any heuristic satisfying \(h(x) \leq \epsilon V(x, y) + h(y)\), where \(y\) ranges over states in \(\mathcal{S}\). Under the setting above, with operator approximation error \(\epsilon_{\text{NO}}\), the heuristic generated by PNO is \(\epsilon\)-consistent, where $$ \begin{align} \epsilon = \max_{\{x, y \in \mathcal{S} \mid x \neq y\}} 1 + 2 \epsilon_{\text{NO}} / V(x, y)\nonumber \,. \end{align} $$ In other words, this result says that the paths generated by A* with the PNO heuristic are \(\epsilon\)-close to the optimal path, where \(\epsilon\) depends only on the operator error and the minimum of the value function over distinct pairs of states.
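The bound above can be checked numerically. The sketch below is illustrative only: it uses the Euclidean distance as the true value function \(V\) (valid for unit cost in obstacle-free space, where the triangle inequality holds) and a mock "PNO" heuristic built by perturbing \(V(\cdot, g)\) with noise bounded by an assumed operator error `eps_no`; it then computes \(\epsilon\) from the theorem's formula and verifies \(\epsilon\)-consistency over all sampled pairs.

```python
import math
import random

random.seed(7)

def V(x, y):
    """True value function for unit cost in obstacle-free space:
    the Euclidean distance, which satisfies the triangle inequality."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

goal = (0.0, 0.0)
eps_no = 0.05  # assumed bound on the operator approximation error

# Mock "PNO" heuristic: true cost-to-goal plus bounded noise, cached per
# point so the heuristic is deterministic, and clipped to stay nonnegative.
points = [(random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0))
          for _ in range(40)]
h = {p: max(0.0, V(p, goal) + random.uniform(-eps_no, eps_no))
     for p in points}

# epsilon from the theorem: max over distinct pairs of 1 + 2*eps_no / V(x, y).
pairs = [(x, y) for x in points for y in points if x != y]
eps = max(1.0 + 2.0 * eps_no / V(x, y) for x, y in pairs)

# Verify epsilon-consistency: h(x) <= eps * V(x, y) + h(y) for every pair.
consistent = all(h[x] <= eps * V(x, y) + h[y] + 1e-12 for x, y in pairs)
```

Note how \(\epsilon\) is driven by the closest pair of sampled states: the smaller \(V(x, y)\) can get, the looser the consistency factor, which is exactly what the max in the theorem captures.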
@inproceedings{matada2025generalizable,
title={Generalizable Motion Planning via Operator Learning},
author={Sharath Matada and Luke Bhan and Yuanyuan Shi and Nikolay Atanasov},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=UYcUpiULmT}
}