Optimal Scene Graph Planning with Large Language Model Guidance

Zhirui Dai¹

Arash Asgharivaskasi¹

Thai Duong¹

Shusen Lin¹

Maria-Elizabeth Tzes²

George Pappas²

Nikolay Atanasov¹

¹Contextual Robotics Institute
University of California, San Diego

²GRASP Lab
University of Pennsylvania, Philadelphia

ICRA 2024

[Paper]

[arXiv]

[Code]

Recent advances in metric, semantic, and topological mapping have equipped autonomous robots with semantic concept grounding capabilities to interpret natural language tasks. This work aims to leverage these new capabilities with an efficient task planning algorithm for hierarchical metric-semantic models. We consider a scene graph representation of the environment and utilize a large language model (LLM) to convert a natural language task into a linear temporal logic (LTL) automaton. Our main contribution is to enable optimal hierarchical LTL planning with LLM guidance over scene graphs. To achieve efficiency, we construct a hierarchical planning domain that captures the attributes and connectivity of the scene graph and the task automaton, and provide semantic guidance via an LLM heuristic function. To guarantee optimality, we design an LTL heuristic function that is provably consistent and supplements the potentially inadmissible LLM guidance in multi-heuristic planning. We demonstrate efficient planning of complex natural language tasks in scene graphs of virtualized real environments.
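The interplay between the consistent heuristic and the potentially inadmissible LLM guidance can be illustrated with a minimal sketch written in the style of shared multi-heuristic A* (MHA*). It is not the authors' implementation. Everything in it is an assumption made for illustration: the toy scene graph `EDGES`, the hand-written task automaton `DELTA` (standing in for the LLM-generated LTL automaton for "visit the kitchen, then the bedroom"), the stand-in consistent heuristic `h_anchor` (cheapest edge cost times remaining automaton transitions, not the paper's LTL heuristic), and the stub `h_llm`, which returns hard-coded scores instead of querying an LLM. The search runs over product states `(place, automaton_state)`.

```python
import math

# Hypothetical toy scene graph: undirected edges between places with travel costs.
EDGES = {
    ("hall", "kitchen"): 2.0,
    ("hall", "office"): 1.0,
    ("office", "bedroom"): 4.0,
    ("kitchen", "bedroom"): 2.0,
    ("hall", "bedroom"): 5.0,
}
GRAPH = {}
for (a, b), cost in EDGES.items():
    GRAPH.setdefault(a, {})[b] = cost
    GRAPH.setdefault(b, {})[a] = cost

# Hypothetical automaton for "visit the kitchen, then the bedroom".
# DELTA[q] maps the label of the place just entered to the next automaton state.
DELTA = {0: {"kitchen": 1}, 1: {"bedroom": 2}, 2: {}}
ACCEPTING = 2
STEPS_LEFT = {0: 2, 1: 1, 2: 0}  # automaton transitions still required
MIN_EDGE = min(EDGES.values())
# Hard-coded scores standing in for LLM output (assumption, may overestimate).
LLM_GUESS = {"hall": 3.0, "office": 8.0, "kitchen": 1.0, "bedroom": 6.0}


def successors(state):
    """Product-space moves: walk one edge, then update the automaton state."""
    place, q = state
    for nxt, cost in GRAPH[place].items():
        yield (nxt, DELTA[q].get(nxt, q)), cost


def h_anchor(state):
    """Stand-in consistent heuristic: each move costs at least MIN_EDGE and
    advances the automaton by at most one transition."""
    return MIN_EDGE * STEPS_LEFT[state[1]]


def h_llm(state):
    """Stub for LLM guidance: possibly inadmissible, zero at accepting states."""
    place, q = state
    return LLM_GUESS[place] * STEPS_LEFT[q]


def smha_star(start, w1=1.0, w2=1.0):
    """Simplified shared multi-heuristic A*: queue 0 is the anchor (consistent
    heuristic), queue 1 follows the LLM stub. Cost is at most w1 * w2 * optimal."""
    g, parent = {start: 0.0}, {start: None}
    goal = start if start[1] == ACCEPTING else None
    open_lists = [{start: w1 * h(start)} for h in (h_anchor, h_llm)]
    closed = [set(), set()]  # expanded by anchor / expanded by LLM queue

    def expand(s):
        nonlocal goal
        for queue in open_lists:
            queue.pop(s, None)
        for nxt, cost in successors(s):
            new_g = g[s] + cost
            if new_g >= g.get(nxt, math.inf):
                continue
            g[nxt], parent[nxt] = new_g, s
            if nxt[1] == ACCEPTING and (goal is None or new_g < g[goal]):
                goal = nxt
            if nxt not in closed[0]:
                open_lists[0][nxt] = new_g + w1 * h_anchor(nxt)
                if nxt not in closed[1]:
                    open_lists[1][nxt] = new_g + w1 * h_llm(nxt)

    while open_lists[0]:
        anchor_min = min(open_lists[0].values())
        # Follow the LLM queue only while its best key stays within w2 of the anchor.
        use_llm = bool(open_lists[1]) and min(open_lists[1].values()) <= w2 * anchor_min
        i = 1 if use_llm else 0
        s = min(open_lists[i], key=open_lists[i].get)
        if goal is not None and g[goal] <= open_lists[i][s]:
            path, node = [], goal
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g[goal]
        expand(s)
        closed[i].add(s)
    return None  # no accepting state is reachable
```

With the default weights `w1 = w2 = 1`, calling `smha_star(("hall", 0))` on this toy graph returns the path `[("hall", 0), ("kitchen", 1), ("bedroom", 2)]` with cost `4.0`. The LLM queue is only expanded while its best key does not exceed the anchor's, which is why the inadmissible scores can steer the search without affecting the cost bound.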

Acknowledgements

We gratefully acknowledge support from ARL DCIST CRA W911NF-17-2-0181 and ONR N00014-23-1-2353.