
Keep the planner in the loop: parallel planning and execution using Large Language Models

CAPITANELLI, ALESSIO
2024-05-28

Abstract

Task planning is a popular approach for autonomous agents due to its understandability, predictability, and ease of deployment. However, it is difficult to scale to real-world, human-robot cooperation scenarios, because performance degrades in complex planning domains and when frequent re-planning is needed. Longer planning times can hinder the robot's efficiency and adversely affect the interaction's fluency. Our objective in this PhD project is to develop novel methods to address this issue and favor keeping task planning in the execution loop as much as possible. First, we explore the use of traditional planning techniques, and in particular the use of macros, to optimize total planning and execution time. Macros are known to reduce planning time, but at the cost of plan optimality and thus execution time. We provide evidence that by selecting an appropriate level of macro abstraction and by implementing ad-hoc grounding for said macros, it is possible to reduce average planning time by 85% with little impact on execution time. Then, we proceed to explore more innovative approaches based on the latest advancements in generative AI. In particular, we propose a method, Teriyaki, to bridge the gap between symbolic task planning and machine learning methods by training Large Language Models (LLMs), namely GPT-3, into a neurosymbolic planner compatible with the Planning Domain Definition Language (PDDL). Potential benefits include better scalability, as LLMs' response time scales with the combined length of the input and the output regardless of the symbols involved, and the ability to generate a plan action-by-action, which in turn enables simultaneous planning and execution, reducing wait times. In the past year, significant effort has been devoted by the AI community to evaluating the overall cognitive abilities of LLMs, but success rates have been limited.
Instead, we focus on providing a success rate comparable to traditional planners in specific planning domains, while improving other real-world metrics. Preliminary results in two domains, selected from those developed in the first part of this project, show that our method can: (i) solve 95.5% of problems in a test dataset of 1,000 samples, a result comparable to that of the baseline heuristic-search planner; (ii) produce plans up to 13.5% shorter than a traditional planner; (iii) reduce average waiting times for a plan by 61.4% and its standard deviation by 96.6% through parallel planning and execution.
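The simultaneous planning and execution mentioned in the abstract can be pictured as a producer-consumer loop: the planner emits actions one at a time and the executor consumes each action as soon as it is available, instead of waiting for the full plan. The following is a minimal sketch of that idea only; the names `plan_actions` and `execute` are hypothetical placeholders and do not reflect the actual Teriyaki implementation.

```python
import queue
import threading

def plan_actions(problem):
    """Hypothetical stand-in for an LLM-based planner that yields
    PDDL-style actions one at a time as they are generated."""
    for action in ["(pick block-a)", "(move block-a table-2)", "(place block-a)"]:
        yield action

def execute(action, log):
    """Hypothetical executor: a real system would dispatch the action
    to the robot and block until it completes."""
    log.append(action)

def plan_and_execute(problem):
    """Interleave planning and execution via a FIFO queue, so the robot
    starts acting before the full plan exists."""
    actions = queue.Queue()
    log = []

    def producer():
        for a in plan_actions(problem):
            actions.put(a)
        actions.put(None)  # sentinel: plan complete

    t = threading.Thread(target=producer)
    t.start()
    while True:
        a = actions.get()
        if a is None:
            break
        execute(a, log)
    t.join()
    return log
```

In this toy setup the robot's waiting time is bounded by the time to generate the next single action rather than the whole plan, which is the intuition behind the reported reduction in average waiting time.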
AI; Generative AI; Neurosymbolic; Large Language Models; Task Planning; PDDL; Human-Robot Interaction; GPT
Files in this item:

phdunige_4134440.pdf (open access)
Type: Doctoral thesis
Size: 9.87 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1175155