A Cloud System for Diversity-Aware, Situated, Multi-Party Autonomous Interaction Between Humans and Robots

GRASSI, LUCREZIA
2024-04-24

Abstract

This PhD thesis presents a cloud system that supports diversity-aware, situated, multi-party autonomous interaction between humans and artificial agents. The objective is to enable robots to take an active part in conversations, adapting to individual needs and preferences and reproducing behaviors commonly observed in interactions among multiple humans. Specific functionalities include speaker recognition, monitoring of the conversation state and user statistics, and, when required, assuming the role of moderator. To achieve this goal, the Cloud Artificial Intelligence and Robotics (CAIR) system has been developed, using an ontology for knowledge-based autonomous interaction between conversational agents and humans. This design allows for flexibility and expansion: new services can be incorporated to enhance the capabilities of connected clients. To allow robots to interact with groups of people, a speaker recognition mechanism has been implemented. The data collected during conversations have been used to develop several control policies that determine which speaker to address and control the dynamics of the conversation. The system can adapt to diverse populations and individuals according to the concept of “diversity-awareness,” which encompasses factors such as background, personality, age, gender, and culture. As a first step towards diversity-aware conversation, the ontology was modified by adjusting the probabilities of certain conversation topics and refining sentences to mitigate potential discomfort. Subsequently, large language models (LLMs) were integrated into the system and provided with comprehensive information about users, the conversation's history, contextual details, and specific guidelines for generating responses.
To further enhance diversity-awareness in responses, a pioneering approach gave robots the ability to perceive and interpret their visual surroundings through textual descriptions. It is essential to emphasize that LLMs were used as a tool, with the conversation flow remaining anchored in the structure of the system's pre-established ontology. Preliminary experiments evaluated the CAIR cloud server's ability to handle numerous simultaneous requests while maintaining a low response time. The results formed the basis for appropriately sizing the system, which offers a sustainable solution for both verbal and non-verbal interactions with low-cost robots and other smart devices. The multi-party capabilities of the CAIR system were assessed through experiments in a middle school involving 300 participants in groups of four. In these experiments, the robot assumed the role of moderator, implementing different policies. The results indicated effective control of group conversation dynamics in terms of balancing user participation and reducing the number of subgroups. Notably, participants reported positive interaction experiences regardless of the control policy employed. To evaluate the impact of a diversity-aware system, experiments were conducted in a hospital setting with 10 clinicians and 10 individuals with spinal cord injuries (SCI). The system's knowledge base was carefully adapted to people with SCI with the help of healthcare staff. The findings supported the necessity of using diversity-aware robots, showing that people with SCI reported lower anxiety and higher enjoyment levels than clinicians. Additionally, the users' perception of the robot remained consistent in longer interactions, demonstrating sustained effectiveness despite reduced novelty. After integrating LLMs into the system, experiments were carried out to evaluate the system's performance in real-world scenarios and measure various performance indicators.
The findings confirmed the effectiveness of the implemented system, which supports diversity-aware conversations that leverage the strengths of a tailored knowledge base along with powerful tools such as LLMs and techniques for extracting data from visual information. Clients for the CAIR cloud have been developed for a variety of devices, including computers, Android smartphones, the Aldebaran robots NAO and Pepper, the Einstein robot by Hanson Robotics, and the AlterEgo robot designed by the Italian Institute of Technology (IIT). This showcases the ease of connecting devices to CAIR and leveraging its capabilities. The system's adaptability extends to diverse contexts, such as education, healthcare, retail, fairs, and homes. Its versatility enables it to provide companionship, support individuals with specific needs, enhance learning experiences in educational settings, and entertain groups of people in public contexts.
Cloud robotics; Diversity-aware robots; Multi-party interaction
Files in this item:
phdunige_4223595.pdf (doctoral thesis, Adobe PDF, 7.25 MB), under embargo until 24/04/2025

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1170516