Understanding Humans in Videos: From Pose to Action Recognition

FIGARI TOMENOTTI, FEDERICO
2024-05-23

Abstract

This doctoral dissertation focuses on designing, developing, and evaluating methodologies for representing and understanding human motion and humans in a scene (interacting with objects or in small groups). The motivation arises from the growing interest in vision-based solutions that can understand and anticipate human behaviours, not only for robotics but also for surveillance and rehabilitation applications. In line with the specific objectives of the thesis, three distinct methodologies are presented: MOSAIC, which focuses on hierarchical motion representation for action recognition; ACROSS, which addresses the complexity of activity recognition using scene-graph representations; and HHP-Net, a head pose estimation network with a focus on interpretability and computational cost. All the presented methods are assessed on state-of-the-art benchmark datasets and are supported by experiments and discussions that highlight their strengths and weaknesses.
23 May 2024
Computer Vision; Deep Learning; Machine Learning; Action Recognition
Files in this item:
File: phdunige_4109657.pdf
Access: open access
Description: Understanding Humans in Videos From Pose to Action Recognition
Type: Doctoral thesis
Size: 4.26 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1176195