Keypoint-based gaze tracking
Odone F.
2021-01-01
Abstract
Effective assisted living environments must be able to infer how their occupants interact with their environment. Gaze direction provides strong indications of how people interact with their surroundings. In this paper, we propose a gaze tracking method that uses a neural network regressor to estimate gazes from keypoints and integrates them over time using a moving average mechanism. Our gaze regression model uses confidence-gated units to handle cases of keypoint occlusion and to estimate its own prediction uncertainty. Our temporal approach for gaze tracking incorporates these prediction uncertainties as weights in the moving average scheme. Experimental results on a dataset collected in an assisted living facility demonstrate that our gaze regression network performs on par with a complex, dataset-specific baseline, while its uncertainty predictions are highly correlated with the actual angular error of the corresponding estimates. Finally, experiments on video sequences show that our temporal approach generates more accurate and stable gaze predictions.
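As a rough illustration of the temporal step described in the abstract, the sketch below smooths per-frame gaze vectors with a moving average whose weights come from the predicted uncertainties. The abstract does not specify the weighting function, the window length, or the gaze representation, so everything here is an assumption made for illustration: inverse-uncertainty weights, a causal sliding window, 2D direction vectors, and the hypothetical names `weighted_moving_average_gaze`, `window`, and `eps`.

```python
import numpy as np


def weighted_moving_average_gaze(gazes, uncertainties, window=5, eps=1e-6):
    """Smooth a sequence of gaze direction vectors.

    gazes:         (T, D) array of per-frame gaze direction vectors.
    uncertainties: (T,) array of per-frame predicted uncertainties (>= 0).
    window:        number of most recent frames to average (assumed value).

    Assumption: each frame is weighted by the inverse of its uncertainty,
    so confident predictions dominate the average.
    """
    gazes = np.asarray(gazes, dtype=float)
    sigmas = np.asarray(uncertainties, dtype=float)
    weights = 1.0 / (sigmas + eps)

    smoothed = np.zeros_like(gazes)
    for t in range(len(gazes)):
        start = max(0, t - window + 1)
        w = weights[start:t + 1, None]          # (k, 1) column of weights
        avg = (w * gazes[start:t + 1]).sum(axis=0) / w.sum()
        norm = np.linalg.norm(avg)
        # Re-normalise so the output is still a unit direction vector;
        # fall back to the raw prediction if the average degenerates.
        smoothed[t] = avg / norm if norm > eps else gazes[t]
    return smoothed


# Frame 2 is an outlier with high predicted uncertainty.
gazes = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
uncertainties = [0.1, 0.1, 10.0, 0.1]
smoothed = weighted_moving_average_gaze(gazes, uncertainties)
```

In this toy sequence the third frame points in a different direction but carries a large uncertainty, so it contributes little to the average and the smoothed gaze stays close to the direction of its confident neighbours.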