Reward-based learning of a redundant task

TAMAGNONE, IRENE; CASADIO, MAURA; SANGUINETI, VITTORIO
2013-01-01

Abstract

Motor skill learning has several components. When we acquire a new motor skill, we must both learn a reliable action-value map for selecting a highly rewarded action (the task model) and develop an internal representation of the novel dynamics of the task environment so that the selected action can be executed properly (the internal model). Here we focus on a 'pure' motor skill learning task, in which adaptation to a novel dynamical environment is negligible and the problem reduces to acquiring an action-value map based only on knowledge of results. Subjects performed point-to-point movements in which the start and target positions were fixed and visible, but the score provided at the end of each movement depended on the distance of the trajectory from a hidden via-point. Subjects had no cues about the correct movement other than the score. The task is highly redundant, because infinitely many trajectories are compatible with the maximum score. Our aim was to capture the strategies that subjects use to explore the task space and to exploit the task redundancy during learning. The main findings were that (i) subjects did not converge to a unique solution; rather, their final trajectories were determined by their subject-specific history of exploration; and (ii) with learning, subjects reduced the overall variability of the trajectory, while the point of minimum variability gradually shifted toward the portion of the trajectory closer to the hidden via-point.
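The redundancy described in the abstract can be illustrated with a minimal sketch. The abstract does not state the scoring function, so the exponential decay, the `max_score` and `scale` parameters, and all coordinates below are assumptions chosen purely for illustration; only the idea that the score depends on the distance between the trajectory and a hidden via-point comes from the text.

```python
import math

def min_distance_to_point(trajectory, point):
    """Smallest Euclidean distance between any sampled trajectory point and `point`."""
    return min(math.dist(p, point) for p in trajectory)

def score(trajectory, via_point, max_score=100.0, scale=0.05):
    """Hypothetical score: maximal when the trajectory passes through the
    via-point and decaying with distance (assumed form, not the paper's)."""
    d = min_distance_to_point(trajectory, via_point)
    return max_score * math.exp(-d / scale)

# Assumed workspace coordinates in metres (illustrative only).
via_point = (0.10, 0.05)

# Two different paths from start (0, 0) to target (0.2, 0) that both visit the via-point.
path_a = [(0.0, 0.0), (0.10, 0.05), (0.20, 0.0)]
path_b = [(0.0, 0.0), (0.05, 0.10), (0.10, 0.05), (0.15, -0.05), (0.20, 0.0)]

# A straight path that misses the via-point.
straight = [(0.0, 0.0), (0.10, 0.0), (0.20, 0.0)]

score_a = score(path_a, via_point)
score_b = score(path_b, via_point)
score_straight = score(straight, via_point)
```

Under these assumptions, `path_a` and `path_b` receive the same maximum score despite having different shapes, whereas `straight` scores lower. A score of this kind therefore tells the subject how close the movement came to the via-point but not which of the many maximal trajectories to choose.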
2013
9781467360241

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/698154