Grounded representations through deep variational inference and dynamic programming