Theory of Deep Learning III: explaining the non-overfitting puzzle