The weight-decay technique in learning from data: An optimization point of view