Verification and repair of control policies for safe reinforcement learning