Abstract: As an efficient technique in reinforcement learning, policy iteration (PI) requires an initial admissible (or, for linear systems, stabilizing) control policy that renders the existing ...