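Q-learning was introduced by Chris Watkins. The core of the algorithm is a Bellman equation used as a simple value iteration update, which takes the weighted average of the current value and the new information [4]:

$$Q^{new}(s_t, a_t) \leftarrow (1 - \alpha)\, Q(s_t, a_t) + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) \right)$$

Here $\alpha$ is the learning rate (the weight given to the new information), $\gamma$ is the discount factor, and $r_{t+1}$ is the reward received after taking action $a_t$ in state $s_t$.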
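The weighted-average update can be illustrated with a minimal tabular sketch in Python. It is not taken from the source: the function name `q_update`, the dictionary-based table, the state and action labels, and the parameter names `alpha` (learning rate) and `gamma` (discount factor) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    All names (q, actions, alpha, gamma) are illustrative assumptions.
    """
    # New information: immediate reward plus the discounted best
    # estimated value available from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    # Weighted average of the current value and the new information.
    q[(state, action)] = (1 - alpha) * q[(state, action)] + alpha * target
    return q[(state, action)]


q = defaultdict(float)        # unseen (state, action) pairs start at 0.0
actions = ["left", "right"]   # hypothetical action set
value = q_update(q, "s0", "right", reward=1.0, next_state="s1",
                 actions=actions, alpha=0.5, gamma=0.9)
```

With `alpha=0.5`, half of the old estimate is kept and half is replaced by the new target, so repeated updates with the same reward move the stored value steadily toward that target. A smaller `alpha` would make each step more conservative.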