SARSA (State-Action-Reward-State-Action) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning.
Although I know that SARSA is on-policy while Q-learning is off-policy, when looking at their formulas it's hard (for me) …
artificial-intelligence reinforcement-learning q-learning sarsa
The difference between Q-learning and SARSA is that Q-learning compares the current state and the best possible next state, whereas …
reinforcement-learning q-learning sarsa
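To make the on-policy vs. off-policy distinction concrete, here is a minimal tabular sketch of the two update rules. It is not taken from any of the posts above: the function names, the dictionary-based Q-table, and the default `alpha`/`gamma` values are all illustrative assumptions. The only difference between the two functions is the bootstrap term: SARSA uses the action the agent actually takes next, while Q-learning uses the highest-valued action in the next state.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Off-policy update: bootstrap from the greedy (max-value) next action."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def make_q_table():
    """Hypothetical Q-table: state 1 has a low-valued and a high-valued action."""
    Q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    Q[(1, "left")] = 1.0
    Q[(1, "right")] = 2.0
    return Q


if __name__ == "__main__":
    actions = ["left", "right"]

    # Same transition (s=0, a="right", r=0, s_next=1) fed to both rules.
    q_sarsa = make_q_table()
    sarsa_update(q_sarsa, 0, "right", 0.0, 1, "left")  # agent explores "left" next

    q_qlearn = make_q_table()
    q_learning_update(q_qlearn, 0, "right", 0.0, 1, actions)

    print("SARSA     :", q_sarsa[(0, "right")])
    print("Q-learning:", q_qlearn[(0, "right")])
```

When the next action happens to be the greedy one, both rules produce the same update; they diverge only when the behaviour policy explores, which is where the on-policy/off-policy labels come from.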