user9856153

Reputation:

Why is RL called 'reinforcement' learning?

I understand why machine learning is so named, and I also understand the nomenclature behind supervised and unsupervised learning. So what is reinforced in reinforcement learning?

Upvotes: 6

Views: 1698

Answers (3)

Naeem Khoshnevis

Reputation: 2472

Modern reinforcement learning is built upon two main threads. One thread concerns learning by trial and error and originated in the psychology of animal learning. The second thread concerns the problem of optimal control and its solution using value functions and dynamic programming (Sutton and Barto, 2018). Reinforcement learning borrows its name from the first thread. According to Watkins (1989), in studies of animals' ability to learn, the animals may be automatically provided with reinforcers. In behavioral terms, a positive reinforcer might be a morsel of food for a hungry animal, for instance, or a sip of water for a thirsty animal. Conversely, a negative reinforcer might be an electric shock.

PS. Watkins proposed the Q-learning algorithm.

Edit: (Added more history)

According to Sutton and Barto (2018): "The term “reinforcement” in the context of animal learning came into use well after Thorndike’s expression of the Law of Effect, first appearing in this context (to the best of our knowledge) in the 1927 English translation of Pavlov’s monograph on conditioned reflexes. Pavlov described reinforcement as the strengthening of a pattern of behavior due to an animal receiving a stimulus—a reinforcer—in an appropriate temporal relationship with another stimulus or with a response."

Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.
Thorndike, E. L. Animal Intelligence. Hafner, Darien, CT, 1911.
Watkins, Christopher John Cornish Hellaby. "Learning from Delayed Rewards." PhD thesis, University of Cambridge, 1989.

Upvotes: 1

Bashman

Reputation: 23

In reinforcement learning, behavior is reinforced through trial and error. Outcomes that are incorrect (or less than optimal) do not need to be manually corrected. Instead, the agent explores, and its feedback (the reinforcement) comes from those same experiences.

Upvotes: 0

R.F. Nelson

Reputation: 2312

The “reinforcement” in reinforcement learning refers to how certain behaviors are encouraged and others discouraged. Behaviors are reinforced through rewards, which the agent gains through experience with its environment.
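To make the idea concrete, here is a minimal sketch of rewards reinforcing one behavior over another. It assumes a two-action "bandit" environment and an epsilon-greedy agent; the function name `run_bandit`, its parameters, and the reward probabilities are all made up for illustration and do not come from any particular library or from the references above.

```python
import random


def run_bandit(reward_prob, steps=2000, step_size=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Bernoulli bandit (hypothetical example)."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_prob)  # estimated value of each action
    for _ in range(steps):
        # Explore occasionally; otherwise repeat the action that looks best so far.
        if rng.random() < epsilon:
            action = rng.randrange(len(values))
        else:
            action = max(range(len(values)), key=lambda a: values[a])
        # The environment answers with a reward of 1 or 0.
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        # Reinforcement step: move the estimate toward the observed reward,
        # so rewarded actions become more likely to be chosen again.
        values[action] += step_size * (reward - values[action])
    return values


# Action 1 pays off far more often (assumed probabilities), so it gets reinforced.
values = run_bandit([0.2, 0.8])
print(values)
```

Nothing tells the agent which action is correct. The estimate for the frequently rewarded action simply grows, and the greedy choice follows it, which is the sense in which the behavior is "reinforced".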

Upvotes: 5
