Type of learning, inspired by the operant conditioning technique of rewarding desired behavior and ignoring unwanted behavior
reinforcement learning
In reinforcement learning, an agent learns through this process of trying actions and learning from rewards or mistakes
trial and error
What are Rosie's three built-in actions?
She can take a step Forward, take a step Backward, and Kick
When Rosie finally gets a reward for kicking the ball, what exactly does she learn from that experience?
that being next to the ball and choosing Kick is a good action
This Sony robot dog, often used in robot soccer, can walk, kick, and even wag its plastic tail using built-in sensors, motors, and a camera
Aibo
In reinforcement learning, this term refers to the amount of future reward an agent predicts it will receive for taking a given action in a given state
value of an action (or action value)
In reinforcement learning, why is it important that Rosie doesn’t learn too much from a single reward?
to avoid forming “superstitions,” or false connections between actions and rewards
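Worked example for the two cards above: a minimal Python sketch of how a stored action value might be nudged only a fraction of the way toward each reward, so that one lucky kick cannot fix Rosie's beliefs. The names `update_value`, `q_table`, and `learning_rate`, the reward of 10, and the use of "steps from the ball" as the state are illustrative assumptions, not details from the source. The update is also simplified: it ignores any discounted future reward.

```python
# Hypothetical setup: a state is Rosie's distance from the ball in steps
# (0 means she is right next to it). All names here are illustrative.
ACTIONS = ["Forward", "Backward", "Kick"]


def update_value(q_table, state, action, reward, learning_rate=0.1):
    """Move the stored value of (state, action) a small step toward the reward.

    A small learning_rate (an assumed parameter) means a single reward
    changes the estimate only slightly, which guards against "superstitions".
    """
    old_value = q_table.get((state, action), 0.0)  # unseen pairs start at 0
    new_value = old_value + learning_rate * (reward - old_value)
    q_table[(state, action)] = new_value
    return new_value


q_table = {}  # tabula rasa: no values learned yet

# Rosie is next to the ball (state 0), kicks, and is rewarded three times.
for episode in range(3):
    value = update_value(q_table, state=0, action="Kick", reward=10.0)
    print(f"after reward {episode + 1}: value of Kick at 0 steps = {value:.2f}")
```

With a learning rate of 0.1 the estimate climbs gradually over repeated rewards rather than jumping straight to 10 after the first one.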
In 2016, reinforcement learning gained worldwide attention when it powered AlphaGo, the AI program that defeated one of the world’s top players at this game
Go
Programmers might teach a Sony Aibo to walk toward and kick the ball by giving it a set of explicit instructions like “take a step toward the ball” and “kick the ball.” This is known as what kind of training?
Rule-Based AI
Rosie hasn’t learned anything yet and is described as a “tabula rasa.” What does this term mean in the context of reinforcement learning?
a blank slate with no prior knowledge or experience