AI Lexicon — R
Published May 17, 2024, last updated May 17, 2024
Reinforcement learning (RL)
Reinforcement learning (RL) is a method of machine learning. It improves an AI model's capabilities by a process of trial and error.
RL is used in systems that make sequential decisions to complete a task, such as playing complex games.
The system looks at data from games and evaluates which moves, or sequences of moves, lead to wins and losses. And since the system is "trained" to know that the goal of competitive games is winning, it focuses on those sequences that lead to victory.
An AI trained with RL can learn from its own mistakes: it receives feedback on each of its actions, either positive (a reward) or negative (a punishment), and its aim is to maximize the total reward it collects.
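The feedback loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular system is built: it uses tabular Q-learning, one common RL technique, on a made-up "corridor" game in which reaching the right end counts as a win (reward +1) and reaching the left end counts as a loss (reward -1). The function names `train` and `greedy_policy` and all parameter values are assumptions chosen for this example.

```python
import random


def train(episodes=2000, n_states=6, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor game.

    States 0 and n_states - 1 are terminal: the left end is a loss (-1),
    the right end is a win (+1). Every other step gives no reward.
    """
    rng = random.Random(seed)
    # q[state][action] estimates future reward; action 0 = left, 1 = right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        # Start each episode in a random non-terminal state.
        state = rng.randrange(1, n_states - 1)
        while 0 < state < n_states - 1:
            # Trial and error: usually take the best-known action,
            # but sometimes try a random one to explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = state + (1 if action == 1 else -1)
            # Feedback: reward for a win, punishment for a loss.
            if next_state == n_states - 1:
                reward = 1.0
            elif next_state == 0:
                reward = -1.0
            else:
                reward = 0.0
            # Nudge the estimate toward reward plus discounted future value.
            # Terminal rows are never updated, so their value stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the preferred move for each non-terminal state."""
    return ["right" if right > left else "left" for left, right in q[1:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

With these assumed settings, the learned policy should prefer "right" in every state: the system has worked out, purely from rewards and punishments, which sequence of moves leads to victory.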
Examples of AI models trained with reinforcement learning include Pluribus, a poker-playing bot, and DeepMind's AlphaGo, which plays Go. Both programs have beaten some of the world's top human players at their respective games. (za/fs)
Sources:
Machine Learning Glossary (Google) https://developers.google.com/machine-learning/glossary (accessed July 24, 2023)
What is reinforcement learning? (University of York) https://online.york.ac.uk/what-is-reinforcement-learning/ (accessed July 25, 2023)
What is reinforcement learning? (IBM Developer) https://developer.ibm.com/learningpaths/get-started-automated-ai-for-decision-making-api/what-is-automated-ai-for-decision-making/ (accessed July 25, 2023)
No human could do that: Is AI becoming too alien? (DW/Schwaller) https://www.dw.com/en/no-human-could-do-that-is-ai-becoming-too-alien/a-63253727 (accessed May 17, 2024)
Written and edited by: Zulfikar Abbany (za), Fred Schwaller (fs)