Reinforcement learning is a machine learning paradigm in which software agents use a process of trial and error to learn how to complete tasks in a way that maximizes cumulative rewards as defined by their programmers. In contrast to supervised learning paradigms, reinforcement learning systems do not need labeled input/output pairs or explicit corrections of suboptimal actions; and, in contrast to unsupervised learning, reinforcement learning defines an explicit goal, which is the maximization of the value returned by the Q-learning (or “quality” learning) algorithm as a result of its actions.
Because it combines the goal orientation of supervised learning with the flexibility of unsupervised learning, reinforcement learning is very important in creating artificial intelligence (AI) applications requiring successful problem-solving in complex situations. For example, they are often used in financial engineering to develop optimal trading algorithms for the stock market. They are also used to build intelligent systems to allow robots and self-driving cars to navigate real-world environments safely.