What Is Deep Reinforcement Learning?

Written by Coursera Staff

Deep reinforcement learning is a subset of machine learning in which neural networks learn to make decisions from rewards and penalties. Learn more about deep reinforcement learning, including asynchronous methods for deep reinforcement learning and deep reinforcement learning tutorials.

[Featured Image] A video game developer sits at a desk and incorporates deep reinforcement learning into a project.

When you combine reinforcement learning with an artificial neural network, you create a deep reinforcement learning system, which trains an algorithm through rewards and penalties. To clarify, here is an example of reinforcement learning:

  • Imagine you’re sitting in front of a campfire for the first time. You place a marshmallow on a stick over the flames and watch it turn golden and gooey. After you eat it, you decide to hold a second marshmallow over the flames with your fingers. The flames singe you, and you drop the marshmallow in the fire. The next time you roast a marshmallow, you use a stick, as you did the first time.

This scenario is an example of reinforcement learning, or the process of learning through rewards and penalties. For computers, deep reinforcement learning is a similar process of learning to make good, or accurate, decisions over time.

Interest in deep learning has grown for three reasons: the amount of data available, superior computing power, and more complex algorithms. As a result, deep reinforcement learning algorithms have already contributed to computer vision, natural language processing, and medical diagnostics.

What is deep reinforcement learning?

As in the marshmallow-roasting example, deep reinforcement learning is a process where a computer uses rewards and penalties to learn the next best action to take to achieve a specific goal. This process allows the computer to learn in a way similar to how humans do, by taking in data and observing its environment before making a decision. Like the human brain, artificial neural networks operate under uncertainty. They use deep reinforcement learning algorithms to study immense data sets for goals such as outcome prediction, control, and learning to model the environment. This means that computers, much like humans, can learn, adapt, and change based on the results they receive.

What is deep reinforcement learning used for?

Deep reinforcement learning finds uses across various industries to support and improve human activity. You’ve likely seen or even interacted with this technology in applications such as self-driving cars, natural language processing, automated robotics, image processing, and recommendation systems. It is especially common in settings that constantly generate immense data sets, because these programs need huge volumes of information to learn successfully through trial and error.

How does deep reinforcement learning work?

Deep reinforcement learning works by using frameworks known as artificial neural networks. These networks build up layers of nodes that mimic how neurons function in your brain. The nodes process and pass information along the networks, using trial and error to discover accurate results.
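To make the idea of layered nodes more concrete, here is a minimal sketch of a small network written in plain Python. It only shows how information passes forward through the layers; it does not train anything. The layer sizes, the random starting weights, and the names `make_layer` and `forward` are assumptions chosen for illustration, not part of any particular library.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Create one layer: a weight matrix (n_out x n_in) and a bias vector."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layers, inputs):
    """Pass inputs through each layer in turn; hidden layers apply tanh."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        # Each node computes a weighted sum of the previous layer's outputs.
        sums = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_last = index == len(layers) - 1
        # The final layer returns raw scores; earlier layers squash with tanh.
        activations = sums if is_last else [math.tanh(s) for s in sums]
    return activations

rng = random.Random(0)
# Hypothetical sizes: 4 input features -> 8 hidden nodes -> 3 output scores.
network = [make_layer(4, 8, rng), make_layer(8, 3, rng)]
scores = forward(network, [0.5, -0.2, 0.1, 0.9])
```

In a deep reinforcement learning system, the inputs would describe the current situation and the output scores would rate the available actions. Training would then adjust the weights based on feedback.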

In deep reinforcement learning, the strategy the computer develops from feedback to produce these results is called a policy. A policy bases its decisions on the state, which is the computer’s current situation, and the action set, which is the range of options the computer can choose from. Exploring these options, also known as “search,” allows the computer to consider different actions and then observe the results of its choices. Because deep reinforcement learning coordinates learning, decision-making, and representation, this technology might also create new insights into how the human brain functions.
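One common way to express a policy in code is an epsilon-greedy rule: most of the time the computer picks the action it currently rates highest, and occasionally it tries a random one to see what happens. The sketch below assumes a hypothetical state with three actions and made-up value estimates. The function name `choose_action` and the action names are illustrative only.

```python
import random

def choose_action(action_values, epsilon, rng):
    """Epsilon-greedy policy: explore with probability epsilon, else exploit."""
    actions = list(action_values)
    if rng.random() < epsilon:
        # Explore: try any action from the action set at random.
        return rng.choice(actions)
    # Exploit: pick the action with the highest estimated value.
    return max(actions, key=lambda a: action_values[a])

# Hypothetical value estimates for one state with three available actions.
estimates = {"small_step": 0.8, "big_step": -0.5, "wait": 0.1}
rng = random.Random(42)
greedy = choose_action(estimates, epsilon=0.0, rng=rng)
explored = choose_action(estimates, epsilon=1.0, rng=rng)
```

With `epsilon=0.0` the policy always exploits what it already knows, and with `epsilon=1.0` it always explores. Values in between trade off the two.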

Deep reinforcement learning is distinctive in that the structure of the software gives it the opportunity to learn much like your brain does. It’s built from neural networks with many layers that can take in unlabeled, unstructured data and make sense of its contents without needing a human to direct each step of the learning process.


If your goal is to teach a robot to walk up a set of stairs, the computer might decide to take a step that ends up being too big. The resulting “punishment” of a fall is negative feedback that the computer uses to adjust its next step to a smaller one. Some scientists use virtual environments for the robot to learn so that it can test different options and fall repeatedly without risking damage to real, expensive robotics parts. When you combine the robot’s trial-and-error reinforcement learning with the artificial neural networks of deep learning, you develop a deep reinforcement learning system.
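The stair-climbing idea can be sketched as a toy simulation. The example below uses tabular Q-learning, a classic reinforcement learning method that stores its value estimates in a table rather than a neural network, so it is a simplified stand-in rather than a deep reinforcement learning system. The five-step staircase, the two actions, the reward values, and the training settings are all assumptions made up for illustration.

```python
import random

TOP = 5                      # hypothetical staircase with 5 steps
ACTIONS = ["small", "big"]   # hypothetical action set

def step(position, action):
    """Simulated environment: returns (next_position, reward, done)."""
    if action == "big":
        return 0, -1.0, True             # step too big: the robot falls
    next_position = position + 1
    if next_position == TOP:
        return next_position, 1.0, True  # reached the top of the stairs
    return next_position, 0.0, False

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(TOP + 1)}
    for _ in range(episodes):
        position, done = 0, False
        while not done:
            # Epsilon-greedy choice: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[position][a])
            next_position, reward, done = step(position, action)
            # Move the estimate toward reward plus discounted future value.
            future = 0.0 if done else max(q[next_position].values())
            target = reward + gamma * future
            q[position][action] += alpha * (target - q[position][action])
            position = next_position
    return q

q_table = train()
learned_policy = {s: max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(TOP)}
```

After training in this simulated environment, `learned_policy` prefers the small step on every stair, because falls were penalized and reaching the top was rewarded. A deep reinforcement learning version would replace the table with a neural network that estimates the same values.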


Who uses deep reinforcement learning?

Professionals across various industries use deep reinforcement learning to build autonomous systems that further their goals and objectives. You may have interacted with this technology through self-driving cars, social media recommendations, virtual assistants, language processing, and medical imaging. Professionals like data scientists use deep reinforcement learning to solve problems and address business needs.

Pros and cons of using deep reinforcement learning

The pros of using deep reinforcement learning show up in industries you might interact with daily, such as business and health care. For businesses, deep reinforcement learning allows your company to create optimized workflows that are accurate and reflect the nuances of your particular business. As computers grow better at learning and making decisions, you’ll see more personalized media recommendations, more accurate language translations, and safer self-driving cars. Deep reinforcement learning is key to advancing artificial intelligence and its ability to support and improve the human experience in health care, marketing, technology, and more.

A con of deep reinforcement learning is that the software system requires an immense amount of data to work properly. This data might be expensive to gather and store, and if the data set is low in quality or too small, it might lead to inaccurate or non-optimal results and insights.

How to get started in deep reinforcement learning

If you’re interested in learning more about deep reinforcement learning, the first step is to look for online guides, courses, and resources. These opportunities give you the chance to practice with deep-reinforcement-learning tutorials and algorithms.

One example of a career that includes deep reinforcement learning is a machine learning engineer. In this position, you would create artificial intelligence programs designed to run independently of human involvement. Typically, you would work with teams of other data professionals. To become a machine learning engineer, you’ll most likely need a bachelor’s degree in a subject such as computer science. According to Glassdoor, the average annual salary of a machine learning engineer in the US is $156,075 [1].

Getting started with Coursera

Sharpen your deep-reinforcement-learning skills and discover more about the foundational knowledge required for a machine learning and AI career with the courses and degrees on Coursera. An option such as IBM’s Deep Learning and Reinforcement Learning course, which is part of the IBM Machine Learning Professional Certificate on Coursera, gives you the opportunity to develop key artificial intelligence skills and can set you on the path to a machine learning or other data science career. Explore this course and more on Coursera today.

Article sources

  1. Glassdoor. “How Much Does a Machine Learning Engineer Make?” https://www.glassdoor.com/Salaries/machine-learning-engineer-salary-SRCH_KO0,25.htm. Accessed March 20, 2024.

This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.