When you enroll in this course, you'll also be enrolled in this Specialization.
There are 5 modules in this course
Reinforcement Learning is a subfield of Machine Learning, but it is also a general-purpose formalism for automated decision-making and AI. This course introduces you to statistical learning techniques where an agent explicitly takes actions and interacts with the world. Understanding the role and challenges of learning agents that make decisions is vital today, with more and more companies interested in interactive agents and intelligent decision-making.
This course introduces you to the fundamentals of Reinforcement Learning. When you finish this course, you will:
- Formalize problems as Markov Decision Processes
- Understand basic exploration methods and the exploration/exploitation tradeoff
- Understand value functions as a general-purpose tool for optimal decision-making
- Know how to implement dynamic programming as an efficient solution approach to an industrial control problem
This course teaches you the key concepts of Reinforcement Learning that underlie classic and modern algorithms in RL. After completing this course, you will be able to start using RL for real problems where you have, or can specify, the MDP.
This is the first course of the Reinforcement Learning Specialization.
Welcome to: Fundamentals of Reinforcement Learning, the first course in a four-part specialization on Reinforcement Learning brought to you by the University of Alberta, Onlea, and Coursera. In this pre-course module, you'll be introduced to your instructors, get a flavour of what the course has in store for you, and be given an in-depth roadmap to help make your journey through this specialization as smooth as possible.
What's included
4 videos, 2 readings, 1 discussion prompt
4 videos•Total 20 minutes
Specialization Introduction•3 minutes
Course Introduction•6 minutes
Meet your instructors!•8 minutes
Your Specialization Roadmap•3 minutes
2 readings•Total 20 minutes
Reinforcement Learning Textbook•10 minutes
Read Me: Pre-requisites and Learning Objectives•10 minutes
1 discussion prompt•Total 10 minutes
Meet and Greet!•10 minutes
An Introduction to Sequential Decision-Making
Module 2•4 hours to complete
For the first week of this course, you will learn how to understand the exploration-exploitation trade-off in sequential decision-making, implement incremental algorithms for estimating action-values, and compare the strengths and weaknesses of different algorithms for exploration. For this week’s graded assessment, you will implement and test an epsilon-greedy agent.
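To make the ideas in this module concrete, here is a minimal sketch of an epsilon-greedy agent that uses an incremental sample-average update for its action-value estimates. It is not the course's assignment code: the class name `EpsilonGreedyAgent`, the helper `run_bandit`, and the Gaussian reward model are all assumptions chosen for illustration.

```python
import random


class EpsilonGreedyAgent:
    """Hypothetical epsilon-greedy bandit agent (illustrative only)."""

    def __init__(self, num_actions, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.q = [0.0] * num_actions      # action-value estimates Q(a)
        self.counts = [0] * num_actions   # number of times each action was taken
        self.rng = random.Random(seed)

    def select_action(self):
        # Explore with probability epsilon, otherwise exploit the current estimates.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        best = max(self.q)
        # Break ties among greedy actions at random.
        return self.rng.choice([a for a, v in enumerate(self.q) if v == best])

    def update(self, action, reward):
        # Incremental sample average: Q <- Q + (1 / N) * (R - Q)
        self.counts[action] += 1
        self.q[action] += (reward - self.q[action]) / self.counts[action]


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Run the agent on a bandit whose rewards are assumed Gaussian with unit variance."""
    agent = EpsilonGreedyAgent(len(true_means), epsilon, seed)
    noise = random.Random(seed + 1)
    for _ in range(steps):
        action = agent.select_action()
        reward = noise.gauss(true_means[action], 1.0)
        agent.update(action, reward)
    return agent
```

Calling `run_bandit([0.0, 1.0])` returns the trained agent; its `q` list holds the estimated value of each arm and `counts` shows how often each arm was pulled. Setting `epsilon=0.0` gives a purely greedy agent, which is a simple way to compare exploration strategies.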
Jonathan Langford: Contextual Bandits for Real World Reinforcement Learning•9 minutes
Week 1 Summary•3 minutes
3 readings•Total 70 minutes
Module 1 Learning Objectives•10 minutes
Weekly Reading•30 minutes
Chapter Summary•30 minutes
1 assignment•Total 45 minutes
Sequential Decision-Making•45 minutes
1 programming assignment•Total 30 minutes
Bandits and Exploration/Exploitation•30 minutes
1 discussion prompt•Total 10 minutes
Compare bandits to supervised learning•10 minutes
2 plugins•Total 30 minutes
Let's play a game!•15 minutes
What's underneath?•15 minutes
Markov Decision Processes
Module 3•3 hours to complete
When you’re presented with a problem in industry, the first and most important step is to translate that problem into a Markov Decision Process (MDP). The quality of your solution depends heavily on how well you do this translation. This week, you will learn the definition of MDPs, you will understand goal-directed behavior and how this can be obtained from maximizing scalar rewards, and you will also understand the difference between episodic and continuing tasks. For this week’s graded assessment, you will create three example tasks of your own that fit into the MDP framework.
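As a sketch of how episodic and continuing tasks differ, the return that the agent maximizes can be written as follows. The notation ($G_t$ for the return, $R_t$ for the reward at step $t$, $T$ for the final time step, $\gamma$ for the discount rate) follows a common textbook convention and is an assumption here, not something stated in the module description.

```latex
% Episodic task: the interaction ends at a final time step T
G_t = R_{t+1} + R_{t+2} + \dots + R_T

% Continuing task: no final step, so rewards are discounted with 0 \le \gamma < 1
G_t = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}
```

The discount rate keeps the infinite sum finite when rewards are bounded, which is why continuing tasks are usually formulated with $\gamma < 1$.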
Examples of Episodic and Continuing Tasks•3 minutes
Week 2 Summary•2 minutes
2 readings•Total 40 minutes
Module 2 Learning Objectives•10 minutes
Weekly Reading•30 minutes
1 assignment•Total 45 minutes
MDPs•45 minutes
1 peer review•Total 60 minutes
Graded Assignment: Describe Three MDPs•60 minutes
1 discussion prompt•Total 10 minutes
Is the reward hypothesis sufficient?•10 minutes
Value Functions & Bellman Equations
Module 4•3 hours to complete
Once the problem is formulated as an MDP, finding the optimal policy is more efficient when using value functions. This week, you will learn the definition of policies and value functions, as well as Bellman equations, which are the key technology that all of our algorithms will use.
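For reference, the Bellman equations mentioned here are commonly written as below. The symbols are assumptions taken from standard textbook notation rather than from this page: $\pi(a \mid s)$ is the policy, $p(s', r \mid s, a)$ the MDP dynamics, and $\gamma$ the discount rate.

```latex
% Bellman expectation equation for the state-value function of a policy \pi
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)\,\bigl[ r + \gamma\, v_\pi(s') \bigr]

% Bellman optimality equation for the optimal state-value function
v_*(s) = \max_{a} \sum_{s', r} p(s', r \mid s, a)\,\bigl[ r + \gamma\, v_*(s') \bigr]
```

Both equations express the value of a state in terms of the values of its successor states, which is the recursive structure that later algorithms exploit.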
Rich Sutton and Andy Barto: A brief History of RL•8 minutes
Bellman Equation Derivation•6 minutes
Why Bellman Equations?•5 minutes
Optimal Policies•8 minutes
Optimal Value Functions•5 minutes
Using Optimal Value Functions to Get Optimal Policies•8 minutes
Week 3 Summary•4 minutes
3 readings•Total 53 minutes
Module 3 Learning Objectives•10 minutes
Weekly Reading•30 minutes
Chapter Summary•13 minutes
2 assignments•Total 90 minutes
[Practice] Value Functions and Bellman Equations•45 minutes
[Graded] Value Functions and Bellman Equations•45 minutes
1 discussion prompt•Total 10 minutes
Check-in•10 minutes
Dynamic Programming
Module 5•4 hours to complete
This week, you will learn how to compute value functions and optimal policies, assuming you have the MDP model. You will implement dynamic programming to compute value functions and optimal policies and understand the utility of dynamic programming for industrial applications and problems. Further, you will learn about Generalized Policy Iteration as a common template for constructing algorithms that maximize reward. For this week’s graded assessment, you will implement an efficient dynamic programming agent in a simulated industrial control problem.
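One dynamic programming method for computing optimal value functions from a known MDP model is value iteration. The sketch below is illustrative only and is not the course's industrial control assignment: the function names, the `transitions` dictionary layout, and the two-state `toy_mdp` are all hypothetical.

```python
def action_value(outcomes, values, gamma):
    """Expected one-step return for a list of (probability, next_state, reward) outcomes."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, theta=1e-8):
    """Value iteration on a known MDP.

    transitions[state][action] is a list of (probability, next_state, reward).
    A state with no actions is treated as terminal and keeps value 0.
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                continue
            # Bellman optimality backup: take the best expected one-step return.
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < theta:
            break
    # Extract a greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for s, actions in transitions.items()
        if actions
    }
    return values, policy


# Hypothetical toy MDP: from "A" the agent can stay (reward 0) or go to terminal "B" (reward 1).
toy_mdp = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 1.0)]},
    "B": {},
}
values, policy = value_iteration(toy_mdp)
```

In this toy example the greedy policy chooses "go" in state "A", because collecting the reward immediately is worth more than any discounted delay. Alternating an evaluation step and a greedy improvement step, as this loop does within each sweep, is one instance of the Generalized Policy Iteration template.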
The University of Alberta is considered among the world’s leading public research- and teaching-intensive universities, known for excellence across the humanities, sciences, creative arts, business, engineering and health sciences.
As one of Canada’s top universities, we are investing in purpose-built online post-secondary education—rooted in innovative digital pedagogies, world-class faculty, exceptional design, and a championed student experience.
The Alberta Machine Intelligence Institute (Amii) is home to some of the world’s top talent in machine intelligence. We’re an Alberta-based research institute that pushes the bounds of academic knowledge and guides business understanding of artificial intelligence and machine learning.
Learner reviews
4.8 (2,901 reviews)
5 stars: 81.77%
4 stars: 14.30%
3 stars: 2.61%
2 stars: 0.44%
1 star: 0.86%
SM, 5 stars, reviewed on May 6, 2023
Excellent course, with a very nice presentation style, both the professors are excellent in their presentations and the material is well researched and delivered. A very valuable course.
U, 4 stars, reviewed on Jan 2, 2021
The book is essential reading. It took me longer than the estimates to do the reading and the programming assignments. I would have liked more gridworld examples to get a faster hang of it.
RD, 5 stars, reviewed on Apr 25, 2020
I was so confused about the fundamental concepts, but doing this course has given me a solid foundation of RL. This is a must-do course if you are starting with Reinforcement Learning.
When will I have access to the lectures and assignments?
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.