About this Course

170,283 recent views

Learner Career Outcomes

25% started a new career after completing these courses

11% got a tangible career benefit from this course
Shareable Certificate
Earn a Certificate upon completion
100% online
Start instantly and learn on your own schedule.
Flexible deadlines
Reset deadlines in accordance with your schedule.
Intermediate Level

Prerequisites: probabilities and expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year of experience), and implementing algorithms from pseudocode.

Approx. 15 hours to complete
English

What you will learn

  • Formalize problems as Markov Decision Processes

  • Understand basic exploration methods and the exploration / exploitation tradeoff

  • Understand value functions as a general-purpose tool for optimal decision-making

  • Know how to implement dynamic programming as an efficient solution approach to an industrial control problem

Skills you will gain

Artificial Intelligence (AI), Machine Learning, Reinforcement Learning, Function Approximation, Intelligent Systems

Offered by

University of Alberta

Alberta Machine Intelligence Institute

Syllabus - What you will learn from this course

Content rating: 93% thumbs up (10,350 ratings)

Week 1

Welcome to the Course!

1 hour to complete
4 videos (Total 20 min), 2 readings
4 videos
Course Introduction (5 min)
Meet your instructors! (8 min)
Your Specialization Roadmap (3 min)
2 readings
Reinforcement Learning Textbook (10 min)
Read Me: Pre-requisites and Learning Objectives (10 min)

An Introduction to Sequential Decision-Making

4 hours to complete
8 videos (Total 46 min), 3 readings, 2 quizzes
8 videos
Learning Action Values (4 min)
Estimating Action Values Incrementally (5 min)
What is the trade-off? (7 min)
Optimistic Initial Values (6 min)
Upper-Confidence Bound (UCB) Action Selection (5 min)
Jonathan Langford: Contextual Bandits for Real World Reinforcement Learning (8 min)
Week 1 Summary (3 min)
3 readings
Module 1 Learning Objectives (10 min)
Weekly Reading (30 min)
Chapter Summary (30 min)
1 practice exercise
Sequential Decision-Making (45 min)

Week 2

Markov Decision Processes

3 hours to complete
7 videos (Total 36 min), 2 readings, 2 quizzes
7 videos
Examples of MDPs (4 min)
The Goal of Reinforcement Learning (3 min)
Michael Littman: The Reward Hypothesis (12 min)
Continuing Tasks (5 min)
Examples of Episodic and Continuing Tasks (3 min)
Week 2 Summary (1 min)
2 readings
Module 2 Learning Objectives (10 min)
Weekly Reading (30 min)
1 practice exercise
MDPs (45 min)

Week 3

Value Functions & Bellman Equations

3 hours to complete
9 videos (Total 56 min), 3 readings, 2 quizzes
9 videos
Value Functions (6 min)
Rich Sutton and Andy Barto: A brief History of RL (7 min)
Bellman Equation Derivation (6 min)
Why Bellman Equations? (5 min)
Optimal Policies (7 min)
Optimal Value Functions (5 min)
Using Optimal Value Functions to Get Optimal Policies (8 min)
Week 3 Summary (4 min)
3 readings
Module 3 Learning Objectives (10 min)
Weekly Reading (30 min)
Chapter Summary (13 min)
2 practice exercises
[Practice] Value Functions and Bellman Equations (45 min)
Value Functions and Bellman Equations (45 min)

Week 4

Dynamic Programming

4 hours to complete
10 videos (Total 72 min), 3 readings, 2 quizzes
10 videos
Iterative Policy Evaluation (8 min)
Policy Improvement (4 min)
Policy Iteration (8 min)
Flexibility of the Policy Iteration Framework (4 min)
Efficiency of Dynamic Programming (5 min)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Short) (7 min)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Long) (21 min)
Week 4 Summary (2 min)
Congratulations! (3 min)
3 readings
Module 4 Learning Objectives (10 min)
Weekly Reading (30 min)
Chapter Summary (30 min)
1 practice exercise
Dynamic Programming (45 min)

Reviews

TOP REVIEWS FROM FUNDAMENTALS OF REINFORCEMENT LEARNING

About the Reinforcement Learning Specialization

Reinforcement Learning
