University of Alberta
Fundamentals of Reinforcement Learning

Instructors: Martha White, Adam White

92,476 already enrolled

Included with Coursera Plus

Gain insight into a topic and learn the fundamentals.
4.8

(2,772 reviews)

Intermediate level

Flexible schedule
Approx. 15 hours
Learn at your own pace
92%
Most learners liked this course

What you'll learn

  • Formalize problems as Markov Decision Processes

  • Understand basic exploration methods and the exploration / exploitation tradeoff

  • Understand value functions as a general-purpose tool for optimal decision-making

  • Know how to implement dynamic programming as an efficient solution approach to an industrial control problem

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

5 assignments

Taught in English

Build your subject-matter expertise

This course is part of the Reinforcement Learning Specialization
When you enroll in this course, you'll also be enrolled in this Specialization.

There are 5 modules in this course

Welcome to Fundamentals of Reinforcement Learning, the first course in a four-part specialization on Reinforcement Learning brought to you by the University of Alberta, Onlea, and Coursera. In this pre-course module, you'll be introduced to your instructors, get a flavour of what the course has in store for you, and be given an in-depth roadmap to help make your journey through this specialization as smooth as possible.

What's included

4 videos, 2 readings, 1 discussion prompt

In the first week of this course, you will learn to understand the exploration-exploitation trade-off in sequential decision-making, implement incremental algorithms for estimating action-values, and compare the strengths and weaknesses of different algorithms for exploration. For this week’s graded assessment, you will implement and test an epsilon-greedy agent.

What's included

8 videos, 3 readings, 1 assignment, 1 programming assignment, 1 discussion prompt, 2 plugins
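
As a rough illustration of the first-week topics (not taken from the course materials), the sketch below shows an epsilon-greedy agent for a k-armed bandit that updates its action-value estimates incrementally with a sample average. The class name `EpsilonGreedyAgent`, the helper `run_bandit`, and the unit-variance Gaussian reward noise are all assumptions made for this example.

```python
import random


class EpsilonGreedyAgent:
    """Illustrative epsilon-greedy agent for a k-armed bandit (hypothetical, not course code)."""

    def __init__(self, num_actions, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.q = [0.0] * num_actions      # current action-value estimates
        self.counts = [0] * num_actions   # number of times each action was taken
        self.rng = random.Random(seed)

    def select_action(self):
        # Explore with probability epsilon, otherwise exploit the greedy action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        best = max(self.q)
        # Break ties randomly among the greedy actions.
        return self.rng.choice([a for a, v in enumerate(self.q) if v == best])

    def update(self, action, reward):
        # Incremental sample average: Q <- Q + (1 / N) * (R - Q)
        self.counts[action] += 1
        self.q[action] += (reward - self.q[action]) / self.counts[action]


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Run the agent on a bandit whose rewards are Gaussian around true_means (an assumption)."""
    agent = EpsilonGreedyAgent(len(true_means), epsilon, seed)
    noise = random.Random(seed + 1)
    for _ in range(steps):
        action = agent.select_action()
        reward = noise.gauss(true_means[action], 1.0)
        agent.update(action, reward)
    return agent
```

With `epsilon=0.0` the agent never explores and can lock onto a poor action early; a small positive `epsilon` trades a little reward for continued exploration, which is the trade-off the module description refers to.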

When you’re presented with a problem in industry, the first and most important step is to translate that problem into a Markov Decision Process (MDP). The quality of your solution depends heavily on how well you do this translation. This week, you will learn the definition of MDPs, you will understand goal-directed behavior and how this can be obtained from maximizing scalar rewards, and you will also understand the difference between episodic and continuing tasks. For this week’s graded assessment, you will create three example tasks of your own that fit into the MDP framework.

What's included

7 videos, 2 readings, 1 assignment, 1 peer review, 1 discussion prompt
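
To make the idea of translating a problem into an MDP more concrete, here is a minimal sketch of one way to write down a tiny MDP as a table of dynamics p(s', r | s, a) and to compute a discounted return. The machine-maintenance scenario, the state and action names, and all probabilities and rewards are invented for illustration and do not come from the course.

```python
# Hypothetical two-state MDP: a machine is either "ok" or "worn".
# DYNAMICS maps (state, action) to a list of (probability, next_state, reward).
DYNAMICS = {
    ("ok", "run"): [(0.9, "ok", 1.0), (0.1, "worn", 1.0)],
    ("ok", "repair"): [(1.0, "ok", -0.5)],
    ("worn", "run"): [(1.0, "worn", 0.2)],
    ("worn", "repair"): [(1.0, "ok", -0.5)],
}


def validate_dynamics(dynamics, tol=1e-9):
    """Check that outcome probabilities sum to 1 for every (state, action) pair."""
    for key, outcomes in dynamics.items():
        total = sum(p for p, _, _ in outcomes)
        if abs(total - 1.0) > tol:
            raise ValueError(f"probabilities for {key} sum to {total}")
    return True


def expected_reward(dynamics, state, action):
    """Expected immediate reward r(s, a) under the given dynamics."""
    return sum(p * r for p, _, r in dynamics[(state, action)])


def discounted_return(rewards, gamma):
    """Return G = R_1 + gamma * R_2 + gamma^2 * R_3 + ... for a finite reward sequence."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For an episodic task the reward sequence ends at a terminal state, so `gamma=1.0` gives a finite sum; for a continuing task a discount `gamma < 1` keeps the infinite sum bounded when rewards are bounded.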

Once the problem is formulated as an MDP, finding the optimal policy is more efficient when using value functions. This week, you will learn the definitions of policies and value functions, as well as the Bellman equations, which are the key tool that all of our algorithms will use.

What's included

9 videos, 3 readings, 2 assignments, 1 discussion prompt
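
For reference, the Bellman equations mentioned above are commonly written as follows. The notation is the standard textbook one and is an assumption here, since the course page does not fix any symbols: π(a | s) is the policy, p(s', r | s, a) the MDP dynamics, γ the discount factor, and v and q the state-value and action-value functions.

```latex
\begin{align*}
v_\pi(s) &= \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_\pi(s')\bigr] \\
q_\pi(s, a) &= \sum_{s', r} p(s', r \mid s, a)\,\Bigl[r + \gamma \sum_{a'} \pi(a' \mid s')\, q_\pi(s', a')\Bigr] \\
v_*(s) &= \max_{a} \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_*(s')\bigr]
\end{align*}
```

The first two lines relate the value of a state or action under a fixed policy to the values of its successors; the third is the Bellman optimality equation for the optimal state-value function.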

This week, you will learn how to compute value functions and optimal policies, assuming you have the MDP model. You will implement dynamic programming to compute value functions and optimal policies and understand the utility of dynamic programming for industrial applications and problems. Further, you will learn about Generalized Policy Iteration as a common template for constructing algorithms that maximize reward. For this week’s graded assessment, you will implement an efficient dynamic programming agent in a simulated industrial control problem.

What's included

10 videos, 3 readings, 1 assignment, 1 programming assignment, 1 discussion prompt
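
As one example of dynamic programming with a known MDP model, the sketch below implements value iteration, which can be viewed as an instance of Generalized Policy Iteration. The choice of value iteration, the function name, the dynamics format, and the two-state example MDP are assumptions for illustration; this is not the course's industrial control assignment.

```python
def value_iteration(states, actions, dynamics, gamma=0.9, theta=1e-8):
    """Compute approximately optimal values and a greedy policy.

    dynamics maps (state, action) to a list of (probability, next_state, reward).
    """
    v = {s: 0.0 for s in states}

    def action_value(s, a):
        # Expected one-step return of taking a in s, then following current values.
        return sum(p * (r + gamma * v[s2]) for p, s2, r in dynamics[(s, a)])

    while True:
        delta = 0.0
        for s in states:
            best = max(action_value(s, a) for a in actions)
            delta = max(delta, abs(best - v[s]))
            v[s] = best  # in-place sweep
        if delta < theta:
            break

    # Extract the greedy policy with respect to the converged values.
    policy = {s: max(actions, key=lambda a: action_value(s, a)) for s in states}
    return v, policy


# Hypothetical deterministic MDP: staying in "B" pays 1 per step, everything else pays 0.
STATES = ["A", "B"]
ACTIONS = ["stay", "move"]
TOY_DYNAMICS = {
    ("A", "stay"): [(1.0, "A", 0.0)],
    ("A", "move"): [(1.0, "B", 0.0)],
    ("B", "stay"): [(1.0, "B", 1.0)],
    ("B", "move"): [(1.0, "A", 0.0)],
}

values, greedy_policy = value_iteration(STATES, ACTIONS, TOY_DYNAMICS)
```

In this toy problem the optimal values can be checked by hand: staying in "B" forever is worth 1 / (1 - 0.9) = 10, and moving from "A" to "B" is worth 0.9 × 10 = 9.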

Instructors

Instructor ratings
4.7 (796 ratings)
Martha White
University of Alberta
4 courses, 97,777 learners
Adam White
University of Alberta
4 courses, 97,777 learners

Learner reviews

4.8 average rating from 2,772 reviews (showing 3)

  • 5 stars: 81.73%

  • 4 stars: 14.51%

  • 3 stars: 2.55%

  • 2 stars: 0.43%

  • 1 star: 0.75%

AM, 5 stars, reviewed on Jul 1, 2021

MN, 5 stars, reviewed on Apr 11, 2024

AB, 5 stars, reviewed on Sep 6, 2019
