Date: 2022-08-04
About this Course

Flexible deadlines
Reset deadlines in accordance with your schedule.
Shareable Certificate
Earn a Certificate upon completion.
100% online
Start instantly and learn on your own schedule.
Coursera Labs
Includes hands-on learning projects.
Intermediate Level

Introductory computer science and data structures. Familiarity with Python. Familiarity with basic probability and optimization.

Approx. 44 hours to complete
English

What you will learn

  • Map between qualitative preferences and appropriate quantitative utilities.

  • Model non-associative and associative sequential decision problems with multi-armed bandit problems and Markov decision processes, respectively.

  • Implement dynamic programming algorithms to find optimal policies.

  • Implement basic reinforcement learning algorithms using Monte Carlo and temporal difference methods.

Skills you will gain

  • Deep Learning
  • Markov Decision Process
  • Machine Learning
  • Reinforcement Learning
  • Monte Carlo Method

Columbia University

Syllabus - What you will learn from this course

Week 1
6 hours to complete

Decision Making and Utility Theory

Week 2
4 hours to complete

Bandit Problems

Week 3
4 hours to complete

Markov Decision Processes

Week 4
8 hours to complete

Dynamic Programming

6 videos (Total 42 min), 1 reading, 3 quizzes

