Why write programs when the computer can instead learn them from data? In this class you will learn how to make this happen, from the simplest machine learning algorithms to quite sophisticated ones. Enjoy!

Machine learning algorithms can figure out how to perform important tasks by generalizing from examples. This is often feasible and cost-effective when manual programming is not. Machine learning (also known as data mining, pattern recognition and predictive analytics) is used widely in business, industry, science and government, and there is a great shortage of experts in it. If you pick up a machine learning textbook you may find it forbiddingly mathematical, but in this class you will learn that the key ideas and algorithms are in fact quite intuitive. And powerful!

Most of the class will be devoted to supervised learning (in other words, learning in which a teacher provides the learner with the correct answers at training time). This is the most mature and widely used type of machine learning. We will cover the main supervised learning techniques, including decision trees, rules, instances, Bayesian techniques, neural networks, model ensembles, and support vector machines. We will also touch on learning theory with an emphasis on its practical uses. Finally, we will cover the two main classes of unsupervised learning methods: clustering and dimensionality reduction. Throughout the class there will be an emphasis not just on individual algorithms but on ideas that cut across them and tips for making them work.
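As a tiny illustration of the supervised setting described above (a teacher supplies the correct answers at training time), here is a from-scratch sketch of a 1-nearest-neighbor classifier, one of the instance-based methods the class covers. The data and labels are invented for illustration, not part of any course assignment:

```python
# A minimal sketch of supervised learning: a 1-nearest-neighbor
# classifier built from scratch. The training data (feature vectors
# with known labels) is invented for illustration.

def predict(train, x):
    """Return the label of the training example closest to x."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    # The "teacher" supplies the correct label for each training point;
    # prediction simply copies the label of the nearest one.
    _, label = min(train, key=lambda ex: dist2(ex[0], x))
    return label

# Training set: (features, label) pairs provided by the teacher.
train = [((1.0, 1.0), "spam"), ((1.2, 0.9), "spam"),
         ((5.0, 5.0), "ham"), ((4.8, 5.2), "ham")]

print(predict(train, (1.1, 1.0)))  # → spam
print(predict(train, (5.1, 4.9)))  # → ham
```

The learner never sees an explicit program for telling spam from ham; it generalizes directly from the labeled examples.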

In the class projects you will build your own implementations of machine learning algorithms and apply them to problems like spam filtering, clickstream mining, recommender systems, and computational biology. This will get you as close to becoming a machine learning expert as you can in ten weeks!

Week One: Basic concepts in machine learning.

Week Two: Decision tree induction.

Week Three: Learning sets of rules and logic programs.

Week Four: Instance-based learning.

Week Five: Statistical learning.

Week Six: Neural networks.

Week Seven: Model ensembles.

Week Eight: Learning theory.

Week Nine: Support vector machines.

Week Ten: Clustering and dimensionality reduction.

The main prerequisite for this class is basic knowledge of programming. Some previous exposure to probability, statistics, linear algebra, calculus and/or logic is useful but not essential.

The class is self-contained, but a good complement to it is the book *The Master Algorithm*, by Pedro Domingos, published by Basic Books. For a more technical treatment, the textbook *Machine Learning*, by Tom Mitchell, published by McGraw-Hill, covers most but not all of the topics in the class. The remaining topics can be found in *Pattern Classification* (second edition), by Duda, Hart and Stork (Wiley), and other textbooks.

The class will consist of a series of lecture videos, typically 5 to 15 minutes in length. Each video contains a few integrated quiz questions. There will also be standalone homework assignments (separate from the in-video quizzes), programming assignments, and a final exam.

**Will I get a certificate after completing this class?**

**What resources will I need for this class?**

You will need access to a computer with a compiler/environment for the programming language of your choice.

**What is the coolest thing I'll learn if I take this class?**

Machine learning is the scientific method on steroids. It follows the same process of generating, testing, and discarding or refining hypotheses. But, while a scientist may spend her whole life coming up with and testing a few hundred hypotheses, a machine learning system can do the same in a fraction of a second.
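The generate-test-refine loop just described can be sketched in a few lines. The hypothesis space here (one-dimensional threshold rules) and the data are invented for illustration; the point is only the shape of the loop:

```python
# A sketch of the generate-test-discard loop: enumerate simple
# candidate hypotheses (threshold rules on one feature, invented
# for illustration), test each against the data, keep the best.

data = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]  # (feature, label) pairs

def accuracy(threshold):
    """Test the hypothesis 'predict 1 when the feature exceeds threshold'."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)

# Generate many hypotheses, test them all, discard all but the best.
candidates = [i / 100 for i in range(100)]
best = max(candidates, key=accuracy)
print(accuracy(best))  # → 1.0
```

A scientist would test each such hypothesis by hand; the machine tests a hundred of them before the print statement runs.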