In this class, you will learn fundamental algorithms and mathematical models for processing natural language, and how these can be used to solve practical problems.
This course covers a broad range of topics in natural language processing, including word and sentence tokenization, text classification and sentiment analysis, spelling correction, information extraction, parsing, meaning extraction, and question answering. We will also introduce the underlying theory from probability, statistics, and machine learning that is crucial for the field, and cover fundamental algorithms like n-gram language modeling, Naive Bayes and maxent classifiers, sequence models like Hidden Markov Models, probabilistic dependency and constituent parsing, and vector-space models of meaning.
We are offering this course on Natural Language Processing free and online to students worldwide, continuing Stanford's exciting forays into large-scale online instruction. Students have access to screencast lecture videos, are given quiz questions, assignments and exams, receive regular feedback on progress, and can participate in a discussion forum. Those who successfully complete the course will receive a statement of accomplishment. Taught by Professors Jurafsky and Manning, the curriculum draws from Stanford's courses in Natural Language Processing. You will need a decent internet connection for accessing course materials, but should be able to watch the videos on your smartphone.
The following topics will be covered in the first two weeks:
No background in natural language processing is required. Students will be expected to know a bit of basic probability (know Bayes' rule), a bit about vectors and vector spaces (be able to length-normalize a vector), and a bit of calculus (know that the derivative of a function is zero at a maximum or minimum of the function), but we will review these concepts as we first use them. You should have reasonable programming ability (know about hash tables and graph data structures), be able to write programs in Java or Python, and have a computer (Windows, Mac or Linux) with internet access.
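As a rough self-check on the level of math and programming assumed, the short Python sketch below length-normalizes a vector and applies Bayes' rule to a binary hypothesis. The function names and all the numbers (a made-up spam-filter scenario) are illustrative assumptions, not course material.

```python
import math


def length_normalize(vec):
    """Scale a vector to unit Euclidean (L2) length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)  # the zero vector has no direction; return it unchanged
    return [x / norm for x in vec]


def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return P(H | E) for a binary hypothesis H using Bayes' rule.

    prior                = P(H)
    likelihood           = P(E | H)
    likelihood_given_not = P(E | not H)
    """
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# A vector of length 5 becomes a vector of length 1.
unit = length_normalize([3.0, 4.0])

# Hypothetical numbers: 20% of mail is spam, a given word appears in
# 90% of spam and in 10% of non-spam messages.
posterior = bayes_posterior(prior=0.2, likelihood=0.9, likelihood_given_not=0.1)

print(unit)
print(round(posterior, 3))
```

If both functions read as straightforward, the probability and vector prerequisites are likely covered.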
We will provide detailed lecture notes of all the technical content, which will be yours to keep after the end of class. Many students do fine just working from the lectures and notes. But others find it very useful to have an accompanying textbook, for reinforcing the core material, as a source of additional exercises, and as a reference for the future.
To prepare for the class in advance, you may consider reading through some sections of the textbooks (Jurafsky and Martin, Speech and Language Processing 2nd Edition, and Manning, Schütze and Raghavan 2008). Or, if you're rusty or not very experienced in either Java or Python, it'd be great to work through the early parts of Bird, Klein and Loper 2009.
Yes. Students who successfully complete the class will receive a statement of accomplishment signed by the instructor.
The class will consist of lecture videos, which are broken into small chunks, usually between 8 and 12 minutes each. Some of these may contain integrated quiz questions. There will also be standalone quizzes that are not part of video lectures, and programming assignments.
You need to work about 10 hours a week to complete the course.
Why Study Natural Language Processing?
Natural language processing is the technology for dealing with our most ubiquitous product: human language, as it appears in emails, web pages, tweets, product descriptions, newspaper stories, social media, and scientific articles, in thousands of languages and varieties. In the past decade, successful natural language processing applications have become part of our everyday experience, from spelling and grammar correction in word processors to machine translation on the web, from email spam detection to automatic question answering, from detecting people's opinions about products or services to extracting appointments from your email. In this class, you'll learn the fundamental algorithms and mathematical models for human language processing and how you can use them to solve practical problems in dealing with language data wherever you encounter it.