This course covers the core techniques used in data mining, including frequent pattern analysis, classification, clustering, and outlier analysis, as well as methods for mining complex data and research frontiers in the data mining field.
This course is part of the Data Mining Foundations and Practice Specialization
About this Course
Intended for data science professionals or domain experts with some experience working with data and successful completion of Data Mining Pipeline.
What you will learn
Identify the core functionalities of data modeling in the data mining pipeline.
Apply techniques that can be used to accomplish the core functionalities of data modeling and explain how they work.
Evaluate data modeling techniques, determine which is most suitable for a particular task, and identify potential improvements.
Skills you will gain
- outlier analysis
- clustering
- classification
- model evaluation
- frequent pattern analysis
Offered by

University of Colorado Boulder
CU-Boulder is a dynamic community of scholars and learners on one of the most spectacular college campuses in the country. As one of 34 U.S. public institutions in the prestigious Association of American Universities (AAU), we have a proud tradition of academic excellence, with five Nobel laureates and more than 50 members of prestigious academic academies.
Syllabus - What you will learn from this course
Frequent Pattern Analysis
This module starts with an overview of data mining methods, then focuses on frequent pattern analysis, including the Apriori algorithm and FP-growth algorithm for frequent itemset mining, as well as association rules and correlation analysis.
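As a rough illustration of the level-wise idea behind Apriori-style frequent itemset mining, the following is a minimal sketch, not the course's own implementation. The function name `apriori`, the `min_support` parameter (an absolute count), and the toy transactions are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset that appears in
    at least `min_support` transactions (hypothetical helper)."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count how many transactions contain each candidate itemset,
        # keeping only those that meet the support threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: frequent single items.
    frequent = count({frozenset([i]) for t in transactions for i in t})
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = count(candidates)
        result.update(frequent)
        k += 1
    return result


# Toy data, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

The prune step relies on the property that any subset of a frequent itemset must also be frequent, which is what lets the search discard most candidates early.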
Classification
This module introduces supervised learning, classification, and prediction, and covers several core classification methods, including decision tree induction, Bayesian classification, support vector machines, neural networks, and ensemble methods. It also discusses how to evaluate and compare classification models.
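To make the idea of classification model evaluation concrete, here is a small sketch that computes common metrics for a binary classifier from true and predicted labels. The function name `evaluate`, the choice of metrics, and the example labels are assumptions for illustration; the course may emphasize other measures.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task
    (hypothetical helper; `positive` marks the class of interest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels, for illustration only.
metrics = evaluate(
    y_true=[1, 0, 1, 1, 0, 0, 1, 0],
    y_pred=[1, 0, 0, 1, 0, 1, 1, 0],
)
```

Comparing two models on the same held-out labels with a function like this is one simple way to ground a model comparison, although accuracy alone can mislead when classes are imbalanced.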
Clustering
This module introduces unsupervised learning and clustering, and covers several core clustering methods, including partitioning, hierarchical, grid-based, density-based, and probabilistic clustering. Advanced topics such as high-dimensional clustering, bi-clustering, graph clustering, and constraint-based clustering are also discussed.
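As one example of a partitioning method, the sketch below implements a basic k-means loop: assign each point to its nearest centroid, recompute centroids, and repeat until nothing changes. The function names, the fixed initial centroids, and the sample points are assumptions for this illustration; practical implementations usually add randomized or smarter initialization.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Group points by the index of their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, max_iter=100):
    """Basic k-means from caller-supplied initial centroids (hypothetical helper)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its members; keep it in place
        # if the cluster is empty.
        updated = [
            tuple(sum(x) / len(members) for x in zip(*members)) if members else c
            for c, members in zip(centroids, clusters)
        ]
        if updated == centroids:  # converged: assignments can no longer change
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Invented 2-D points, for illustration only.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

Hierarchical, density-based, and probabilistic methods follow different principles and are not captured by this sketch.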
Outlier Analysis
This module discusses three different types of outliers (global, contextual, and collective) and how different methods may be used to identify and analyze such outliers. It also covers some advanced methods for mining complex data, as well as the research frontiers of the data mining field.
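For the simplest of the three outlier types, a global outlier, one common baseline is to flag values that lie far from the mean in units of standard deviation. The sketch below assumes a z-score rule with a threshold of 2.0; the function name, the threshold, and the sample readings are illustrative assumptions, and contextual or collective outliers generally need methods that account for context or groups of points.

```python
from statistics import mean, pstdev


def global_outliers(values, threshold=2.0):
    """Return values whose absolute z-score exceeds `threshold`
    (hypothetical helper; the threshold is an assumed default)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Invented readings, for illustration only.
readings = [10, 12, 11, 13, 12, 11, 95]
flagged = global_outliers(readings)
```

Because an extreme value inflates both the mean and the standard deviation, z-scores in small samples are bounded, which is why a threshold lower than the often-quoted 3.0 is assumed here.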
About the Data Mining Foundations and Practice Specialization
The Data Mining Foundations and Practice Specialization is intended for data science professionals and domain experts who want to learn the fundamental concepts and core techniques for discovering patterns in large-scale data sets. This specialization consists of three courses: (1) Data Mining Pipeline, which introduces the key steps of data understanding, data preprocessing, data warehousing, data modeling, and interpretation/evaluation; (2) Data Mining Methods, which covers core techniques for frequent pattern analysis, classification, clustering, and outlier detection; and (3) Data Mining Project, which offers guidance and hands-on experience in designing and implementing a real-world data mining project.
