About this Course

Shareable Certificate
Earn a Certificate upon completion
100% online
Start instantly and learn on your own schedule.
Flexible deadlines
Reset deadlines in accordance with your schedule.
Intermediate Level

Prerequisite: completion of the first two courses in the Data Science with Databricks for Data Analysts Coursera specialization. This is the final course in the specialization.

Approx. 16 hours to complete
English

What you will learn

  • Explore data using unsupervised machine learning.

  • Solve complex supervised learning problems using tree-based models.

  • Apply hyperparameter tuning and cross-validation strategies to improve model performance.

Skills you will gain

Data Science, Machine Learning, Databricks

Offered by

Databricks

Syllabus - What you will learn from this course

Week 1

2 hours to complete

Welcome to the Course

8 videos (Total 40 min), 2 readings, 1 quiz
8 videos
  • Review of Data Science, 4 min
  • Review of Machine Learning, 5 min
  • Data Science Process vs. Machine Learning Workflow, 3 min
  • Introduction to Databricks (Optional), 5 min
  • Introduction to the Platform (Optional), 7 min
  • Introduction to Apache Spark (Optional), 4 min
  • Introduction to Delta Lake (Optional), 6 min
2 readings
  • Before you begin, 10 min
  • Hands-on with Databricks Lab (Optional), 30 min
1 practice exercise
  • Course Introduction and Prerequisites, 5 min

Week 2

3 hours to complete

Applied Unsupervised Learning

15 videos (Total 69 min), 2 readings, 6 quizzes
15 videos
  • Exploring Data, 4 min
  • Visualizing Data, 11 min
  • Introduction to K-means Clustering, 5 min
  • Applied K-means Clustering, 7 min
  • Identifying the Number of Clusters, 4 min
  • Identifying the Number of Clusters Demo, 4 min
  • Utilizing Clusters, 2 min
  • Lesson Introduction, 1 min
  • Feature Relationships, 4 min
  • Correlation Matrix, 4 min
  • Introduction to Principal Components Analysis, 3 min
  • Applied Principal Components Analysis, 6 min
  • PCA for Feature Relationships, 3 min
  • PCA for Dimensionality Reduction, 3 min
2 readings
  • K-means Clustering Lab, 30 min
  • Principal Components Analysis Lab, 30 min
6 practice exercises
  • Exploring and Visualizing Data, 5 min
  • K-means Clustering, 5 min
  • K-means Clustering Lab Results, 5 min
  • Feature Correlation, 5 min
  • Principal Components Analysis, 5 min
  • PCA Lab Results, 10 min

Week 3

3 hours to complete

Feature Engineering and Selection

17 videos (Total 71 min), 2 readings, 6 quizzes
17 videos
  • Introduction to Feature Engineering, 3 min
  • Common Feature Improvements, 3 min
  • Handling Missing Values, 3 min
  • Imputing Missing Values, 11 min
  • Feature Scaling, 3 min
  • Converting Feature Types, 3 min
  • Representing Categorical Features, 2 min
  • One-hot Encoding, 7 min
  • Lesson Introduction, 2 min
  • Problems with High Dimensions and Dimensionality Reduction, 3 min
  • A Review of Feature Importance, 4 min
  • Linear Regression Coefficients and P-values, 6 min
  • Introduction to Feature Selection, 2 min
  • Regularization, 1 min
  • Regularized Regression, 3 min
  • Applied Regularized Regression, 5 min
2 readings
  • Feature Engineering Lab, 30 min
  • Feature Selection Lab, 30 min
6 practice exercises
  • Feature Engineering Concepts, 5 min
  • Missing Values, 5 min
  • Feature Engineering Lab Results, 20 min
  • Dimensionality and Feature Importance, 5 min
  • Feature Selection in Linear Regression, 5 min
  • Feature Selection Lab Results, 10 min

Week 4

6 hours to complete

Applied Tree-based Models

14 videos (Total 60 min), 10 readings, 9 quizzes
14 videos
  • A Review of Decision Trees, 3 min
  • Algorithm Selection, 2 min
  • String Indexing Categorical Features, 8 min
  • Decision Tree Pruning, 6 min
  • Lesson Introduction, 32 sec
  • Introduction to Ensemble Modeling, 3 min
  • Bootstrap Sampling Training Data, 2 min
  • Applied Random Forest, 4 min
  • Lesson Introduction, 56 sec
  • A Review of Classification Evaluation Metrics, 4 min
  • A Review of Assigning Classes, 4 min
  • Oversampling and Undersampling Classes, 4 min
  • Weighting Classes in Random Forest, 11 min
10 readings
  • Feature Engineering in Decision Trees, 5 min
  • Preventing Overfitting, 5 min
  • Applied Decision Trees Lab, 30 min
  • Aggregating Bootstrapped Results, 7 min
  • Random Forest Algorithm, 10 min
  • Applied Random Forest Lab, 30 min
  • Problems with Class Imbalance, 5 min
  • Label-based Bootstrap Sampling, 4 min
  • Label-based Evaluation Weighting, 6 min
  • Label Imbalance Lab, 30 min
9 practice exercises
  • Algorithm Selection and Decision Trees, 10 min
  • Categorical Features, 5 min
  • Applied Decision Trees Lab Results, 10 min
  • Tree-based Ensemble Modeling, 30 min
  • Bootstrap Aggregation, 30 min
  • Applied Random Forest Lab Results, 5 min
  • Classification Evaluation, 30 min
  • Label Imbalance and Sampling, 30 min
  • Label Imbalance Lab Results, 10 min

Reviews

TOP REVIEWS FROM APPLIED DATA SCIENCE FOR DATA ANALYSTS
