About this Course

44,617 recent views
Shareable Certificate
Earn a Certificate upon completion
100% online
Start instantly and learn on your own schedule.
Course 2 of 6 in the
Flexible deadlines
Reset deadlines in accordance with your schedule.
Intermediate Level
Approx. 7 hours to complete
English

Skills you will gain

Data Science
Artificial Intelligence (AI)
Machine Learning
Big Data
Spark

Offered by

IBM

Syllabus - What you will learn from this course

Content rating: 75% thumbs up (1,098 ratings)

Week 1: Introduction

2 hours to complete
6 videos (Total 44 min), 6 readings, 2 quizzes
6 videos
What is Big Data? (11m)
Data storage solutions (5m)
Parallel data processing strategies of Apache Spark (7m)
Functional programming basics (6m)
Resilient Distributed Dataset and DataFrames - Apache Spark SQL (6m)
6 readings
Course Syllabus (10m)
Setup of the grading and exercise environment (10m)
Exercise 1 - working with RDD (10m)
Exercise 2 - functional programming basics with RDDs (10m)
Exercise 3 - working with DataFrames (10m)
Programming Language Options for Apache Spark (optional) (10m)
2 practice exercises
Practice Quiz (Ungraded) - Apache Spark concepts (30m)
Apache Spark and parallel data processing

Week 2: Scaling Math for Statistics on Apache Spark

2 hours to complete
8 videos (Total 52 min), 3 readings, 4 quizzes
8 videos
Standard deviation (3m)
Skewness (3m)
Kurtosis (2m)
Covariance, covariance matrices, correlation (13m)
Plotting with Apache Spark and Python's matplotlib (12m)
Dimensionality reduction (4m)
PCA (5m)
3 readings
Exercise 1 - statistics and transformations using DataFrames (10m)
Exercise on Plotting (10m)
Exercise on PCA (10m)
4 practice exercises
Practice Quiz (Ungraded) - Statistics and API usage on Spark (30m)
Parallelism in Apache Spark
Questions on Plotting
Questions on PCA

Week 3: Introduction to Apache SparkML

1 hour to complete
5 videos (Total 34 min), 2 readings, 3 quizzes
5 videos
Introduction to SparkML (20m)
Extract - Transform - Load (3m)
Introduction to Clustering: k-Means (3m)
Using K-Means in Apache SparkML (2m)
2 readings
Exercise 1: Modifying an Apache SparkML Feature Engineering Pipeline (10m)
Exercise 2 - Working with Clustering and Apache SparkML (10m)
3 practice exercises
Practice Quiz (Ungraded) - ML Pipelines (30m)
SparkML concepts
Practice Quiz (Ungraded) - SparkML Algorithms

Week 4: Supervised and Unsupervised learning with SparkML

1 hour to complete
4 videos (Total 18 min), 2 readings, 2 quizzes
4 videos
LinearRegression with Apache SparkML (6m)
Logistic Regression (1m)
LogisticRegression with Apache SparkML (4m)
2 readings
Exercise 1 - Improving Classification performance (10m)
Course Project (10m)
2 practice exercises
Practice Quiz (Ungraded) - SparkML Algorithms (2) (30m)
Course Project Quiz

Reviews

TOP REVIEWS FROM SCALABLE MACHINE LEARNING ON BIG DATA USING APACHE SPARK

About the IBM AI Engineering Professional Certificate

IBM AI Engineering