Machine Learning with Apache Spark

This course is part of multiple programs.

Taught in English

Instructors: IBM Skills Network Team, Ramesh Sannareddy

6,846 already enrolled

Included with Coursera Plus


Gain insight into a topic and learn the fundamentals


(56 reviews)

Intermediate level

14 hours (approximately)
Flexible schedule
Learn at your own pace

What you'll learn

  • Describe ML, explain its role in data engineering, summarize generative AI, discuss Spark's uses, and analyze ML pipelines and model persistence.

  • Evaluate ML models, distinguish between regression, classification, and clustering models, and compare data engineering pipelines with ML pipelines.

  • Construct data analysis processes using Spark SQL, and perform regression, classification, and clustering using SparkML.

  • Connect to Spark clusters, build ML pipelines, perform feature extraction and transformation, and persist models.

Details to know

Shareable certificate

Add to your LinkedIn profile


7 assignments





There are 4 modules in this course

In this module, you will learn about machine learning techniques that enable computers to perform tasks without explicit programming. You will explore the lifecycle of machine learning models and the role of data engineering in machine learning projects. The module covers supervised and unsupervised learning techniques, including classification, regression, and clustering. You will also learn about generative AI and its potential to transform multiple industries, improve people's lives, and generate new, previously unimaginable data and experiences.

What's included

11 videos, 4 readings, 2 assignments, 5 app items

This module introduces Spark and gives an overview of its key features and applications in data engineering. You will learn how to connect to a Spark cluster using SN Labs and delve into topics such as regression, mileage prediction, classification, diabetic classification, clustering, and clustering load data, all using SparkML, gaining insight into how to construct each of these models. The module also covers GraphFrames on Apache Spark and guides you through hands-on labs.

What's included

5 videos, 2 readings, 2 assignments, 4 app items

This module begins with Apache Spark Structured Streaming and its role in processing streaming data with Spark SQL, including the key terms associated with Structured Streaming. It then covers the Extract-Transform-Load (ETL) process and gives hands-on experience in moving data from a source to a destination that uses a different data format or structure. You will also gain a practical understanding of feature extraction and transformation using Spark's feature extractors and transformers. The module then covers machine learning pipelines in Spark, demonstrating the process and its benefits. Lastly, you will learn the concept of model persistence and its role in machine learning.

What's included

6 videos, 2 readings, 2 assignments, 5 app items, 1 plugin
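
The ideas this module names (a feature transformation stage, a pipeline that chains stages, and model persistence) can be sketched in a few lines. The example below is a minimal, plain-Python illustration of the concepts only. It is not Spark ML's API, and the class names, the JSON file format, and the tiny weight-versus-mileage dataset are all assumptions made up for illustration.

```python
import json
import os
import tempfile
from statistics import mean, pstdev


class StandardScaler:
    """Feature transformation stage: rescale to zero mean and unit variance."""

    def fit(self, xs, ys=None):
        self.mean, self.std = mean(xs), pstdev(xs) or 1.0
        return self

    def transform(self, xs):
        return [(x - self.mean) / self.std for x in xs]


class LinearRegression:
    """Model stage: one-feature ordinary least squares."""

    def fit(self, xs, ys):
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        self.slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        self.intercept = my - self.slope * mx
        return self

    def transform(self, xs):
        return [self.slope * x + self.intercept for x in xs]


class Pipeline:
    """Chains stages so fitting and prediction apply them in order."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, xs, ys):
        for stage in self.stages:
            stage.fit(xs, ys)
            xs = stage.transform(xs)  # the next stage sees transformed features
        return self

    def predict(self, xs):
        for stage in self.stages:
            xs = stage.transform(xs)
        return xs

    def save(self, path):
        # Persistence: store each stage's learned parameters as JSON.
        with open(path, "w") as f:
            json.dump([vars(stage) for stage in self.stages], f)

    def load(self, path):
        with open(path) as f:
            for stage, params in zip(self.stages, json.load(f)):
                vars(stage).update(params)
        return self


# Hypothetical data: vehicle weight (tons) against mileage (mpg).
weights, mileage = [1.0, 2.0, 3.0, 4.0], [30.0, 25.0, 20.0, 15.0]
trained = Pipeline([StandardScaler(), LinearRegression()]).fit(weights, mileage)

path = os.path.join(tempfile.mkdtemp(), "pipeline.json")
trained.save(path)
restored = Pipeline([StandardScaler(), LinearRegression()]).load(path)

original_prediction = trained.predict([5.0])
restored_prediction = restored.predict([5.0])
```

The restored pipeline is never refit: it only reads back the parameters the trained pipeline saved, which is the point of persistence. A trained model can be stored once and reused later or elsewhere without repeating training.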

In this module, you will apply the data engineering skills and techniques you have acquired throughout the course. The course concludes with a final project and assignments that let you demonstrate your proficiency in these areas. You will step into the role of a data engineer at a renowned aeronautics consulting company known for handling large datasets. The company's data scientists have expertise in machine learning, but they rely on you to carry out ETL (Extract, Transform, Load) tasks, establish machine learning pipelines, and handle various algorithms and data formats. Your contribution is vital to the smooth execution of their tasks.

What's included

4 readings, 1 assignment, 2 app items


Instructor ratings
4.8 (19 ratings)
IBM Skills Network Team
55 courses, 753,364 learners


