About this Course

47,641 recent views
Flexible deadlines
Reset deadlines in accordance with your schedule.
Shareable Certificate
Earn a Certificate upon completion
100% online
Start instantly and learn at your own schedule.
Coursera Labs
Includes hands-on learning projects.
Beginner Level

Computer and IT literacy.

Approx. 7 hours to complete
English

What you will learn

  • Explain how streaming data and Spark Structured Streaming empower machine learning and AI tasks.

  • Define graph theory, describe Apache Spark GraphFrames, and identify data suitable for GraphFrames.

  • Describe how ETL processes work with Apache Spark and machine learning and extend that knowledge to Spark MLlib capabilities and related benefits.

  • Explain supervised learning, unsupervised learning, and clustering, and explain how to use the k-means clustering algorithm with Spark MLlib.

Offered by

IBM Skills Network

Syllabus - What you will learn from this course

Week 1
2 hours to complete

Spark for Data Engineering

4 videos (Total 25 min), 2 readings, 3 quizzes
Week 2
3 hours to complete

SparkML

3 videos (Total 13 min), 1 reading, 4 quizzes
Week 3
3 hours to complete

Final Project

3 readings
