Real-time data is everywhere — from fraud detection in financial transactions to personalized recommendations in e-commerce and anomaly detection in IoT devices. Traditional batch processing is too slow for these use cases, and businesses need insights the moment data is generated. This course teaches you how to design, build, and operate reliable streaming pipelines using Apache Spark Structured Streaming and Kafka.

Process Real-Time Data with Spark Streams

This course is part of the Real-Time, Real Fast: Kafka & Spark for Data Engineers Specialization.


Instructor: Caio Avelino
What you'll learn
Explain the execution model of Spark Structured Streaming and build a simple pipeline from a file source to a console sink.
Develop streaming pipelines that integrate with Kafka, apply event-time processing with watermarks, and write reliable outputs to Delta Lake.
Build an end-to-end Spark streaming pipeline that can be deployed in real-world production environments.
Details to know
January 2026

There are 3 modules in this course
Learners are introduced to the Spark Structured Streaming model and its core concepts, including micro-batch execution, triggers, checkpoints, output modes and data transformation.
What's included
4 videos, 3 readings
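To make the concepts in this module concrete, here is a minimal PySpark-style sketch of a file source feeding a console sink. It is an illustration, not course material: the function name, the schema, the directory arguments, and the 10-second trigger interval are all assumptions. The `spark` argument is assumed to be an existing `SparkSession`.

```python
# Hypothetical schema for illustration. By default, streaming file sources
# require the schema to be declared up front rather than inferred.
EVENT_SCHEMA = "user_id STRING, amount DOUBLE, event_time TIMESTAMP"


def start_file_to_console(spark, input_dir, checkpoint_dir):
    """Start a streaming query that reads JSON files and prints new rows.

    `spark` is assumed to be an existing pyspark.sql.SparkSession.
    """
    events = (
        spark.readStream
        .schema(EVENT_SCHEMA)  # DDL-formatted schema string
        .json(input_dir)       # new files in this directory become input
    )
    return (
        events.writeStream
        .format("console")                             # print each micro-batch
        .outputMode("append")                          # emit only newly added rows
        .option("checkpointLocation", checkpoint_dir)  # where query progress is stored
        .trigger(processingTime="10 seconds")          # micro-batch interval
        .start()
    )
```

With PySpark installed, the function could be called as `start_file_to_console(SparkSession.builder.getOrCreate(), "/tmp/in", "/tmp/chk")`, followed by `awaitTermination()` on the returned query; both paths are placeholders.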
This module focuses on integrating Spark with real-world streaming systems. Learners will consume data from Kafka, parse and transform messages, and write results to sinks such as Delta Lake, using checkpointing for reliable recovery and triggers to control when each micro-batch runs.
What's included
3 videos, 2 readings, 1 peer review
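The Kafka-to-Delta flow described in this module might look roughly like the sketch below. It is not taken from the course: the function name, the message schema, the 10-minute watermark, and the 1-minute trigger are assumptions chosen for illustration. It also assumes the Spark Kafka connector and Delta Lake packages are available on the cluster, and that `spark` is an existing `SparkSession`.

```python
# Hypothetical message layout; real topics will differ.
KAFKA_EVENT_SCHEMA = (
    "event_id STRING, user_id STRING, amount DOUBLE, event_time TIMESTAMP"
)


def start_kafka_to_delta(spark, bootstrap_servers, topic, table_path, checkpoint_dir):
    """Read JSON messages from Kafka, deduplicate them, and append to Delta."""
    raw = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
    )
    # Kafka delivers the payload as bytes in the `value` column.
    parsed = (
        raw.selectExpr("CAST(value AS STRING) AS json")
        .selectExpr(f"from_json(json, '{KAFKA_EVENT_SCHEMA}') AS event")
        .select("event.*")
    )
    # The watermark bounds how long deduplication state is kept:
    # events more than 10 minutes behind the latest event time may be dropped.
    deduplicated = (
        parsed.withWatermark("event_time", "10 minutes")
        .dropDuplicates(["event_id", "event_time"])
    )
    return (
        deduplicated.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_dir)  # enables recovery after restart
        .trigger(processingTime="1 minute")
        .start(table_path)
    )
```

Deduplicating on both the event identifier and the event-time column is one common way to combine a watermark with Kafka's at-least-once delivery; a windowed aggregation would be another.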
Learners design an end-to-end streaming pipeline that combines ingestion, transformation, enrichment with static datasets, and reliable output.
What's included
4 videos, 3 readings, 1 assignment, 1 peer review
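Enrichment with a static dataset is typically expressed as a join between a streaming DataFrame and a batch DataFrame. The sketch below shows one possible shape; the function name, the `user_id` join key, and the Parquet reference table are hypothetical, and `spark` is assumed to be an existing `SparkSession`.

```python
def enrich_with_customers(spark, events, customers_path):
    """Left-join a streaming DataFrame against a static reference table.

    `events` is assumed to be a streaming DataFrame with a `user_id` column.
    """
    # spark.read (not readStream) produces a static, batch DataFrame.
    customers = spark.read.parquet(customers_path)
    # A left join keeps every streaming event, even without a matching customer.
    return events.join(customers, on="user_id", how="left")
```

The result is still a streaming DataFrame, so it can be passed on to a `writeStream` sink with its own checkpoint location.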
¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.





