What is pipeline orchestration and recovery in this course?

It means designing a real-time data pipeline as a coordinated workflow that can schedule work, manage dependencies, and recover cleanly when something fails. The course focuses on making pipelines reliable over time, not just getting a script or job to run once.

When would you use this kind of workflow orchestration?

You would use it when a pipeline needs to run repeatedly, stay observable, and keep data moving even when tasks fail, records are bad, or a dependency becomes unstable. In this course, it is used for real-time and batch-adjacent workflows that need safe retries, replays, and recovery paths.

How does orchestration and recovery fit into a broader workflow?

It sits between writing the logic for individual pipeline steps and running the whole system reliably over time. In this course, that layer turns separate tasks into a repeatable process you can schedule, monitor, backfill, and restore.

How is an orchestrated, recoverable pipeline different from running separate jobs manually?

Manual jobs mainly rely on separate reruns and human judgment, while an orchestrated, recoverable pipeline has defined dependencies, retries, and recovery paths. The course emphasizes coordinated execution and controlled recovery rather than ad hoc fixes after something breaks.
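To make "defined dependencies" concrete, here is a minimal sketch in plain Python, not taken from the course materials. It uses the standard library's `graphlib` rather than Airflow or Prefect, and the task names are hypothetical. It shows only the core idea an orchestrator automates: deriving a run order from declared dependencies rather than from human memory.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key maps to the set of tasks it depends on.
DEPENDENCIES = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"validate"},
    "load": {"transform"},
    "notify": {"load"},
}


def execution_order(dependencies):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
print(order)
```

A real orchestrator adds scheduling, retries, and state tracking on top of this ordering, and `TopologicalSorter` raises `CycleError` if the declared dependencies form a loop.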

Do you need any prerequisites before learning pipeline orchestration and recovery?

A basic understanding of Python, SQL, the Linux command line, and Kafka fundamentals is helpful before starting this course. Because it is intermediate, it assumes you can follow how tasks, state, and data movement behave in a real pipeline.

What tools, platforms, or methods are used in this course?

The course uses modern workflow orchestrators such as Airflow and Prefect, along with recovery methods like checkpointing and dead-letter queues.

What specific tasks will you practice or complete in this course?

You practice building scheduled workflows with dependencies and retries, and using logs or alerts to investigate failures. You also work on recovery tasks such as restarting from checkpoints, handling bad records safely, and running controlled backfills or failover steps.
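As an illustration of the retry behavior mentioned above, the sketch below retries a failing task with exponential backoff. It is an assumption-laden example in plain Python, not the course's lab code: `run_with_retries` and `FlakyTask` are hypothetical names, and orchestrators such as Airflow and Prefect provide their own retry settings instead.

```python
import time


def run_with_retries(task, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1)  # 1x, 2x, 4x, ...
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.2f}s")
            sleep(delay)


class FlakyTask:
    """Hypothetical task that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("dependency unavailable")
        return "ok"


flaky = FlakyTask(failures=2)
result = run_with_retries(flaky)
print(result, flaky.calls)
```

Retries are only safe when the task can run more than once without side effects piling up, which is why they are usually paired with idempotent writes.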

Orchestrate & Recover Real-Time Data Pipelines

This course is part of the "Real-Time, Real Fast: Kafka & Spark for Data Engineers" Specialization

Instructor: Starweaver

3 modules

Get an overview of a topic and learn the fundamentals.

Intermediate level

4 hours to complete

Flexible schedule

Learn at your own pace

What you'll learn

Build and schedule streaming and batch-adjacent workflows using a modern orchestrator, such as Airflow or Prefect.
Implement reliability patterns like idempotence, checkpointing, DLQs, and backfills for fault-tolerant and exactly-once-ish processing.
Design multi-region recovery strategies (mirroring/replication) and run playbooks to restore pipelines after partial or regional failures.
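Of the reliability patterns listed above, idempotence is the easiest to show in a few lines. The sketch below is illustrative only and assumes each event carries a unique `event_id`; `IdempotentSink` is a hypothetical in-memory stand-in for a table with a primary key. Replaying the same events after a failure leaves the result unchanged.

```python
class IdempotentSink:
    """In-memory stand-in for a table keyed by event_id (hypothetical)."""

    def __init__(self):
        self.rows = {}

    def upsert(self, event):
        # Writing the same event_id again overwrites the row instead of appending.
        self.rows[event["event_id"]] = event


events = [
    {"event_id": "a1", "amount": 10},
    {"event_id": "b2", "amount": 25},
]

sink = IdempotentSink()
for event in events:
    sink.upsert(event)

# Simulate a replay after a failure: the same events arrive a second time.
for event in events:
    sink.upsert(event)

total = sum(row["amount"] for row in sink.rows.values())
print(len(sink.rows), total)
```

With an append-only sink the replay would double the total. Keying writes on a stable identifier is one common way to get "exactly-once-ish" results from at-least-once delivery.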

Skills you'll gain

Workflow Management
Dataflow
Data Infrastructure
Data Integrity
Data Processing
Real Time Data
Disaster Recovery
Data Pipelines
Site Reliability Engineering

Tools you'll learn

Apache Kafka
Apache Spark
Apache Airflow

Details to know

Shareable certificate: add to your LinkedIn profile

Recently updated: January 2026

Assessments: 1 assignment

Taught in English

There are 3 modules in this course

Building a data pipeline is easy. Building one that recovers automatically from failures, maintains data integrity during outages, and runs reliably in production is what separates junior engineers from platform architects.

This course teaches you to design self-healing pipelines with automated recovery, fault tolerance, and disaster recovery built in from day one. You'll learn to build and schedule streaming workflows using modern orchestrators like Airflow and Prefect; implement reliability patterns including idempotence, checkpointing, and dead-letter queues for exactly-once-ish processing; and design multi-region recovery strategies that keep data flowing during regional failures.

Through hands-on labs and real-world examples from Airbnb, LinkedIn, Netflix, and Uber, you'll master the orchestration and recovery techniques that turn fragile scripts into production-grade infrastructure. You'll learn to handle automated retries, run safe backfills, implement checkpoint-based recovery, and execute disaster recovery playbooks that restore pipelines after outages.

This course is for engineers who build or maintain real-time data pipelines and need stronger orchestration, reliability, and recovery skills. Recommended background: basics of Python and SQL, the Linux CLI, and Kafka fundamentals. A cloud account is helpful but optional.

By the end of the course, learners will be able to design, orchestrate, and recover real-time data pipelines that run reliably at production scale.

Module details

Learners set up a modern orchestrator and build a first DAG/flow that runs reliably. We cover scheduling, retries, task dependencies, and lightweight observability. By the end, learners will ship a minimal but production-aware pipeline.

Included

4 videos, 2 readings, 1 peer review

4 videos (total 31 minutes)

Why Orchestration Matters: From Cron to DAGs (3 minutes)
Build Your First DAG (Airflow) (9 minutes)
Flows the Pythonic Way (Prefect) (9 minutes)
Demo: Scheduling, Retries, and Alerting End-to-End (10 minutes)

2 readings (total 10 minutes)

Welcome to the Course: Course Overview (5 minutes)
Choosing an Orchestrator: Airflow vs. Prefect (5 minutes)

1 peer review (total 20 minutes)

Hands-On-Learning: Ship a Minimal Reliable DAG/Flow (20 minutes)

We move from “works on my machine” to “recovers on its own.” Learners add exactly-once-ish processing, checkpointing, schema controls, and dead-letter queues. The module emphasizes designing for replay and safe backfills.
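The combination of checkpointing and a dead-letter queue described in this module can be sketched without any streaming framework. The example below is a simplified illustration, not Spark Structured Streaming or Kafka code: the offsets, the `user_id` field, and the in-memory `checkpoint`, `sink`, and `dead_letters` structures are all assumptions made for the demo.

```python
import json


def process_batch(records, checkpoint, sink, dead_letters):
    """Process records after the checkpointed offset; route bad ones to the DLQ."""
    for offset, raw in enumerate(records):
        if offset <= checkpoint["offset"]:
            continue  # already handled before the restart
        try:
            event = json.loads(raw)
            sink.append(event["user_id"])
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            # Keep the bad record and the reason so it can be triaged later.
            dead_letters.append(
                {"offset": offset, "raw": raw, "error": type(exc).__name__}
            )
        checkpoint["offset"] = offset  # advance only after the record is handled
    return checkpoint


records = ['{"user_id": 1}', "not json", '{"name": "x"}', '{"user_id": 2}']
checkpoint = {"offset": -1}  # -1 means nothing has been processed yet
sink, dead_letters = [], []

process_batch(records, checkpoint, sink, dead_letters)
# A restart replays the same input; the checkpoint makes the second run a no-op.
process_batch(records, checkpoint, sink, dead_letters)

print(sink, [d["error"] for d in dead_letters], checkpoint)
```

In this sketch the sink write and the checkpoint update are not atomic, so a crash between them would replay one record. That gap is why checkpoints are usually combined with idempotent writes.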

Included

3 videos, 1 reading, 1 peer review

3 videos (total 32 minutes)

Exactly-Once with Kafka: What You Really Get (14 minutes)
Checkpointing & State: Replaying Without Duplicates (8 minutes)
DLQs in Practice: From Error Handling to Triaging (10 minutes)

1 reading (total 5 minutes)

Checkpoints & WAL in Structured Streaming (5 minutes)

1 peer review (total 20 minutes)

Hands-On-Learning: Make a Stream Bulletproof: Checkpoints, DLQ, Idempotence (20 minutes)

Learners design for failure domains—task, job, cluster, and region. We cover backfills vs. reprocessing, Delta time travel for safe fixes, and Kafka replication patterns (MirrorMaker 2, uReplicator) for DR.
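A targeted backfill of the kind mentioned above can be sketched as recomputing a bounded range of daily partitions and overwriting each one. This is an illustrative example under stated assumptions: the daily partitioning, the `store` dictionary, and the `recompute` function are hypothetical, and it does not use Delta tables or any orchestrator's backfill command.

```python
from datetime import date, timedelta


def partitions_between(start, end):
    """Yield each daily partition from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def backfill(start, end, run_partition, store):
    """Recompute each partition in the range and overwrite its previous output."""
    for day in partitions_between(start, end):
        store[day.isoformat()] = run_partition(day)
    return store


# Hypothetical existing output: five daily partitions, two of them known bad.
store = {
    day.isoformat(): "original"
    for day in partitions_between(date(2026, 1, 1), date(2026, 1, 5))
}


def recompute(day):
    """Stand-in for the corrected pipeline logic."""
    return f"recomputed-{day.isoformat()}"


backfill(date(2026, 1, 2), date(2026, 1, 3), recompute, store)
print(store)
```

Because each partition is overwritten rather than appended to, running the same backfill twice gives the same result, and partitions outside the range are left untouched.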

Included

4 videos, 2 readings, 1 assignment, 2 peer reviews

4 videos (total 34 minutes)

Backfills & Reprocessing Without Breaking SLAs (10 minutes)
Time Travel & Audits with Delta Tables (8 minutes)
Cross-Region Kafka Replication (MM2/uReplicator) (11 minutes)
Your Recovery Posture, Summarized (4 minutes)

2 readings (total 10 minutes)

Choosing a Replication Strategy: MM2 vs. uReplicator (5 minutes)
Additional Resource (5 minutes)

1 assignment (total 20 minutes)

Orchestrate & Recover Real-Time Data Pipelines (20 minutes)

2 peer reviews (total 80 minutes)

Hands-On-Learning: DR Fire Drill: Cross-Region Failover & Targeted Backfill (20 minutes)
Project: Orchestrate & Recover a Real-Time Pipeline (60 minutes)

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.