This course covers the core of data engineering: building robust data pipelines, designing ETL (Extract, Transform, Load) processes, and handling large datasets with Hadoop. You will learn to extract data from various sources, transform it into a usable format, and load it into data warehouses or big data platforms. Through hands-on work with Hadoop, a widely used framework for distributed storage and processing, you'll learn to manage and process massive datasets efficiently. Whether you're a beginner or an experienced professional, this course equips you with the skills to design, implement, and manage data pipelines in a data-focused organization.



Data Engineering: Pipelines, ETL, Hadoop
This course is part of the Building Smarter Data Pipelines: SQL, Spark, Kafka & GenAI Specialization


Instructor: Soheil Haddadi
What you'll learn
- Analyze the architecture and components of data pipelines to understand their impact on data flow and processing efficiency.
- Implement robust ETL processes designed for scalability and maintainability.
- Analyze big data challenges and use Hadoop ecosystem tools (HDFS, MapReduce, Hive, Pig, and Spark) for data processing tasks.
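
To make the extract, transform, and load stages named above concrete, here is a minimal sketch in Python using only the standard library. It is not taken from the course materials: the sample CSV, the `orders` table, and the function names `extract`, `transform`, `load`, and `run_pipeline` are all hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical sample input standing in for a real source file or API response.
RAW_CSV = """order_id,customer,amount
1, alice ,19.99
2,BOB,5.00
3,carol,not_a_number
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize customer names, cast types, skip unparseable rows."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount is not numeric
        customer = row["customer"].strip().lower()
        clean.append((int(row["order_id"]), customer, amount))
    return clean


def load(records, conn):
    """Load: write records into the (assumed) orders table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    # INSERT OR REPLACE keeps the load idempotent when the pipeline is re-run.
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Run the three stages in order and return the number of rows loaded."""
    records = transform(extract(text))
    load(records, conn)
    return len(records)
```

Calling `run_pipeline(RAW_CSV, sqlite3.connect(":memory:"))` loads the two valid rows and skips the malformed one. Keeping each stage as a separate function is one common way to make a pipeline easier to test and maintain; production pipelines would typically add logging, schema validation, and a real warehouse target.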


There is 1 module in this course
What's included
12 videos, 4 readings, 4 assignments, 1 discussion prompt, 3 plugins
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.




¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.


