Learner Reviews & Feedback for ETL and Data Pipelines with Shell, Airflow and Kafka by IBM

4.5 stars (432 ratings)

About the Course

Delve into the two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The contrasting approach is the Extract, Load, and Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting/calling application.

In this course, you will learn about the different tools and techniques that are used with ETL and data pipelines. Both ETL and ELT extract data from source systems, move the data through the data pipeline, and store the data in destination systems. During this course, you will experience how ELT and ETL processing differ and identify use cases for both.

You will identify methods and tools used for extracting the data, merging extracted data either logically or physically, and loading data into data repositories. You will also define transformations to apply to source data to make the data credible, contextual, and accessible to data users. You will be able to outline some of the multiple methods for loading data into the destination system, verifying data quality, monitoring load failures, and using recovery mechanisms in case of failure.

By the end of this course, you will also know how to use Apache Airflow to build data pipelines, as well as the advantages of using this approach. You will also learn how to use Apache Kafka to build streaming pipelines, as well as the core components of Kafka, which include brokers, topics, partitions, replications, producers, and consumers. Finally, you will complete a shareable final project that enables you to demonstrate the skills you acquired in each module....
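The ETL/ELT contrast in the description above can be made concrete with a small sketch. This is not taken from the course materials: the sample data, the table names (`tolls_clean`, `tolls_raw`), and the function names are all hypothetical, and an in-memory SQLite database stands in for both the warehouse and the data lake. The only point is where the transformation happens: before loading (ETL) or at query time (ELT).

```python
import csv
import io
import sqlite3

# Hypothetical raw input: stray whitespace and one missing amount.
RAW_CSV = "vehicle_id,toll_paid\nA1, 2.50 \nB2,4.00\nC3,\n"


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def etl(conn, text):
    """ETL: clean rows inside the pipeline, then load only the cleaned data."""
    conn.execute("CREATE TABLE tolls_clean (vehicle_id TEXT, toll_paid REAL)")
    cleaned = [
        (row["vehicle_id"], float(row["toll_paid"]))
        for row in extract(text)
        if row["toll_paid"].strip()  # drop rows with a missing amount
    ]
    conn.executemany("INSERT INTO tolls_clean VALUES (?, ?)", cleaned)
    return conn.execute("SELECT SUM(toll_paid) FROM tolls_clean").fetchone()[0]


def elt(conn, text):
    """ELT: load the raw strings untouched, transform on demand in the query."""
    conn.execute("CREATE TABLE tolls_raw (vehicle_id TEXT, toll_paid TEXT)")
    rows = [(row["vehicle_id"], row["toll_paid"]) for row in extract(text)]
    conn.executemany("INSERT INTO tolls_raw VALUES (?, ?)", rows)
    query = """
        SELECT SUM(CAST(TRIM(toll_paid) AS REAL))
        FROM tolls_raw
        WHERE TRIM(toll_paid) <> ''
    """
    return conn.execute(query).fetchone()[0]
```

Calling `etl(sqlite3.connect(":memory:"), RAW_CSV)` and `elt(sqlite3.connect(":memory:"), RAW_CSV)` yields the same total, but the ETL store holds only the two cleaned rows while the ELT store keeps all three raw rows for later reinterpretation.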

Top reviews

JJ

Jul 22, 2023

Labs in this course are very helpful and to the point. It took me a while to complete this course, but I learned a lot.

SK

Jan 20, 2025

Relevant information in the recordings, a good recap of every video, and a hands-on lesson at the end to cement the knowledge.

1 - 25 of 93 Reviews for ETL and Data Pipelines with Shell, Airflow and Kafka

By Chris B

•

Apr 20, 2022

Course content is good, but the labs are riddled with bugs and in dire need of quality control. I encountered many time-consuming, frustrating technical issues that made completing this course a slog. The final assignment introduces some difficult Linux manipulations that were not covered in the course and are not really that relevant to the subject matter. Some questions on the final are unclear and could be better written. I would recommend that the instructors, or whoever created this course, eat their own cooking: go through this course and fix the various issues.

By Dmitry K

•

Sep 17, 2021

Buggy practice labs. Not possible to complete without fixing the Airflow start script yourself. Nobody monitors or fixes issues here.

By Benjamin A A

•

Aug 20, 2022

I cannot proceed with the "SUBMIT a DAG" lab as I am constantly being shown the error - "cp: cannot create regular file '/home/project/airflow/dags/my_first_dag.py': Permission denied" when I run the command - "cp my_first_dag.py $AIRFLOW_HOME/dags".

How are you expecting me to complete this lab when I am getting a permission denied error? Please fix this ASAP.

By Tal M

•

Jul 17, 2022

The course is really basic; it only introduces the keywords and very high-level concepts of ETL. It barely discusses any technical challenges or constraints. Some of the questions in the quizzes are absurd.

By Nataliya S

•

Oct 12, 2021

Thanks to IBM and Coursera for the great "ETL and Data Pipelines with Shell, Airflow and Kafka" course, which I passed with Grade Achieved: 100%. It's the third course that I've passed as part of the "IBM Data Engineering Specialization". I was so carried away by the course that I literally sat up until 2 am almost every day. In this course I could apply my knowledge of Python, Pandas, SQL, and Bash commands to build ETL batch and stream pipelines.

By RLee

•

Jan 13, 2022

The final project, which connects Airflow as a pipeline management tool to a Kafka server, is a very useful hands-on project. More detail or explanation on the syntax of the Python code calling the Kafka producer and consumer, found in the files toll_traffic_generator.py and streaming_data_reader.py, would be more valuable than just providing these two files to run on their own.

By Evgeny D

•

Sep 29, 2021

It's one of the most challenging courses I've been enrolled in!

By Santiago Z A

•

Sep 15, 2022

REALLY A GOOD COURSE BUT:

- Labs are not debugged (inaccuracies)

- I understand that Kafka is a broad technology and that it may take more than a week to cover in an appropriate way, but the labs were only about copying and pasting commands.

By bengisu p

•

Aug 17, 2023

I can't understand some of the questions in quizzes. Moreover, the peer-to-peer grading system should be converted to automatic grading.

By Ilya K

•

Jan 13, 2022

Perfect environment for experimenting! Very easy and powerful to use.

By Omar H

•

Jan 26, 2022

It's a great introduction to Airflow and Kafka, but it is still an introduction: it is shallow and doesn't offer much. By the end, though, you will understand what you need to continue further in both technologies.

By YANGYANG C

•

Jan 17, 2022

Love the labs, but do not like the robotic lectures.

By BO W

•

Jul 8, 2022

The final quiz sucks!

Why are you so sick as to make up this quiz?

This quiz is much more like a GMAT reading test than an IT assessment!

By Harald M

•

Sep 29, 2024

This is a well-crafted course about ETL and building/streaming data pipelines. The hands-on labs, including practice with shell scripting, prepared me well for the final assignment. Writing the real-world-scenario DAG tasks to create the ETL data pipelines using Apache Airflow was challenging. Successfully submitting the DAG and monitoring it in the UI DAGs list was, at the same time, satisfying.

By Natale F

•

Dec 15, 2021

Interesting course with enough labs.

By Hugo A O O

•

Dec 6, 2021

I really liked the labs.

By Chris W

•

Apr 3, 2022

A decent overview of Airflow and Kafka. Worth it for the time invested. The labs were good, however the execution of the final assignment was poor -- you have to submit two dozen screen captures for a peer reviewed assignment. Taking screen caps of code is silly, why not just submit the code? Plus you are taking the caps before you even know if your code works. And you are relying on strangers to read and understand your code before you can get credit for the course. Fortunately, some kind soul found mine quickly and gave me 100%. My code did work -- I tested it thoroughly -- but you can't really tell from screen caps.

By Sina S S

•

May 7, 2022

A good introductory course on Airflow and Kafka. It could have been broken up into at least two courses, each focusing on one of these platforms and going more in depth. Also, the final assignment is a pain to complete, especially due to some errors in the instructions. But overall, it is a decent course.

By Warwick S

•

Oct 13, 2023

A good overview and introduction to using Airflow and Kafka. The quizzes are lazily written and ask specific rather than generalisable knowledge questions. The final assignment for Airflow was great - lots of coding and debugging. Kafka not so much - just paste commands and watch it run.

By Katarzyna G

•

Mar 26, 2022

It would be much better with real instructors and without peer review, which is not objective and gives no proper clue to the answers.

By Mimi Z

•

Oct 28, 2022

The course material was basic, so make sure to do a lot of your own additional learning outside of the coursework. The discussion staff are not helpful and don't understand or even read your questions before replying. The labs don't always work, and the instructions don't always line up with current software upgrades. Just be prepared to do a lot of troubleshooting with not much help. I wish the course told you what to do when certain errors occur and were more thorough with its instructions.

By Roberta B

•

Apr 3, 2022

OK, a very good course, but the exam focused on a very difficult part made up of Linux shell commands, especially for dealing with files that are not CSV. That was not actually the main focus of the course.....

By Aleksandra

•

Dec 10, 2023

The labs in the module lacked proper planning. Connecting to servers consumed excessive time, and errors meant reconnecting, often without success. The instructions provided by instructors were vague, suggesting solutions like 'try using other networks,' which prolonged the process. Sadly, this meant spending a month solely on server connections instead of delving into the ETL process. There was a significant amount of time wasted. Moreover, the lectures could benefit from a more contemporary approach beyond a mere slideshow. Additionally, the lecturer's voice was somewhat grating; it felt almost artificial, prompting the question of whether it was actually a human reading the slides. It might be worth considering this aspect for future presentations.

By Steven W

•

Jul 19, 2023

I feel as though the final project suffered from issues with permissions, and there was a lack of a standard setup. Where should DAG scripts go? Why should they be in a folder with admin-only permissions? Submitting screenshots is tedious and (frankly) shows a lack of willingness on the part of the course designers to use tools like nbgrader/Jupyter notebooks or other automated grading solutions.

Warning, if you can write a "Hello World" program in any language, you probably want to skip this course/certification.

By Trevor F K

•

Feb 16, 2024

As with all these IBM courses, this one is super boring. Robot voice talking over PowerPoints, as usual. This one stuck out as especially bad because the online lab environment is very unreliable. So much time was wasted waiting for Airflow to fail to start. Extremely frustrating!