UB
A great course for understanding the many options Google Cloud offers for moving on-premises Hadoop workloads to Google Cloud Platform and taking advantage of cluster scalability.
In this intermediate course, you will learn to design, build, and optimize robust batch data pipelines on Google Cloud. Moving beyond fundamental data handling, you will explore large-scale data transformations and efficient workflow orchestration, essential for timely business intelligence and critical reporting.
Get hands-on practice using Dataflow for Apache Beam and Serverless for Apache Spark (Dataproc Serverless) for implementation, and tackle crucial considerations for data quality, monitoring, and alerting to ensure pipeline reliability and operational excellence. A basic knowledge of data warehousing, ETL/ELT, SQL, Python, and Google Cloud concepts is recommended.
MP
Great course teaching how to build batch pipelines through GCP technologies, and showing cool tools for data wrangling and analysis
SP
There were too many labs with services that take 30-40 minutes just to spin up. I wouldn't have a problem with all the labs if the services took 2-5 minutes to spin up.
IY
This course includes new services not much mentioned in the previous course. But the modules are not balanced in proportion.
SV
Excellent course with appropriate explanations of Cloud Data Fusion, Cloud Composer, Dataproc, and Dataflow. A must-learn course for all aspiring Big Data Engineers.
RR
More teaching should be focused on how to build the Python file for each task, rather than providing it ready for us to run.
ER
Some parts of the course were not explained in full detail, especially some Qwiklabs where questions were not tested or even provided with answers.
HS
The pipeline-building portion assumes in part that the learner has previous experience with programming. Further breakdown of the Python pipeline builds would be helpful.
AS
Informative on various features. But Cloud Data Fusion and Dataflow are not explained very clearly or in detail; I was expecting more on this. I want to learn more on the pipeline topic, please.
AS
The practical hands-on labs can be further improved by making them more interactive rather than just copy-pasting the code. They can be redesigned to have more room for experimentation.
AD
Great course for learning the big advantages of using GCP for data, given its large-scale implementations and better performance than what is available today in on-premises scenarios.
EL
This course really teaches more in depth about data engineering than about the cloud or any other products offered by GCP, which is the most important part.
Google did not work very hard to convey this information in its lectures. They are just bullet slides with a talking head. They need to learn how better course developers are creating courses.
There were some minor problems and mistakes in the lab files. The Python/Java scripts were not explained at all. There are questions about the code itself, but then the questions were not answered.
This is a course on how to buy Google products, that's it. They explain the very basics of batch data pipelines, but it's so intertwined with their own products that you barely learn anything about data pipelines themselves. On top of that, the exercises consist of pasting things they don't even care to explain.
Quite hard to follow
The last lab has an error, so this course can never be finished and the certificate can never be obtained. Furthermore, some teachers use unintelligible slang, and the transcripts are unreadable due to misspellings and transcription errors. The screenshots in the videos are so blurred that tears spring to your eyes when you try to read them.
The last few videos in Week 2 have too much Python; the only relief is that we have templates to save time. I had to spend more time understanding the Python scripts used in the last 4-5 videos. I have worked with Python only for wrangling data using pandas; only grep.py and grepc.py were easy, and all the others were too deep for me to understand. Maybe the course should reiterate the importance of Python programming. What if I want to build something custom? Templates may not help. So please stress the importance of Python/Java skills in the course.
Course was great. Easy to understand and has many labs to try. One issue was that the first Dataflow lab was not working due to an Apache Beam issue. I worked with a rep and he said he would follow up with me after resolving the issue. But he never contacted me again. Probably the issue was not resolved. So I never completed that lab.
I'm not from a programming (Java/.NET/Python etc.) background but from Informatica ETL, Oracle PL/SQL and UNIX. I wish the code in Serverless Data Analysis with Dataflow: Side Inputs (Python) were explained step by step, starting from where it begins. If that sounds redundant, a document reference for each step would help.
Great addition to the Data Engineering lineup. In addition to the updated content on newer tools like Cloud Composer and Data Fusion, I feel like the detail on tools like Dataproc and Dataflow is better than it used to be.
The questions given in the quiz are pretty simple. The course needs questions that make learners brainstorm.
Informative and interactive course introducing Dataproc, Dataflow, Apache Beam, and Apache Airflow.
The Labs need to be tested
It's obvious some of the narrators don't even understand the subjects they are talking about because they can't pronounce the words correctly. Come on. Let's at least read the scripts a few times before getting in front of the camera. In this whole course so far there are only two narrators who obviously have played around with this technology. The rest just read the scripts, which is so boring to listen to. I can read the scripts myself. Just smiling doesn't cut it. You need to understand what you are talking about so you accent things properly and sound like an expert.
Most of the labs need to be reviewed to make sure the instructions are still correct. I spent a substantial part of my time in the labs finding workarounds because of poor instructions.
A lot to digest in this course... the pace is too fast in my opinion; it should be slower. Good course anyway.
thanks
The Dataflow labs should be explained in greater detail so as to provide comprehensive learning.
Some labs could be better in terms of explaining the rationale, error handling, etc.