Google Cloud

Build Batch Data Pipelines on Google Cloud

In this intermediate course, you will learn to design, build, and optimize robust batch data pipelines on Google Cloud. Moving beyond fundamental data handling, you will explore large-scale data transformations and efficient workflow orchestration, essential for timely business intelligence and critical reporting. Get hands-on practice using Dataflow for Apache Beam and Serverless for Apache Spark (Dataproc Serverless) for implementation, and tackle crucial considerations for data quality, monitoring, and alerting to ensure pipeline reliability and operational excellence. A basic knowledge of data warehousing, ETL/ELT, SQL, Python, and Google Cloud concepts is recommended.

Data Validation
Serverless Computing
Intermediate · Course · 11 hours

Featured reviews

UB

5.0
Reviewed May 27, 2020

A great course for understanding the various wonderful options Google Cloud offers for moving on-premises Hadoop workloads to Google Cloud Platform and leveraging the scalability of clusters.

MP

5.0
Reviewed May 19, 2020

Great course teaching how to build batch pipelines with GCP technologies and showing cool tools for data wrangling and analysis.

IY

4.0
Reviewed May 13, 2020

This course covers new services that were barely mentioned in the previous course, but the modules are not well balanced in proportion.

SV

5.0
Reviewed Jun 18, 2020

Excellent course with appropriate explanations of Cloud Data Fusion, Cloud Composer, Dataproc, and Dataflow. A must-learn course for all aspiring Big Data Engineers.

DR

4.0
Reviewed May 19, 2020

Takes time to understand. The videos are a little boring, but the practice is enjoyable. Please try to mention the required execution time or the waiting time for tasks to execute.

RR

4.0
Reviewed Feb 11, 2020

More teaching should focus on how to build the Python file for each task, rather than providing files that are ready for us to run.

AS

4.0
Reviewed Apr 19, 2020

Good introduction to building pipelines in GCP. The starting labs need more detail. Other than that, a very good course.

NG

5.0
Reviewed Jun 13, 2020

Good course covering Dataproc, Dataflow, Dataprep, and the labs, of course. A great way to get introduced to batch data pipelines in GCP.

CU

4.0
Reviewed Jul 18, 2020

Good. I think the pipelines need more labs related to industry needs, such as connecting them to external sources outside GCP.

YC

5.0
Reviewed Dec 1, 2020

It would be great if there were a walkthrough for the lab sessions to check whether the answers for interpreting the code are correct.

AS

5.0
Reviewed May 19, 2020

Informative on various features, but Cloud Data Fusion and Dataflow are not explained clearly or in detail; I was expecting more on these. I would like to learn more about the pipeline topic, please.

HS

4.0
Reviewed Sep 12, 2020

The pipeline building portion assumes in part that the learner has previous programming experience. A further breakdown of the Python pipeline builds would be helpful.

All reviews

Showing: 20 of 212

RLee
4.0
Reviewed Feb 12, 2020
Roger Smith, PhD, MBA
2.0
Reviewed Jan 24, 2020
Polla Tamás-Marosi
4.0
Reviewed Feb 2, 2020
Sergio Gutierrez
1.0
Reviewed Mar 31, 2022
Mahmoud Masmoudi
3.0
Reviewed Apr 5, 2020
Jaap Koning
1.0
Reviewed Apr 21, 2021
Xavier Anandaraj A
4.0
Reviewed May 6, 2020
Jeanmann Park
5.0
Reviewed Apr 26, 2020
Prashanth Talla
5.0
Reviewed Apr 24, 2020
James E White
5.0
Reviewed Apr 1, 2020
Rahul JYOTI Saha
5.0
Reviewed Apr 17, 2020
Léo Zawislak
5.0
Reviewed Feb 27, 2020
Arpnik Singh
4.0
Reviewed Apr 23, 2023
Scott Poulin
4.0
Reviewed Feb 6, 2020
Adolfo Camacho Yague
4.0
Reviewed May 2, 2020
Thomas Morris
3.0
Reviewed Jan 6, 2021
Franz Rinkleff
3.0
Reviewed Oct 5, 2022
Johnny Chaves
3.0
Reviewed May 3, 2020
Divyangana Pandey
3.0
Reviewed Apr 12, 2020
Hendra
3.0
Reviewed Apr 10, 2020