Data Engineering, Big Data, and Machine Learning on GCP
Sponsored by Google
About this Specialization
This online specialization gives participants a hands-on introduction to designing and building data pipelines on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, participants will learn how to design data processing systems, build end-to-end data pipelines, analyze data, and derive insights. The course covers structured, unstructured, and streaming data.
This course teaches the following skills:
• Design and build data pipelines on Google Cloud Platform
• Lift and shift your existing Hadoop workloads to the Cloud using Cloud Dataproc
• Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
• Manage your data pipelines with Cloud Data Fusion and Cloud Composer
• Derive business insights from extremely large datasets using Google BigQuery
• Use pre-built ML APIs on unstructured data and build different kinds of ML models using BigQuery ML
• Enable instant insights from streaming data
This class is intended for developers who are responsible for:
• Extracting, loading, transforming, cleaning, and validating data
• Designing pipelines and architectures for data processing
• Integrating analytics and machine learning capabilities into data pipelines
• Querying datasets, visualizing query results and creating reports
>>> By enrolling in this specialization you agree to the Qwiklabs Terms of Service as set out in the FAQ and located at: https://qwiklabs.com/terms_of_service <<<
This 2-week accelerated on-demand course introduces participants to the Big Data and Machine Learning capabilities of Google Cloud Platform (GCP). It provides a quick overview of Google Cloud Platform and a deeper dive into its data processing capabilities.
At the end of this course, participants will be able to:
• Identify the purpose and value of the key Big Data and Machine Learning products in the Google Cloud Platform
• Use Cloud SQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform
• Employ BigQuery and Cloud Datalab to carry out interactive data analysis
• Choose between Cloud SQL, Cloud Bigtable, and Cloud Datastore
• Train and use a neural network using TensorFlow
• Choose between different data processing products on the Google Cloud Platform
Before enrolling in this course, participants should have roughly one (1) year of experience with one or more of the following:
• A common query language such as SQL
• Extract, transform, load activities
• Data modeling
• Machine learning and/or statistics
• Programming in Python
Google Account Notes:
• Google services are currently unavailable in China....
The two key components of any data pipeline are data lakes and data warehouses. This course highlights use cases for each type of storage and examines the data lake and warehouse solutions available on Google Cloud Platform in technical detail. It also describes the role of a data engineer, explains how a successful data pipeline benefits business operations, and examines why data engineering should be done in a cloud environment. Learners will get hands-on experience with data lakes and warehouses on Google Cloud Platform using Qwiklabs....
Data pipelines typically fall under one of the Extract-Load, Extract-Load-Transform, or Extract-Transform-Load paradigms. This course describes which paradigm should be used, and when, for batch data. It also covers several Google Cloud Platform technologies for data transformation, including BigQuery, Spark on Cloud Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Cloud Dataflow. Learners will get hands-on experience building data pipeline components on Google Cloud Platform using Qwiklabs....
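As a rough illustration of how the transform step moves between paradigms, the sketch below runs the same cleanup once as Extract-Transform-Load and once as Extract-Load-Transform. It is not taken from the course material: an in-memory SQLite database stands in for a data warehouse, and the table names, sample rows, and helper functions are hypothetical.

```python
import sqlite3

# Hypothetical raw extract: untrimmed names and amounts stored as text.
RAW_ROWS = [("  Alice ", "10"), ("BOB", "25"), ("carol", "7")]


def clean(name, amount):
    """Normalize one raw record in application code."""
    return name.strip().lower(), int(amount)


def etl(conn, rows):
    """Extract-Transform-Load: clean in the pipeline, then load clean rows."""
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales_etl VALUES (?, ?)",
        [clean(name, amount) for name, amount in rows],
    )


def elt(conn, rows):
    """Extract-Load-Transform: load raw rows, then clean with SQL in the store."""
    conn.execute("CREATE TABLE sales_raw (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT lower(trim(name)) AS name, CAST(amount AS INTEGER) AS amount "
        "FROM sales_raw"
    )


def run_both():
    """Run both paradigms against one stand-in store and return their results."""
    conn = sqlite3.connect(":memory:")
    etl(conn, RAW_ROWS)
    elt(conn, RAW_ROWS)
    query = "SELECT name, amount FROM {} ORDER BY name"
    etl_rows = conn.execute(query.format("sales_etl")).fetchall()
    elt_rows = conn.execute(query.format("sales_elt")).fetchall()
    conn.close()
    return etl_rows, elt_rows


if __name__ == "__main__":
    etl_rows, elt_rows = run_both()
    print(etl_rows == elt_rows)
```

Both paths end with the same cleaned table. The difference is where the work happens: in pipeline code before loading, or inside the storage engine after loading. Extract-Load would simply stop after the raw insert.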
*Note: this is a new course with updated content from what you may have seen in the previous version of this Specialization.
Processing streaming data is becoming increasingly popular because streaming enables businesses to get real-time metrics on their operations. This course covers how to build streaming data pipelines on Google Cloud Platform. It describes Cloud Pub/Sub for handling incoming streaming data, and it covers how to apply aggregations and transformations to streaming data using Cloud Dataflow and how to store processed records in BigQuery or Cloud Bigtable for analysis. Learners will get hands-on experience building streaming data pipeline components on Google Cloud Platform using Qwiklabs....
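Aggregating streaming data usually means grouping events into time windows before summarizing them. The following is a minimal plain-Python sketch of fixed-window sums, not Cloud Dataflow code. The function name, the event format (integer event-time seconds paired with a value), and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def fixed_window_sums(events, window_seconds):
    """Sum event values per fixed, non-overlapping event-time window.

    `events` is an iterable of (timestamp_seconds, value) pairs and may
    arrive out of order; each event is assigned to the window that
    contains its own timestamp.
    """
    sums = defaultdict(int)
    for timestamp, value in events:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
    # Return windows in chronological order, keyed by window start.
    return dict(sorted(sums.items()))


if __name__ == "__main__":
    # The last event arrives late but still lands in the first window.
    sample = [(0, 1), (30, 2), (61, 5), (125, 3), (59, 4)]
    print(fixed_window_sums(sample, 60))  # {0: 7, 60: 5, 120: 3}
```

A real streaming engine also has to decide when a window is complete and how long to wait for late events. This sketch sidesteps that by processing a finite list.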