This course provides a comprehensive guide to deploying, managing, and optimizing AI and high-performance computing (HPC) workloads on Google Cloud. Through a series of lessons and practical demonstrations, you’ll explore diverse deployment strategies, ranging from highly customizable environments using Google Compute Engine (GCE) to managed solutions like Google Kubernetes Engine (GKE). Specifically, you’ll learn how to create GPU-accelerated clusters and deploy AI inference workloads on GKE.

AI Infrastructure: Deployment Types

Instructor: Google Cloud Training
What you'll learn
Describe the process of creating a GPU-accelerated cluster.
Identify how to provision a GPU-accelerated cluster on GCE.
Identify how to provision a GPU-accelerated cluster on GKE.
Identify how to deploy AI inference workloads on GKE.
Skills you'll gain
- System Configuration
- Cloud Deployment
- Performance Tuning
- Google Cloud Platform
- Model Deployment
- Cloud Infrastructure
- Containerization
- Infrastructure as a Service (IaaS)
- Network Planning and Design
- Distributed Computing
- Network Performance Management
- Kubernetes
- Cloud Engineering
- Application Deployment
- AI Orchestration
Details to know

4 assignments
December 2025
There are 6 modules in this course
This module offers an overview of the course and outlines the learning objectives.
What's included
1 plugin
This module details the AI Hypercomputer cluster creation process. It covers the key decisions required, including choosing a machine type, consumption option, deployment option, orchestrator, and cluster image.
What's included
1 assignment, 6 plugins
This module identifies key configuration options and optimization techniques for deploying an AI Hypercomputer cluster on Google Compute Engine (GCE). It covers selecting machine types, accelerator OS images, deployment options, and strategies for optimizing network performance.
What's included
1 assignment, 4 plugins
This module identifies configuration options for deploying an AI Hypercomputer cluster on Google Kubernetes Engine (GKE). It covers containerization, GKE modes of operation, networking configurations, and workload optimization techniques like distributed training and GPU sharing.
What's included
1 assignment, 4 plugins
This module examines optimization techniques for architecting an inference workload on GKE. It covers the GKE inference workflow and key optimizations at both the infrastructure level and the model level.
What's included
1 assignment, 4 plugins
This module provides links to the student PDFs for all modules.
What's included
1 reading