AI inference is the process of using a trained machine learning model to make predictions on new, unseen data by applying learned patterns. This course is designed for developers, data scientists, and ML engineers who want to quickly deploy AI inference services on Cloud Run. It is aimed at those who are familiar with cloud-based serverless application deployment but may not have experience running AI inference on Google Cloud serverless products. The course includes examples that deploy a model for AI inference with GPUs and integrate gen AI apps with data storage services.

Deploy and Scale AI Models with Cloud Run

This course is part of the Build and Modernize Applications With Generative AI Specialization.

Instructor: Google Cloud Training
What you'll learn
Use Cloud Run GPUs for AI inference.
Deploy lightweight language models on Cloud Run for AI inference.
Optimize model deployments on Cloud Run for performance and cost efficiency.
Integrate Cloud Run AI inference services with database services on Google Cloud.
Details to know

2 assignments

There are 2 modules in this course
An introduction to Cloud Run and its capabilities.
What's included
1 assignment, 2 plugins
Deploy gen AI apps to Cloud Run for AI inference using machine learning models.
What's included
1 assignment, 5 plugins