When you enroll in this course, you'll also be enrolled in this Specialization.
Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate
There are 4 modules in this course
Learn to orchestrate AI systems across local and cloud environments through hands-on infrastructure setup, model deployment, and workflow integration. You will build a prompt engineering pyramid from basic prompts to chain-of-thought reasoning implemented in Rust, then evaluate six decision factors for choosing between local and cloud models including latency, throughput, cost, and privacy. The course covers local AI infrastructure in depth: running Ollama with custom Modelfiles for task-specific assistants, deploying llamafile for zero-dependency portable inference, compiling Rust Candle with CUDA for GPU-accelerated local inference, and optimizing local RAG with caching strategies. You will configure a complete AI workstation with tmux for session management, nvidia-smi and Zenith for GPU monitoring, and NVIDIA GPU optimization. The final module covers cloud workflows including AWS Spot instances for cost-effective GPU compute, Hugging Face model discovery and download, and GitHub AI models integration. By completing this course, you will be able to set up local AI infrastructure, deploy models across local and cloud environments, and design orchestration workflows that balance cost, privacy, and performance.
A comprehensive course covering prompt engineering with chain-of-thought reasoning, local inference runtimes (Ollama, llamafile, Candle), GPU workstation configuration, and cost-optimized cloud deployment with AWS Spot instances.
What's included
7 videos, 2 readings, 1 assignment
7 videos•Total 29 minutes
Course intro•2 minutes
Course overview•2 minutes
AI orchestration overview•8 minutes
Prompt engineering pyramid•3 minutes
Chain of thought prompt Rust•4 minutes
Chain of thought Rust prompt demo•6 minutes
Explaining chain of thought Rust prompt•3 minutes
2 readings•Total 2 minutes
Key Terms: Course•1 minute
Key Terms: Prompt Engineering Pyramid•1 minute
1 assignment•Total 5 minutes
Orchestration Fundamentals•5 minutes
Local AI Infrastructure
Module 2•1 hour to complete
Module details
Covers local vs cloud model tradeoffs, caching strategies, local RAG optimization, Ollama with custom Modelfiles, llamafile portable deployment, and Candle GPU-accelerated Rust inference.
What's included
9 videos, 3 readings, 1 assignment
9 videos•Total 41 minutes
Ollama local demo•7 minutes
Ollama Modelfile Rust debugger•7 minutes
Ollama arch•2 minutes
Local vs. cloud models•5 minutes
Caching for AI•4 minutes
Optimizing local RAG•5 minutes
Llamafile getting started Gemma•4 minutes
Llamafile simple•3 minutes
Compiling Rust candle GPU•5 minutes
3 readings•Total 30 minutes
Key Terms: Ollama Local Demo•10 minutes
Key Terms: Local vs. Cloud Models•10 minutes
Key Terms: Llamafile: Getting Started with Gemma•10 minutes
1 assignment•Total 5 minutes
Quiz: Local AI Infrastructure•5 minutes
Workstation and Cloud Workflows
Module 3•2 hours to complete
Module details
Covers tmux session management, nvidia-smi and Zenith GPU monitoring, local workstation orchestration, AWS Spot instance deployment, Hugging Face and GitHub AI model workflows, and Rust project structure.
What's included
11 videos, 3 readings, 1 assignment
11 videos•Total 47 minutes
AWS spot deploy ML•4 minutes
Hugging Face workflow models•3 minutes
GitHub AI models workflow•3 minutes
Rust Hello World project structure•2 minutes
Using tmux on Linux•11 minutes
Using NVIDIA SMI•5 minutes
Using Zenith GPU monitoring•5 minutes
AI orchestration local workstation•5 minutes
Technical training approaches•5 minutes
Effective AI engineering learning•3 minutes
Course conclusion•3 minutes
3 readings•Total 30 minutes
Key Terms: AWS Spot Deploy ML•10 minutes
Key Terms: Using tmux on Linux•10 minutes
Key Terms: Technical Training Approaches•10 minutes
1 assignment•Total 30 minutes
Quiz: Workstation and Cloud Workflows•30 minutes
Capstone
Module 4•1 hour to complete
Module details
Head-to-head comparison of Ollama and `apr` ([paiml/aprender](https://github.com/paiml/aprender)) running Qwen2.5-Coder-1.5B on the same prompt suite and the same hardware. Build a chain-of-thought routing engine that selects a runtime based on task complexity and validation requirements, with cost analysis spanning local workstations, Spot instances, and Bedrock.
What's included
4 readings, 1 assignment
4 readings•Total 31 minutes
Key Terms: Capstone: AI Orchestration in Practice•10 minutes
Capstone Project•10 minutes
Next Steps•10 minutes
Before You Go•1 minute
1 assignment•Total 30 minutes
Final Graded Quiz•30 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you need to purchase the Certificate experience when you enroll in a course. You can start with a Free Trial instead, or apply for Financial Aid. Some courses may offer a 'Full Course, No Certificate' option instead. This option lets you see all course materials, submit required assessments, and get a final grade, but you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.