Coursera

LLM Optimization & Evaluation Specialization

Optimize & Deploy Production-Ready LLM Systems.

Build expertise in LLM evaluation, optimization, and deployment through hands-on MLOps projects.

John Whitworth
LearningMate

Instructors: John Whitworth

Get in-depth knowledge of a subject
Intermediate level

4 weeks to complete at 10 hours a week
Flexible schedule: learn at your own pace

What you'll learn

  • Evaluate and optimize LLM performance using statistical testing, MLOps tools, and production monitoring systems.

  • Build automated pipelines for feature engineering, experiment tracking, and data processing with industry-standard tools.

  • Diagnose LLM errors, implement safety frameworks, and reduce operational costs through systematic analysis.

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English
Recently updated!

December 2025

Advance your subject-matter expertise

  • Learn in-demand skills from university and industry experts
  • Master a subject or tool with hands-on projects
  • Develop a deep understanding of key concepts
  • Earn a career certificate from Coursera

Specialization - 13 course series

Engineer Features and Evaluate Models for Production

Course 1, 3 hours

What you'll learn

  • Build feature engineering pipelines and evaluate ML experiments using MLOps tools to select and deploy production-ready models.

Skills you'll gain

  • Feature Engineering
  • Analysis
  • Model Evaluation
  • Model Training
  • MLOps (Machine Learning Operations)
  • Data Preprocessing
  • Model Optimization
  • Data Pipelines
  • Performance Analysis
  • Data Transformation
  • Model Deployment
  • Machine Learning Methods
  • Technical Writing

Optimize Deep Learning: Tune PyTorch Models

Course 2, 4 hours

What you'll learn

  • Use PyTorch Lightning to implement callbacks, diagnose instabilities, and optimize model performance.

Skills you'll gain

  • Deep Learning
  • Performance Tuning
  • Debugging
  • PyTorch (Machine Learning Library)
  • Scalability
  • MLOps (Machine Learning Operations)
  • Model Deployment
  • Fine-tuning
  • Model Training
  • Artificial Neural Networks
  • Model Optimization
  • Transfer Learning

Evaluate & Optimize LLM Performance

Course 3, 4 hours

What you'll learn

  • Evaluate LLMs using metrics like BLEU and ROUGE, run A/B tests for statistical significance, and optimize model performance with data-driven strategies.
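
To give a sense of what an overlap metric such as ROUGE measures, here is a minimal sketch of a ROUGE-1 style unigram F1 score. It is an illustration only, not course material: the function name `rouge1_f1` and the lowercase, whitespace-based tokenization are assumptions, and real evaluations typically rely on an established library with proper tokenization and stemming.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate text."""
    # Assumption: simple lowercase whitespace tokenization.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

A short candidate that copies only part of the reference scores high on precision but low on recall, which is why the F1 combination is commonly reported.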

Skills you'll gain

  • Statistical Methods
  • Model Evaluation
  • Statistical Analysis
  • Test Script Development
  • Data-Driven Decision-Making
  • Natural Language Processing
  • Large Language Modeling
  • Probability & Statistics
  • Statistical Hypothesis Testing
  • LLM Application
  • Embeddings
  • Scripting
  • Model Optimization
  • Prompt Engineering
  • Performance Metric
  • Statistical Inference

Analyze Logs: Fix LLM Hallucinations

Course 4, 4 hours

What you'll learn

  • Use data analysis to diagnose LLM hallucinations by correlating user behavior and system errors, and document findings to guide engineering fixes.

Skills you'll gain

  • Root Cause Analysis
  • Data Analysis
  • Business Metrics
  • Large Language Modeling
  • Debugging
  • Artificial Intelligence
  • Generative AI
  • Correlation Analysis
  • Retrieval-Augmented Generation
  • Analysis
  • Technical Communication
  • Data Manipulation
  • Pandas (Python Package)
  • LLM Application

Evaluate LLMs: Test and Prove Significance

Course 5, 3 hours

What you'll learn

  • Rigorously evaluate LLM performance using statistical tests and confidence intervals to make data-driven deployment decisions.
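
As a rough illustration of how a confidence interval can support a deployment decision, the sketch below computes a percentile bootstrap interval for the mean per-example score difference between two models. It is not taken from the course: the function name `bootstrap_diff_ci`, its parameters, and the choice of a paired percentile bootstrap are all assumptions made for this example.

```python
import random
from statistics import mean


def bootstrap_diff_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(scores_b - scores_a), paired by example.

    Returns (point_estimate, lower_bound, upper_bound).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    n = len(diffs)

    # Resample the per-example differences with replacement and record each mean.
    resampled_means = sorted(
        mean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )

    lower = resampled_means[int((alpha / 2) * n_resamples)]
    upper = resampled_means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), lower, upper
```

If the resulting interval excludes zero, the observed difference is unlikely to be explained by sampling noise alone under this resampling scheme; whether the difference is large enough to justify deployment remains a separate judgment.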

Skills you'll gain

  • Model Evaluation
  • Data Presentation
  • Data-Driven Decision-Making
  • Model Deployment
  • Scientific Visualization
  • Performance Metric
  • Statistical Methods
  • Statistical Hypothesis Testing
  • Statistical Programming
  • Matplotlib
  • Statistical Analysis
  • Statistical Visualization
  • Data Storytelling
  • Statistical Software
  • Statistics
  • Statistical Inference
  • Experimentation
  • Large Language Modeling

Optimize SQL: Build Fast Data Pipelines

Course 6, 3 hours

What you'll learn

  • Parameterized SQL with CTEs and window functions builds scalable, maintainable pipelines that adapt as business needs change.

  • Query optimization is systematic: analyze execution plans, find costly steps, then resolve them with indexing or rewrites.

  • Materialized summary tables and well-timed processing, like morning refreshes, support reliable analytics infrastructure.

  • Understanding execution internals helps analysts build self-sufficient workflows without recurring engineering delays.
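
The first point above, parameterized SQL built from a CTE and a window function, can be sketched as follows. This is an illustration under stated assumptions rather than course material: the `orders` table, its columns, the `:start_day` parameter, and the `running_totals` helper are all hypothetical, and the example uses Python's built-in `sqlite3` module with an SQLite build recent enough to support window functions (3.25 or later).

```python
import sqlite3

# In-memory database with a small hypothetical table for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, order_day TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('east', '2025-01-01', 100.0),
        ('east', '2025-01-02', 50.0),
        ('west', '2025-01-01', 70.0),
        ('west', '2025-01-03', 30.0);
""")

# The CTE aggregates per region and day; the window function then adds a
# running total per region. :start_day is a bound parameter, not string-pasted.
QUERY = """
WITH daily AS (
    SELECT region, order_day, SUM(amount) AS day_total
    FROM orders
    WHERE order_day >= :start_day
    GROUP BY region, order_day
)
SELECT region, order_day, day_total,
       SUM(day_total) OVER (PARTITION BY region ORDER BY order_day) AS running_total
FROM daily
ORDER BY region, order_day;
"""


def running_totals(start_day: str):
    """Return (region, day, day_total, running_total) rows from start_day on."""
    return conn.execute(QUERY, {"start_day": start_day}).fetchall()
```

Calling `running_totals("2025-01-02")` reuses the same query text with a different cutoff, which is the property that keeps such pipelines easy to adapt. In SQLite, prefixing a statement with `EXPLAIN QUERY PLAN` is one way to inspect how it will be executed.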

Skills you'll gain

  • SQL
  • Performance Tuning
  • Data Transformation
  • Data Pipelines
  • Scripting
  • Extract, Transform, Load
  • Database Management
  • Query Languages
  • Data Manipulation

Safeguard LLM Outputs: Test and Evaluate

Course 7, 3 hours

What you'll learn

  • Build and validate a robust safety testing framework for LLMs. Create behavioral test suites and use mutation testing to ensure their effectiveness.

Skills you'll gain

  • Security Testing
  • Verification And Validation
  • Prompt Patterns
  • Model Evaluation
  • Code Coverage
  • Threat Modeling
  • Large Language Modeling
  • Test Script Development
  • Quality Assessment
  • AI Security
  • Maintainability
  • Test Case
  • Responsible AI
  • Test Tools
  • Unit Testing
  • Testability
  • Prompt Engineering
  • Software Technical Review
  • Software Testing
  • LLM Application

Track and Evaluate ML Model Experiments

Course 8, 3 hours

What you'll learn

  • Track, version, and evaluate ML experiments using DVC and W&B to reliably select and prepare models for production deployment.

Skills you'll gain

  • Version Control
  • MLOps (Machine Learning Operations)
  • Model Evaluation
  • Model Training
  • Performance Analysis
  • Dashboard
  • Predictive Modeling
  • Data Management
  • Large Language Modeling
  • Model Deployment
  • Machine Learning
  • Record Keeping
  • Interactive Data Visualization

Automate Cloud Workflows with Python Scripting

Course 9, 1 hour

What you'll learn

  • Create automated Python scripts to manage multi-step cloud workflows, from provisioning resources to persisting data.

Skills you'll gain

  • Scripting
  • Python Programming
  • Virtual Machines
  • Command-Line Interface
  • Infrastructure as Code (IaC)
  • Data Persistence
  • Data Pipelines
  • AI Workflows

Automate Data Pipelines: Schema Evolution

Course 10, 2 hours

What you'll learn

  • Build automated data pipelines with Apache Airflow, manage schema evolution to prevent failures, and implement monitoring for data integrity.

Skills you'll gain

  • Data Pipelines
  • Data Integrity
  • Apache Airflow
  • Extract, Transform, Load
  • System Monitoring
  • Data Modeling
  • Data Quality
  • Data Transformation
  • Continuous Monitoring
  • Data Validation

Develop and Evaluate LLM Features Effectively

Course 11, 3 hours

What you'll learn

  • Translate an LLM product concept into a detailed PRD and create a UAT plan to validate that the delivered feature meets user requirements.

Skills you'll gain

  • Verification And Validation
  • User Acceptance Testing (UAT)
  • Scenario Testing
  • User Requirements Documents
  • Key Performance Indicators (KPIs)
  • Acceptance Testing
  • Test Planning
  • AI Product Strategy
  • Prioritization
  • Functional Testing
  • Risk Management Framework
  • LLM Application
  • Requirements Analysis
  • User Story
  • Product Requirements
  • Business Requirements
  • Functional Requirement
  • Large Language Modeling

Document and Evaluate LLM Prompting Success

Course 12, 2 hours

What you'll learn

  • Create operational run-books for LLM systems and evaluate prompt patterns to improve performance and reduce operational costs.

Skills you'll gain

  • Prompt Engineering
  • Prompt Patterns
  • Data Maintenance
  • MLOps (Machine Learning Operations)
  • Configuration Management
  • Benchmarking
  • Performance Tuning
  • Retrieval-Augmented Generation
  • Technical Writing
  • Model Optimization
  • Large Language Modeling
  • Requirements Analysis
  • Token Optimization
  • Technical Documentation
  • Performance Testing
  • LLM Application

Optimize LLM Costs & Streamline Processes

Course 13, 2 hours

What you'll learn

  • Optimize LLM costs by analyzing spend reports and streamline ML pipelines using value-stream mapping to boost efficiency and reduce cycle times.

Skills you'll gain

  • Process Improvement and Optimization
  • Waste Minimization
  • Model Optimization
  • Cost Management
  • Proposal Development
  • Process Optimization
  • Business Workflow Analysis
  • Data-Driven Decision-Making
  • Collaborative Software
  • Process Analysis
  • Operating Cost
  • AI Workflows
  • Productivity Software
  • LLM Application
  • Miro AI
  • Lean Manufacturing
  • Process Modeling

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

John Whitworth
Coursera
30 Courses, 2,541 learners
LearningMate
275 Courses, 21,878 learners

Offered by

Coursera
