When will I receive my Course Certificate?

If you complete the course successfully, your electronic Course Certificate will be added to your Accomplishments page - from there, you can print your Course Certificate or add it to your LinkedIn profile.

Why can’t I audit this course?

This course is currently available only to learners who have paid or received financial aid, when available.

Is financial aid available?

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

LLM Engineer’s Handbook

Instructor: Packt - Course Instructors

11 modules

Gain insight into a topic and learn the fundamentals.

Beginner level

Recommended experience

2 weeks to complete

at 10 hours a week

Flexible schedule

Learn at your own pace

What you'll learn

Design and manage effective LLM training and deployment pipelines.
Implement supervised fine-tuning and evaluate LLM performance.
Deploy scalable, end-to-end LLM applications using cloud tools.

Skills you'll gain

Tools you'll learn

Model Deployment

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

11 assignments

Taught in English

There are 11 modules in this course

In this comprehensive course, you will explore the intricate world of Large Language Models (LLMs) and gain the skills to design, train, and deploy them using cutting-edge MLOps practices. LLMs are revolutionizing the AI landscape, and understanding how to develop and manage them is essential for AI professionals.

This course is designed to help you not only grasp the core concepts behind LLMs but also give you hands-on experience to build production-grade LLM systems. You'll learn how to create scalable, efficient LLM systems from scratch, focusing on real-world applications that will make you stand out in the AI industry. What sets this course apart is its combination of in-depth theoretical insights and real-world, practical applications. You'll move beyond basic knowledge to master LLM architecture, supervised fine-tuning, and deployment on cloud platforms, ensuring that you’re fully equipped to build robust, production-ready systems. This course is ideal for AI engineers, NLP professionals, and anyone looking to deepen their expertise in LLM engineering. A basic understanding of LLMs, Python, and cloud platforms like AWS is recommended for optimal learning.

Understanding the LLM Twin Concept and Architecture

In this section, we delve into the concept and architecture of LLM Twin, an innovative AI model mimicking a person's writing style and personality. We discuss its significance, benefits over generic chatbots, and the planning process for creating an effective LLM product. Detailed insights into the design of the feature, training, and inference pipelines are explored to structure a robust ML system.

What's included

2 videos, 3 readings, 1 assignment

Tooling and Installation

In this section, we introduce the essential tools needed for the course, particularly for the LLM Twin project. We provide an overview of the tech stack, cover installation procedures for Python and its ecosystem, dependency management with Poetry, and task execution using Poe the Poet. This section also provides insights into MLOps and LLMOps tooling, including ZenML and Hugging Face, and explains their roles in the project. Finally, we guide users in setting up an AWS account, focusing on SageMaker for deploying ML models.

What's included

1 video, 2 readings, 1 assignment

Data Engineering

In this section, we delve into the LLM Twin project by designing a data collection pipeline for gathering raw data essential for LLM use cases, such as fine-tuning and inference. We'll focus on implementing an ETL pipeline that aggregates data from platforms like Medium and GitHub into a MongoDB data warehouse, thus simulating real-world machine learning project scenarios.
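
To make the ETL idea concrete, here is a minimal sketch of the extract, transform, and load steps. It is an illustration under stated assumptions, not the course's implementation: the record fields, the `Document` class, and the three function names are hypothetical, the raw records are hard-coded rather than crawled from Medium or GitHub, and a plain dictionary stands in for the MongoDB data warehouse.

```python
from dataclasses import asdict, dataclass


@dataclass
class Document:
    """Hypothetical normalized record shared by all source platforms."""
    platform: str
    author: str
    content: str


def extract() -> list[dict]:
    # Assumption: hard-coded raw records; a real crawler would fetch these.
    return [
        {"source": "medium", "user": "Jane Doe", "body": "  An article about RAG.  "},
        {"source": "github", "user": "Jane Doe", "body": "README for a demo repository.\n"},
        {"source": "medium", "user": "Jane Doe", "body": "   "},  # empty after cleaning
    ]


def transform(raw_records: list[dict]) -> list[Document]:
    # Collapse whitespace and drop records that have no content left.
    documents = []
    for record in raw_records:
        content = " ".join(record["body"].split())
        if content:
            documents.append(
                Document(platform=record["source"], author=record["user"], content=content)
            )
    return documents


def load(documents: list[Document], warehouse: dict[str, list[dict]]) -> int:
    # Assumption: a dict keyed by platform stands in for MongoDB collections.
    for document in documents:
        warehouse.setdefault(document.platform, []).append(asdict(document))
    return len(documents)


warehouse: dict[str, list[dict]] = {}
loaded = load(transform(extract()), warehouse)
```

Running the three steps in order stores two cleaned documents, one under `medium` and one under `github`, and drops the whitespace-only record. Keeping each step as a separate function lets you swap the in-memory dictionary for a real database client without touching the cleaning logic.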

What's included

1 video, 4 readings, 1 assignment

RAG Feature Pipeline

In this section, we explore the retrieval-augmented generation (RAG) feature pipeline. RAG is a crucial technique for supplying custom data to large language models without constant fine-tuning. We introduce the fundamental components of a naive RAG system, such as chunking, embedding, and vector databases. We also delve into the LLM Twin's RAG feature pipeline architecture, applying theoretical concepts through practical implementation, and discuss the importance of RAG for addressing issues like model hallucinations and outdated data. This section provides in-depth insights into advanced RAG techniques and the role of batch pipelines in syncing data for improved accuracy.
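
The three components of a naive RAG system named above (chunking, embedding, and a vector database) can be sketched in a few lines. This is a toy illustration, not the course's code: `chunk_text`, `embed`, and `InMemoryVectorStore` are hypothetical names, the "embedding" is a simple bag-of-words count standing in for a real embedding model, and a Python list stands in for a vector database.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into word windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def embed(text: str) -> Counter:
    # Assumption: bag-of-words counts stand in for a learned embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        self.items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 1) -> list[str]:
        query_vector = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vector, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


store = InMemoryVectorStore()
for chunk in [
    "Poetry manages Python dependencies.",
    "ZenML orchestrates machine learning pipelines.",
    "MongoDB stores the raw documents.",
]:
    store.add(chunk)
top = store.search("Which tool orchestrates pipelines?", k=1)
```

Here `top` contains the ZenML sentence, because it is the only chunk that shares words with the query. A real pipeline would replace `embed` with a trained embedding model so that semantically similar text matches even without shared words.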

What's included

1 video, 7 readings, 1 assignment

1 video (total 1 minute)

RAG Feature Pipeline - Overview Video (1 minute)

7 readings (total 170 minutes)

Introduction (10 minutes)
What are Embeddings? (30 minutes)
DB Operations (10 minutes)
Exploring the LLM Twin’s RAG Feature Pipeline Architecture (30 minutes)
Change data capture: syncing the data warehouse and feature store (30 minutes)
Querying the Data Warehouse (30 minutes)
OVM (30 minutes)

1 assignment (total 10 minutes)

Advanced Concepts in Retrieval-Augmented Generation (RAG) (10 minutes)

Supervised Fine-Tuning

In this section, we will explore the process of Supervised Fine-Tuning (SFT) for Large Language Models (LLMs). We'll delve into the creation of instruction datasets and how they are used to refine LLMs for specific tasks. This section covers the steps involved in crafting these datasets, the importance of data quality, and presents various techniques and strategies for enhancing the fine-tuning process. Our focus will be on transforming general-purpose models into specialized assistants through SFT, enabling them to provide more coherent and relevant responses.
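
As a rough illustration of what crafting an instruction dataset involves, the sketch below removes duplicate instruction-answer pairs and renders each remaining pair as a training prompt. It rests on several assumptions: the sample records are invented, the helper names are hypothetical, deduplication is exact matching after case and whitespace normalization (simpler than fuzzy or semantic methods), and the prompt layout is an Alpaca-style template chosen for illustration rather than the one the course necessarily uses.

```python
# Assumption: an Alpaca-style layout, used here only as an example template.
ALPACA_STYLE_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{answer}"
)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def deduplicate(samples: list[dict]) -> list[dict]:
    # Exact deduplication on the normalized (instruction, answer) pair.
    seen = set()
    unique = []
    for sample in samples:
        key = (normalize(sample["instruction"]), normalize(sample["answer"]))
        if key not in seen:
            seen.add(key)
            unique.append(sample)
    return unique


def format_sample(sample: dict) -> str:
    return ALPACA_STYLE_TEMPLATE.format(
        instruction=sample["instruction"].strip(),
        answer=sample["answer"].strip(),
    )


# Hypothetical samples; the second is a near-verbatim duplicate of the first.
samples = [
    {"instruction": "Explain RAG.", "answer": "RAG adds retrieved context to the prompt."},
    {"instruction": "  explain   rag. ", "answer": "RAG adds retrieved context to the prompt."},
    {"instruction": "Define SFT.", "answer": "SFT trains a model on instruction-answer pairs."},
]
dataset = [format_sample(sample) for sample in deduplicate(samples)]
```

After deduplication, `dataset` holds two formatted prompts. Removing repeats before training matters because duplicated samples are effectively weighted more heavily than the rest of the data.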

What's included

1 video, 7 readings, 1 assignment

1 video (total 1 minute)

Supervised Fine-Tuning - Overview Video (1 minute)

7 readings (total 150 minutes)

Introduction (10 minutes)
Data Deduplication (30 minutes)
Data Generation (10 minutes)
Creating Our Own Instruction Dataset (30 minutes)
Exploring SFT and its Techniques (30 minutes)
Training Parameters (10 minutes)
Fine-tuning in Practice (30 minutes)

1 assignment (total 10 minutes)

Advanced Techniques in Language Model Fine-Tuning (10 minutes)

Fine-Tuning with Preference Alignment

In this section, we delve into the realms of preference alignment, discussing how Direct Preference Optimization (DPO) can fine-tune language models to better align with human preferences. We elaborate on creating and evaluating preference datasets, ensuring our models capture nuanced human interactions.
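
For readers who want to see what DPO optimizes, the sketch below computes the standard DPO loss for a single preference pair: the negative log-sigmoid of beta times the difference between the chosen and rejected log-probability ratios (trained policy versus frozen reference model). The function name, argument names, and the numeric log-probabilities are assumptions made for illustration; in practice a training library computes these values from model outputs over batches.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (chosen, rejected) pair, given sequence log-probabilities."""
    # Log-ratio of policy to reference for each answer.
    chosen_log_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_log_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_log_ratio - rejected_log_ratio)
    # -log(sigmoid(margin)) equals softplus(-margin); computed in a stable form.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical log-probabilities for illustration only.
neutral = dpo_loss(-10.0, -12.0, -10.0, -12.0)   # policy identical to reference
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)   # policy favors the chosen answer
worse = dpo_loss(-12.0, -10.0, -10.0, -12.0)     # policy favors the rejected answer
```

When the policy matches the reference, the loss equals log 2. It falls below that value as the policy shifts probability toward the chosen answer and rises above it when the policy favors the rejected one.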

What's included

1 video, 4 readings, 1 assignment

Evaluating LLMs

In this section, we delve into the evaluation of large language models (LLMs), addressing various evaluation methods and their significance. We cover general-purpose, domain-specific, and task-specific evaluations, highlighting the unique challenges each presents. Additionally, we explore the evaluation of retrieval-augmented generation (RAG) pipelines and introduce tools like Ragas and ARES for assessing them.

What's included

1 video, 3 readings, 1 assignment

Inference Optimization

In this section, we dive into optimizing large language models to boost their inference performance and efficiency. We'll explore key strategies for optimizing the inference process, a crucial step given these models' heavy computational and memory demands. From reducing latency to improving throughput and minimizing memory usage, we examine how specialized hardware and optimization techniques can speed up generation. These optimizations enable more efficient deployments, whether for fast-response tasks like code completion or for generating documents in batches.

What's included

1 video, 3 readings, 1 assignment

RAG Inference Pipeline

In this section, we explore the construction and implementation of a RAG inference pipeline, starting from understanding its architecture to implementing key modules such as retrieval, prompt creation, and interaction with the LLM. We introduce methods for optimizing retrieval processes like query expansion and self-querying while utilizing OpenAI's API, and integrate these techniques into a comprehensive retrieval module. We'll conclude by assembling these elements into a cohesive inference pipeline and preparing for further deployment steps.
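
The flow described above (expand the query, retrieve context, build a prompt, call the LLM) can be sketched as follows. This is a simplified illustration under stated assumptions: all function names are hypothetical, the `llm` and `search` arguments are plain callables that you would back with a real LLM API and a vector database, the stubs below only echo their input, and self-querying and reranking are left out.

```python
from typing import Callable

LLM = Callable[[str], str]            # takes a prompt, returns generated text
Search = Callable[[str], list[str]]   # takes a query, returns matching chunks


def expand_query(query: str, llm: LLM, n: int = 3) -> list[str]:
    """Query expansion: keep the original query and add up to n - 1 rephrasings."""
    if n <= 1:
        return [query]
    prompt = (
        f"Rewrite the question below in {n - 1} different ways, one per line.\n"
        f"Question: {query}"
    )
    variants = [line.strip() for line in llm(prompt).splitlines() if line.strip()]
    return [query] + variants[: n - 1]


def retrieve(queries: list[str], search: Search, k: int = 3) -> list[str]:
    # Search with every query variant and keep the first k distinct chunks.
    chunks: list[str] = []
    for query in queries:
        for chunk in search(query):
            if chunk not in chunks:
                chunks.append(chunk)
    return chunks[:k]


def build_prompt(query: str, chunks: list[str]) -> str:
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


def rag_answer(query: str, llm: LLM, search: Search, n: int = 3, k: int = 3) -> str:
    queries = expand_query(query, llm, n)
    return llm(build_prompt(query, retrieve(queries, search, k)))


# Stubs for illustration only; replace with a real LLM client and vector search.
def fake_llm(prompt: str) -> str:
    if prompt.startswith("Rewrite"):
        return "What is RAG?\n\nExplain RAG."
    return f"echo:{prompt}"


def fake_search(query: str) -> list[str]:
    return ["RAG retrieves context.", f"hit for {query}"]


result = rag_answer("Define RAG", fake_llm, fake_search)
```

Passing the LLM and the search function in as arguments keeps the pipeline logic testable without network access, and it makes it straightforward to swap in a different model or retrieval backend later.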

What's included

1 video, 5 readings, 1 assignment

1 video (total 1 minute)

RAG Inference Pipeline - Overview Video (1 minute)

5 readings (total 130 minutes)

Introduction (30 minutes)
Self-querying (30 minutes)
Advanced RAG Post-retrieval Optimization: Reranking (10 minutes)
Implementing the LLM Twin's RAG Inference Pipeline (30 minutes)
Bringing Everything Together into the RAG Inference Pipeline (30 minutes)

1 assignment (total 10 minutes)

Advanced RAG Pipeline Implementation (10 minutes)

Inference Pipeline Deployment

In this section, we focus on deploying the inference pipeline for large language models (LLMs) in ML applications, ensuring models are accessible and efficient for end users. We'll cover deployment strategies, architectural decisions, and optimization techniques to address challenges like computing power and feature access.

What's included

1 video, 5 readings, 1 assignment

1 video (total 1 minute)

Inference Pipeline Deployment - Overview Video (1 minute)

5 readings (total 110 minutes)

Introduction (10 minutes)
Monolithic versus Microservices Architecture in Model Serving (10 minutes)
Exploring the LLM Twin’s Inference Pipeline Deployment Strategy (30 minutes)
Deploying the LLM Twin model to AWS SageMaker (30 minutes)
Calling the AWS SageMaker Inference Endpoint (30 minutes)

1 assignment (total 10 minutes)

Modern ML Model Deployment (10 minutes)

MLOps and LLMOps

In this section, we dive into the intricacies of MLOps and LLMOps, exploring their roles in automating machine learning processes and handling large language models. We will cover their origins in DevOps, highlight the unique challenges LLMOps addresses, such as prompt management and scaling issues, and illustrate the practical steps for deploying these systems efficiently. The section also includes discussions on the transition from manual deployment to cloud-based solutions, emphasizing the advantages of CI/CD pipelines and Dockerization in executing and managing models at scale.

What's included

1 video, 7 readings, 1 assignment

1 video (total 1 minute)

MLOps and LLMOps - Overview Video (1 minute)

7 readings (total 210 minutes)

Introduction (30 minutes)
MLOps Principles (30 minutes)
Prompt Monitoring (30 minutes)
Setting up the ZenML Cloud (30 minutes)
Run the Pipelines on AWS (30 minutes)
GitHub Actions CI YAML File (30 minutes)
Trigger downstream pipelines (30 minutes)

1 assignment (total 10 minutes)

MLOps and LLMOps Fundamentals (10 minutes)

Instructor

Packt - Course Instructors

Packt

1,864 courses, 516,130 learners

Offered by

Packt

Frequently asked questions

Can I preview a course before enrolling?

Yes, you can preview the first video and view the syllabus before you enroll. You must purchase the course to access content not included in the preview.

When will I have access to the lectures and assignments?

If you decide to enroll in the course before the session start date, you will have access to all of the lecture videos and readings for the course. You’ll be able to submit assignments once the session starts.

What will I get when I enroll?

Once you enroll and your session begins, you will have access to all videos and other resources, including reading items and the course discussion forum. You’ll be able to view and submit practice assessments, and complete required graded assignments to earn a grade and a Course Certificate.

There are 11 modules in this course

Understanding the LLM Twin Concept and Architecture
Tooling and Installation
Data Engineering
RAG Feature Pipeline
Supervised Fine-Tuning
Fine-Tuning with Preference Alignment
Evaluating LLMs
Inference Optimization
RAG Inference Pipeline
Inference Pipeline Deployment
MLOps and LLMOps
