Generative AI models, such as large language models, often exceed the capabilities of consumer-grade hardware and are expensive to run. Compressing them with methods such as quantization makes them more efficient, faster, and more accessible, so they can run on a wide variety of devices, including smartphones, personal computers, and edge devices, with minimal performance degradation.



Quantization Fundamentals with Hugging Face


Instructors: Younes Belkada
What you'll learn
Learn how to compress models with the Hugging Face Transformers library and the Quanto library.
Learn about linear quantization, a simple yet effective method for compressing models.
Practice quantizing open source multimodal and language models.
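As a rough illustration of the linear quantization mentioned above, the following is a minimal sketch in plain Python, not the course's or the Quanto library's implementation. It assumes an asymmetric 8-bit scheme in which a scale and a zero point map the observed floating-point range onto signed integers; the function names `linear_quantize` and `linear_dequantize` are hypothetical and chosen only for this example.

```python
def linear_quantize(values, bits=8):
    """Map floats to signed integers using a scale and zero point (illustrative sketch)."""
    # Integer range for the target bit width, e.g. [-128, 127] for 8 bits.
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    r_min, r_max = min(values), max(values)
    if r_max == r_min:
        # Assumption of this sketch: constant inputs are rejected rather than handled.
        raise ValueError("constant input: scale would be zero")
    # Scale: how much real-valued range one integer step covers.
    scale = (r_max - r_min) / (q_max - q_min)
    # Zero point: the integer that represents r_min's offset, clamped to the valid range.
    zero_point = round(q_min - r_min / scale)
    zero_point = max(q_min, min(q_max, zero_point))
    # Quantize each value: rescale, shift, round, then clamp.
    quantized = [
        max(q_min, min(q_max, round(v / scale + zero_point))) for v in values
    ]
    return quantized, scale, zero_point


def linear_dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the integers produced by linear_quantize."""
    return [scale * (q - zero_point) for q in quantized]


weights = [-1.0, 0.0, 0.5, 2.0]
q, scale, zero_point = linear_quantize(weights)
restored = linear_dequantize(q, scale, zero_point)
```

The restored values differ from the originals by a small rounding error, which is the accuracy cost traded for storing each value in 8 bits instead of 32.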

Learn, practice, and apply job-ready skills in less than 2 hours
- Receive training from industry experts
- Gain hands-on experience solving real-world job tasks

How you'll learn
Hands-on, project-based learning
Practice new skills by completing job-related tasks with step-by-step instructions.
No downloads or installation required
Access the tools and resources you need in a cloud environment.
Available only on desktop
This project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices.