Generative AI models, such as large language models, often exceed the capabilities of consumer-grade hardware and are expensive to run. Compressing them through methods such as quantization makes them more efficient, faster, and more accessible. Compressed models can run on a wide variety of devices, including smartphones, personal computers, and edge devices, with minimal performance degradation.



Quantization Fundamentals with Hugging Face


Instructors: Younes Belkada
What you'll learn
- Learn how to compress models with the Hugging Face Transformers library and the Quanto library. 
- Learn about linear quantization, a simple yet effective method for compressing models. 
- Practice quantizing open source multimodal and language models. 
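
To make the idea of linear quantization concrete, here is a minimal sketch in plain Python. It is not the Quanto or Transformers implementation. The function names `linear_quantize` and `linear_dequantize`, the asymmetric signed 8-bit scheme, and the choice to widen the value range so that 0.0 is exactly representable are all assumptions made for illustration.

```python
def linear_quantize(values, num_bits=8):
    """Map floats to signed integers with one scale and one zero point.

    Hypothetical helper for illustration; not a library API.
    """
    q_min = -(2 ** (num_bits - 1))
    q_max = 2 ** (num_bits - 1) - 1

    # Assumption: widen the range to include 0.0 so that zero maps to an
    # exact integer and the zero point always lies inside [q_min, q_max].
    r_min = min(min(values), 0.0)
    r_max = max(max(values), 0.0)

    # Scale is the real-valued width of one integer step.
    scale = (r_max - r_min) / (q_max - q_min) if r_max > r_min else 1.0
    # Zero point is the integer that represents the real value 0.0.
    zero_point = int(round(q_min - r_min / scale))

    quantized = []
    for v in values:
        q = int(round(v / scale + zero_point))
        quantized.append(max(q_min, min(q_max, q)))  # clamp to the integer range
    return quantized, scale, zero_point


def linear_dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [scale * (q - zero_point) for q in quantized]


weights = [-1.5, -0.2, 0.0, 0.7, 2.3]
q_weights, scale, zero_point = linear_quantize(weights)
restored = linear_dequantize(q_weights, scale, zero_point)
for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the sense in which this kind of compression trades a small, bounded error for a much smaller storage format.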

Learn, practice, and apply job-ready skills in less than 2 hours
- Receive training from industry experts
- Gain hands-on experience solving real-world job tasks

How you'll learn
- Hands-on, project-based learning - Practice new skills by completing job-related tasks with step-by-step instructions. 
- No downloads or installation required - Access the tools and resources you need in a cloud environment. 
- Available only on desktop - This project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices. 