In Quantization in Depth, you will build model quantization methods that shrink model weights to ¼ of their original size, and apply techniques to maintain the compressed model’s performance. Quantizing your models can make them more accessible and faster at inference time.
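As a rough illustration of where the ¼ figure can come from, the sketch below applies symmetric per-tensor linear quantization to a small list of weights. It assumes the original weights are stored as 32-bit floats and the quantized ones as 8-bit integers; that assumption, along with the helper names `quantize_per_tensor` and `dequantize`, is not taken from the course and is only meant as a minimal example.

```python
def quantize_per_tensor(weights, num_bits=8):
    """Symmetric linear quantization with one scale for the whole tensor.

    Hypothetical helper for illustration: q = round(w / scale), clamped
    to the signed integer range [-q_max, q_max].
    """
    q_max = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    quantized = [max(-q_max, min(q_max, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point weights."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.02, 1.0]
quantized, scale = quantize_per_tensor(weights)
restored = dequantize(quantized, scale)

# Storage estimate, assuming 4 bytes per float32 weight and 1 byte per int8 weight.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)
```

Each restored weight differs from the original by at most about half a quantization step (`scale / 2`), which is the accuracy cost traded for the smaller storage footprint.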



What you'll learn
- Try variants of linear quantization and granularities such as per-tensor, per-channel, and per-group quantization.
- Build a general-purpose quantizer in PyTorch that can quantize the dense layers of any open-source model, for up to 4x compression on those layers.
- Implement weight packing to pack four 2-bit weights into a single 8-bit integer.
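The weight-packing idea in the last item can be sketched with plain bit shifts. This is a minimal example, not the course's implementation: the function names `pack_2bit` and `unpack_2bit` are hypothetical, the values are assumed to be unsigned 2-bit integers (0 to 3), and placing the first value in the lowest two bits is an arbitrary ordering choice.

```python
def pack_2bit(values):
    """Pack unsigned 2-bit values (0..3) into bytes, four values per byte.

    Assumed layout: the first value of each group occupies the lowest two bits.
    """
    if len(values) % 4 != 0:
        raise ValueError("number of values must be a multiple of 4")
    packed = []
    for i in range(0, len(values), 4):
        byte = 0
        for j, v in enumerate(values[i:i + 4]):
            if not 0 <= v <= 3:
                raise ValueError("each value must fit in 2 bits")
            byte |= v << (2 * j)  # shift the value into its 2-bit slot
        packed.append(byte)
    return packed


def unpack_2bit(packed):
    """Recover the original 2-bit values from the packed bytes."""
    values = []
    for byte in packed:
        for j in range(4):
            values.append((byte >> (2 * j)) & 0b11)  # mask out one 2-bit slot
    return values


weights_2bit = [1, 0, 3, 2, 2, 2, 0, 1]
packed = pack_2bit(weights_2bit)   # two bytes hold eight weights
unpacked = unpack_2bit(packed)
```

Eight 2-bit weights fit into two 8-bit integers here, and unpacking returns the original sequence, so no information is lost by the packing step itself.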

Learn, practice, and apply job-ready skills in less than 2 hours
- Receive training from industry experts
- Gain hands-on experience solving real-world job tasks

How you'll learn
- Hands-on, project-based learning - Practice new skills by completing job-related tasks with step-by-step instructions. 
- No downloads or installation required - Access the tools and resources you need in a cloud environment. 
- Available only on desktop - This project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices. 


