Learners will gain the skills to serve powerful language models as practical and scalable web APIs. They will learn how to use the llama.cpp example server to expose a large language model through a set of REST API endpoints for tasks like text generation, tokenization, and embedding extraction.
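Those endpoints are exercised over plain HTTP with JSON bodies. As a rough sketch of the request shapes (the default host/port and field names here are assumptions based on the llama.cpp example server's documented REST API, not part of this course page):

```python
import json

# Assumed default address of a locally running llama.cpp example server.
BASE_URL = "http://localhost:8080"

def build_request(endpoint: str, **payload) -> tuple[str, str]:
    """Return the (url, json_body) pair for one of the server's REST endpoints."""
    return f"{BASE_URL}/{endpoint}", json.dumps(payload)

# Text generation: POST a prompt and a token budget to /completion.
url, body = build_request("completion", prompt="What is a llamafile?", n_predict=64)

# Tokenization and embedding extraction take a "content" field instead of "prompt".
tok_url, tok_body = build_request("tokenize", content="What is a llamafile?")
emb_url, emb_body = build_request("embedding", content="What is a llamafile?")

print(url)       # http://localhost:8080/completion
print(tok_url)   # http://localhost:8080/tokenize
```

Any HTTP client (curl, `urllib`, `requests`) can then POST these bodies to the running server.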



Beginning Llamafile for Local Large Language Models (LLMs)

Instructor: Noah Gift
Recommended experience
What you'll learn
Learn how to serve large language models as production-ready web APIs using the llama.cpp framework
Understand the architecture and capabilities of the llama.cpp example server for text generation, tokenization, and embedding extraction
Gain hands-on experience in configuring and customizing the server using command line options and API parameters
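A server launch configured through command line options might look like the following sketch (the binary name and flags follow the llama.cpp example server's README; the model path is a placeholder):

```shell
# Sketch: serving a local GGUF model with the llama.cpp example server.
# -m      path to the model file (placeholder below)
# --host / --port   bind address for the REST API
# -c      context window size in tokens
# --embedding       enables the embedding endpoint
./server \
  -m ./models/mixtral-8x7b-instruct.Q4_K_M.gguf \
  --host 127.0.0.1 \
  --port 8080 \
  -c 4096 \
  --embedding
```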
Skills you'll gain
Details to know

4 assignments

There is 1 module in this course
This week, you will run large language models locally using llamafile and the Mixtral model, keeping your data private and avoiding network latency and per-token API fees.
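Running a llamafile takes just two commands; a minimal sketch, assuming a downloaded Mixtral llamafile (the filename below is a placeholder):

```shell
# Sketch: a llamafile is a self-contained executable bundling llama.cpp
# with model weights, so there is nothing to install.
chmod +x mixtral-8x7b-instruct.llamafile
./mixtral-8x7b-instruct.llamafile
# This typically starts a local web server with a chat UI;
# no data leaves the machine.
```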
What's included
8 videos, 19 readings, 4 assignments, 1 discussion prompt, 4 ungraded labs





