When you enroll in this course, you will also be enrolled in this Professional Certificate.
Learn new concepts from industry experts
Gain a foundational understanding of specific topics or tools
Develop job-relevant skills through hands-on projects
Earn a shareable career certificate from Coursera
There are 3 modules in this course
The Model Evaluation and Benchmarking course is designed for developers, engineers, and technical product builders who are new to Generative AI but already have intermediate machine learning knowledge, basic Python proficiency, and familiarity with development environments such as VS Code, and who want to engineer, customize, and deploy open generative AI solutions while avoiding vendor lock-in.
The course equips learners with the skills to assess and compare the performance of both text and image generative models. Starting with text evaluation, learners apply standard metrics such as perplexity, BLEU (Bilingual Evaluation Understudy), ROUGE (Recall-Oriented Understudy for Gisting Evaluation), and BERTScore, while also designing human evaluation protocols and task-specific methods for applications like summarization or translation. The course then explores image evaluation using technical metrics, including FID (Fréchet Inception Distance), CLIP similarity (Contrastive Language–Image Pretraining similarity), and SSIM (Structural Similarity Index Measure), alongside human perception-based assessment techniques and artifact detection systems. In the final module, learners design comprehensive benchmarking frameworks with reproducible testing environments, version control, and visualization dashboards for continuous monitoring. By the end, learners will be able to implement automated, domain-specific evaluation systems and deliver detailed performance reports that ensure generative models meet rigorous quality standards.
Learn how to evaluate text models using both automated metrics and human-centered methods. You’ll apply key measures like perplexity, BLEU (Bilingual Evaluation Understudy), ROUGE (Recall-Oriented Understudy for Gisting Evaluation), and BERTScore, and understand when each is most useful. You’ll also design human evaluation protocols and build automated pipelines, giving you a practical way to judge whether your fine-tuned models improve performance.
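As a rough illustration of the automated metrics named above, the sketch below computes perplexity from per-token log-probabilities and a simple unigram overlap score. The function names `perplexity` and `unigram_overlap` are hypothetical and not taken from the course materials. The overlap function is only the unigram building block behind BLEU-style precision and ROUGE-1-style recall; it is not a full BLEU or ROUGE implementation (no higher-order n-grams, no brevity penalty, whitespace tokenization only).

```python
import math
from collections import Counter


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def unigram_overlap(candidate, reference):
    """Return (precision, recall) over clipped unigram matches.

    Precision mirrors the unigram term used in BLEU; recall mirrors ROUGE-1.
    Assumption: lowercased whitespace tokenization, a single reference.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not cand or not ref:
        raise ValueError("candidate and reference must be non-empty")
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand & ref).values())
    return overlap / sum(cand.values()), overlap / sum(ref.values())


# Example: a model that assigns probability 0.25 to each of four tokens.
ppl = perplexity([math.log(0.25)] * 4)
precision, recall = unigram_overlap(
    "the cat sat on the mat", "the cat is on the mat"
)
```

Lower perplexity means the model found the text less surprising, while overlap scores reward shared wording with a reference. That difference is one reason a single metric is rarely enough on its own.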
What's included
4 videos, 2 readings, 1 assignment, 1 ungraded lab
4 videos • Total 26 minutes
Podcast: The Problems Text Metrics Were Built to Solve • 3 minutes
Your First Evaluation Pipeline with Hugging Face • 8 minutes
Advanced Evaluation: Human Feedback and Comprehensive Reporting • 5 minutes
Why Statistical Testing Matters • 10 minutes
2 readings • Total 34 minutes
Code Demonstration Transcripts • 4 minutes
Your Essential Toolkit: Metrics for Text Evaluation • 30 minutes
1 assignment • Total 30 minutes
Choosing the Best Metric for the Task • 30 minutes
1 ungraded lab • Total 60 minutes
Run Your First Text Model Evaluation • 60 minutes
Image Quality Assessment Methods
Module 2 • 2 hours to complete
Explore how to measure the quality of images produced by diffusion and other generative models. You’ll implement technical metrics like Fréchet Inception Distance (FID), Structural Similarity Index Measure (SSIM), and Contrastive Language–Image Pretraining (CLIP) similarity, and balance them with human perception-based checks for style, accuracy, and consistency. You’ll also automate artifact detection and quality control, equipping you with the skills to catch hidden flaws and ensure your image outputs meet professional standards.
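To make one of these technical metrics concrete, here is a minimal sketch of SSIM computed over a whole image as a single window. This is a simplification chosen for illustration: standard SSIM averages the same formula over local sliding windows, and library implementations differ in windowing details. The function name `global_ssim`, the flat-list image representation, and the 0-255 grayscale range are assumptions, not the course's code.

```python
def _mean(values):
    return sum(values) / len(values)


def _cov(xs, ys, mx, my):
    # Population covariance (divides by N), a simplification for this sketch.
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def global_ssim(img_x, img_y, data_range=255.0):
    """Single-window SSIM between two flattened grayscale images."""
    if len(img_x) != len(img_y) or not img_x:
        raise ValueError("images must be non-empty and the same size")
    # Stabilizing constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = _mean(img_x), _mean(img_y)
    var_x = _cov(img_x, img_x, mx, mx)
    var_y = _cov(img_y, img_y, my, my)
    cov_xy = _cov(img_x, img_y, mx, my)
    numerator = (2 * mx * my + c1) * (2 * cov_xy + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Example: compare an image with itself and with its reversed copy.
image = [0.0, 50.0, 100.0, 150.0, 200.0, 250.0]
same_score = global_ssim(image, image)
reversed_score = global_ssim(image, list(reversed(image)))
```

An image compared with itself scores 1 (up to floating-point rounding), and structurally different images score lower. FID and CLIP similarity, by contrast, depend on pretrained networks and are not sketched here.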
What's included
3 videos, 1 reading, 1 ungraded lab
3 videos • Total 23 minutes
Podcast: The Hidden Problems Image Metrics Reveal • 5 minutes
Evaluating & Automating Image Quality with TorchMetrics • 10 minutes
The Must-Know Metrics for Image Quality • 30 minutes
1 ungraded lab • Total 60 minutes
Run Your First Image Model Evaluation • 60 minutes
Creating Benchmarking Frameworks
Module 3 • 3 hours to complete
Learn how to design benchmarks that make model comparisons reliable and reproducible. You’ll create domain-specific evaluation datasets, build dashboards to visualize results, and automate reporting systems for continuous monitoring. These practices help you track improvements, catch performance issues early, and build trust in your work through transparent, repeatable evaluations.
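One way to make a model comparison both reliable and repeatable is to report an uncertainty estimate and fix the random seed. The sketch below uses a paired bootstrap over per-example scores, which is one common choice and not necessarily the method taught in the course. The function name `paired_bootstrap` and the example scores are hypothetical placeholders, not real results.

```python
import random
import statistics


def paired_bootstrap(scores_a, scores_b, n_resamples=2000, seed=0):
    """Mean score difference (A - B) with a 95% percentile bootstrap interval.

    Both score lists must cover the same evaluation examples in the same order.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and the same length")
    rng = random.Random(seed)  # fixed seed so reruns give the same interval
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Resample the per-example differences with replacement and sort the means.
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    low = means[int(0.025 * n_resamples)]
    high = means[int(0.975 * n_resamples) - 1]
    return statistics.fmean(diffs), (low, high)


# Placeholder per-example scores for two hypothetical models.
model_a = [0.80, 0.70, 0.90, 0.60, 0.85]
model_b = [0.70, 0.65, 0.80, 0.60, 0.70]
mean_diff, (ci_low, ci_high) = paired_bootstrap(model_a, model_b)
```

If the interval excludes zero, the difference is less likely to be noise from the particular evaluation set. Recording the seed and the number of resamples alongside the result lets someone else rerun the comparison and get the same interval.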
What's included
3 videos, 1 reading, 1 assignment, 1 ungraded lab
3 videos • Total 15 minutes
Podcast: The Value of Benchmarks in AI Workflows • 6 minutes
Turning Model Outputs into Meaningful Comparisons • 7 minutes
Podcast: Bringing It All Together: Benchmarking That Builds Trust • 2 minutes
1 reading • Total 15 minutes
How to Design Benchmarks That Matter • 15 minutes
1 assignment • Total 60 minutes
End-to-End Benchmarking Check • 60 minutes
1 ungraded lab • Total 60 minutes
Run a Mini-Benchmark • 60 minutes
Earn a career certificate.
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Coursera brings together a diverse network of subject matter experts who have demonstrated their expertise through professional industry experience or strong academic backgrounds. These instructors design and teach courses that make practical, career-relevant skills accessible to learners worldwide.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Certificate?
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.