This course addresses the challenge of applying machine learning (ML) to small datasets, a significant issue given ML's growing appetite for data. Despite ML's success across many fields, numerous domains cannot provide large labeled datasets because of cost, privacy, or security constraints. Even as big data becomes the norm, learning efficiently from smaller datasets remains crucial. This course, ideal for graduate students with some ML experience, focuses on modern deep learning techniques for small-data applications in healthcare, defense, and a range of industry sectors. Prerequisites include familiarity with ML and proficiency in Python. Deep learning experience is beneficial but not required.

Machine Learning with Small Data Part 1


8 assignments

There are 7 modules in this course
In this module, we will explore the pivotal role of data as the foundation for machine learning algorithms. We begin by discussing the significance of large datasets in training deep learning models as these datasets are crucial for the models’ successful application and effectiveness. We will also delve into the challenges associated with small datasets, particularly in sensitive fields such as healthcare and defense, where data acquisition is often difficult, costly, or subject to stringent privacy and security regulations. To address these challenges, the course will introduce various strategies for making the most of limited data, including data-efficient machine learning techniques and the use of synthetic data augmentation. Additionally, we will present the course structure and discuss a curated selection of research papers that align with and enrich our course topics.
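As a toy illustration of the synthetic data augmentation idea mentioned above (a hypothetical sketch, not course code): a small labeled set can be enlarged by adding noise-perturbed copies of each example, keeping the labels fixed.

```python
import random

random.seed(0)

def jitter(sample, sigma=0.05, copies=4):
    """Simple synthetic augmentation: return the original labeled
    example plus `copies` noise-perturbed variants of it."""
    x, y = sample
    return [(list(x), y)] + [
        ([v + random.gauss(0.0, sigma) for v in x], y) for _ in range(copies)
    ]

# Invented two-example "small dataset" for illustration.
small_set = [([0.2, 0.7], "benign"), ([0.9, 0.1], "malign")]
augmented = [s for sample in small_set for s in jitter(sample)]
print(len(augmented))  # 10: five variants per original example
```

Real augmentation pipelines use domain-appropriate transforms (flips, crops, simulation outputs), but the principle is the same: trade a little label noise for a larger effective training set.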
What's included
2 videos • 13 readings • 1 assignment
2 videos•Total 16 minutes
- Data Matters•8 minutes
- Setting Up Your Local Environment•8 minutes
13 readings•Total 81 minutes
- Course Overview•1 minute
- Syllabus - Machine Learning for Small Data•10 minutes
- Academic Integrity•1 minute
- Data Matters—Especially for Deep Learning•2 minutes
- Data-Parameters-Power Scaling in AI Models•5 minutes
- Exponential Growth of Training Data•10 minutes
- Exponential Growth of Model Complexity•5 minutes
- Exponential Growth in Computational Resources•5 minutes
- The Scale Paradox: When Smaller ML Models Outperform Giants•5 minutes
- Large Datasets for Deep Learning•10 minutes
- What is Small Data?•2 minutes
- Installing PyTorch•5 minutes
- Large vs. Small Datasets in Machine Learning•20 minutes
1 assignment•Total 10 minutes
- Module 1 Quiz•10 minutes
In this module, we will delve into the core aspects of machine learning with a focus on the importance of data, particularly in deep learning applications. We start by emphasizing how large datasets are essential for training deep learning models effectively, as they enable the models to capture and learn from complex patterns, improving their overall performance. Additionally, we'll explore the intersection of data availability, computational power, and model capacity, highlighting how these elements interact to refine model accuracy and efficiency. Furthermore, the module will cover computing advancements beyond Moore's Law and their impact on machine learning, illustrating how modern hardware like CPUs, GPUs, and TPUs enhance computational capabilities critical for training sophisticated models. We'll also delve into scaling laws in deep learning, discussing empirical findings that show how model performance improves predictably with increases in dataset size and model complexity, although with diminishing returns. To provide a deeper theoretical foundation, we'll examine the Vapnik-Chervonenkis (VC) theory, which offers insights into how learning curves and model complexity relate to a model’s ability to generalize from training data. This discussion will extend to practical applications and theoretical limitations, helping to frame machine learning challenges in terms of data sufficiency, model fitting, and the balance between bias and variance. By the end of this module, students will have a thorough understanding of the dynamic interplay between these factors and their implications for machine learning practice and research.
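The notion of shattering behind the VC dimension can be checked by brute force on toy hypothesis classes. A minimal sketch (illustrative only; the threshold class and helper names are our own): 1-D threshold classifiers shatter any single point but cannot shatter two, so their VC dimension is 1.

```python
from itertools import product

def shatters(points, hypotheses):
    """A hypothesis class shatters a point set if every possible
    labeling is realized by some hypothesis in the class."""
    for labeling in product([0, 1], repeat=len(points)):
        if not any(tuple(h(x) for x in points) == labeling for h in hypotheses):
            return False
    return True

def threshold_class(points):
    """Threshold classifiers h_t(x) = 1 iff x >= t; thresholds
    between and around the sample points cover all behaviors."""
    cuts = sorted(points)
    ts = [cuts[0] - 1] + [(a + b) / 2 for a, b in zip(cuts, cuts[1:])] + [cuts[-1] + 1]
    return [lambda x, t=t: int(x >= t) for t in ts]

one = [0.0]
two = [0.0, 1.0]
print(shatters(one, threshold_class(one)))   # True: one point gets either label
print(shatters(two, threshold_class(two)))   # False: labeling (1, 0) is unrealizable
```

The same brute-force check scales (in principle) to richer classes, which is how the shattering examples in the readings can be verified by hand.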
What's included
1 video • 19 readings • 2 assignments • 1 app item
1 video•Total 9 minutes
- Machine Learning Model Performance•9 minutes
19 readings•Total 144 minutes
- Ingredients Relationship•10 minutes
- Computing Power: Growth Beyond Moore’s Law•10 minutes
- Scaling Laws•5 minutes
- Learning Curves•15 minutes
- Model Capacity Required to Fit Data•3 minutes
- Model Performance and Dataset Size•2 minutes
- Model Performance and Model Capacity•2 minutes
- Bias-Variance Trade-Off•15 minutes
- From a Linear Algebra Perspective•2 minutes
- Underdetermined Problems and Overparameterized Models•8 minutes
- Revisiting Bias-Variance with Double Descent•8 minutes
- Comparison of Learning Paradigms•15 minutes
- A Learning Machine•2 minutes
- How Do We Characterize Model Complexity?•1 minute
- Vapnik–Chervonenkis (VC) Dimension - Shattering•10 minutes
- Notions of VC Dimension•10 minutes
- Examples of Shattering and VC Dimension•10 minutes
- VC Dimension in Neural Networks•15 minutes
- Resources•1 minute
2 assignments•Total 60 minutes
- Module 2 Quiz•30 minutes
- Calculating the VC Dimension of SVM Models•30 minutes
1 app item•Total 10 minutes
- Examples of Learning Machines•10 minutes
In this module, we’ll explore transfer learning and its role in data-efficient machine learning, where models leverage knowledge from previous tasks to improve performance on new, related tasks. We’ll cover the main types of transfer learning (transductive, inductive, and unsupervised), each addressing different challenges and applications, and discuss practical steps for implementing it, such as selecting and fine-tuning pre-trained models to reduce reliance on large datasets. We’ll also examine data-driven and physics-based simulations for data augmentation, highlighting their use in enhancing training under constrained conditions. Finally, we’ll review key papers on transfer learning techniques that address data scarcity and improve model performance.
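The "fine-tune a pre-trained model" recipe can be caricatured in a few lines: freeze a feature extractor learned elsewhere and train only a small head on the target task. This is a pure-Python sketch with invented data; the `backbone` here is a hand-fixed stand-in for a real pre-trained network.

```python
# Vanilla transfer learning in miniature (illustrative sketch only):
# `backbone` stands in for a frozen, pre-trained feature extractor,
# and only the linear head `w` is trained on the small target set.
def backbone(x):
    return [x[0] + x[1], x[0] - x[1], 1.0]   # last entry acts as a bias feature

# Tiny labeled target dataset (the OR function on 2-D inputs).
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

w = [0.0, 0.0, 0.0]                          # the only trainable parameters
lr = 0.5
for _ in range(50):                          # perceptron-style head training
    for x, y in data:
        f = backbone(x)
        pred = 1 if sum(wi * fi for wi, fi in zip(w, f)) > 0 else 0
        for i in range(3):
            w[i] += lr * (y - pred) * f[i]

preds = [1 if sum(wi * fi for wi, fi in zip(w, backbone(x))) > 0 else 0
         for x, _ in data]
print(preds)                                 # matches the labels: [0, 1, 1, 1]
```

In PyTorch the same pattern is freezing backbone parameters (disabling their gradients) and training a replacement classifier layer, which is what the module's practice activity works through.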
What's included
1 video • 15 readings • 1 assignment
1 video•Total 6 minutes
- Transfer Learning•6 minutes
15 readings•Total 72 minutes
- Data-efficient Machine Learning•10 minutes
- Leveraging Pre-trained Models for Efficient Machine Learning•2 minutes
- Vanilla Transfer Learning•2 minutes
- Types of Transfer Learning•2 minutes
- Transductive Transfer Learning Algorithms•10 minutes
- Inductive Transfer Learning Algorithms•10 minutes
- Transductive Examples I•5 minutes
- Transductive Examples II•5 minutes
- Transductive Examples III•5 minutes
- Inductive Examples•5 minutes
- Multi-Task Learning & Meta-Learning•5 minutes
- Synthetic Data Augmentation•2 minutes
- Data-Driven Simulation•3 minutes
- Physics-Based Simulation•2 minutes
- Physics-Based Simulation Examples•4 minutes
1 assignment•Total 15 minutes
- Module 3 Quiz•15 minutes
In this module, you'll explore the concept of domain adaptation, a key aspect of transductive transfer learning. Domain adaptation helps you train models that perform well on a target domain, even when its data distribution differs from the source domain. You'll learn about the challenges of domain shift and labeled data scarcity and how these can impact model performance. We'll cover different types of domain adaptation, including unsupervised, semi-supervised, and supervised approaches. You'll also dive into techniques like Deep Domain Confusion (DDC), which integrates domain confusion loss into neural networks to create domain-invariant features. Additionally, you'll discover advanced methods such as Domain-Adversarial Neural Networks (DANNs), Correlation Alignment (CORAL), and Deep Adaptation Networks (DANs) that build on DDC to enhance domain adaptation by aligning feature distributions and capturing complex dependencies across network layers.
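Deep Domain Confusion penalizes the discrepancy between source and target feature distributions, commonly measured with Maximum Mean Discrepancy (MMD). With a linear kernel, squared MMD reduces to the squared distance between the batch feature means; a minimal sketch (toy data and function names of our own):

```python
def mmd_linear(src, tgt):
    """Squared MMD with a linear kernel: the squared Euclidean distance
    between the mean feature vectors of source and target batches."""
    d = len(src[0])
    mu_s = [sum(x[i] for x in src) / len(src) for i in range(d)]
    mu_t = [sum(x[i] for x in tgt) / len(tgt) for i in range(d)]
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

# Identical feature distributions give zero discrepancy...
print(mmd_linear([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]))  # 0.0
# ...while a shifted target domain is penalized.
print(mmd_linear([[1.0, 0.0], [0.0, 1.0]], [[2.0, 1.0], [1.0, 2.0]]))  # 2.0
```

In DDC the total objective is the classification loss plus a weighted MMD term, so the learned features are pushed to be both discriminative and domain-invariant.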
What's included
1 video • 10 readings • 1 assignment
1 video•Total 6 minutes
- Domain Adaptation•6 minutes
10 readings•Total 143 minutes
- Domain Adaptation: Background•1 minute
- Unsupervised, Semi-Supervised & Supervised•10 minutes
- Deep Domain Confusion•8 minutes
- Related Work Based on DDC•2 minutes
- Deep Domain Confusion Architecture•10 minutes
- Implementation & Architecture•10 minutes
- Mathematical Formulation•5 minutes
- An Example Dataset: Office-31•2 minutes
- An Example DDC Experiment•5 minutes
- Transfer Learning Practice Activity•90 minutes
1 assignment•Total 10 minutes
- Module 4 Quiz•10 minutes
In this module, we’ll explore weak supervision, a technique for training machine learning models with limited, noisy, or imprecise labels. You'll learn about different types of weak supervision and why they are crucial in small data domains. We’ll cover techniques such as semi-supervised learning, self-supervised learning, and active learning, along with advanced methods such as Temporal Ensembling and the Mean Teacher approach. Additionally, you'll discover Bayesian deep learning and active learning strategies to improve training efficiency. Finally, you'll see real-world applications in fields like medical imaging, NLP, fraud detection, autonomous driving, and biology.
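The Mean Teacher method keeps a "teacher" copy of the model whose weights are an exponential moving average (EMA) of the student's weights. The update rule itself is one line; here is a toy sketch with weights as plain lists (illustrative only, not course code):

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean Teacher update: teacher weights track an exponential
    moving average of the student's weights after each step."""
    return [alpha * t + (1 - alpha) * s for t, s in zip(teacher, student)]

teacher = [0.0, 0.0]
student = [1.0, 2.0]          # pretend the student just finished a gradient step
for _ in range(300):          # the teacher drifts toward the (fixed) student
    teacher = ema_update(teacher, student)
print(teacher)                # approaches [1.0, 2.0]
```

In training, the student is also penalized for disagreeing with the teacher's predictions on unlabeled data, which is what lets unlabeled examples contribute to learning.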
What's included
1 video • 8 readings • 1 assignment
1 video•Total 7 minutes
- What is Weak Supervision?•7 minutes
8 readings•Total 54 minutes
- Types of Weak Supervision•6 minutes
- Semi-Supervised Learning•10 minutes
- Self-Supervised Learning•15 minutes
- Active Learning•6 minutes
- Applications of Weak Supervision•2 minutes
- Case Study: Medical Imaging•5 minutes
- Case Study: Autonomous Driving•5 minutes
- Case Study: Natural Language Processing•5 minutes
1 assignment•Total 30 minutes
- Module 5 Quiz•30 minutes
In this module, you'll explore how Zero-Shot Learning (ZSL) enables models to recognize new categories without having seen any examples of those categories during training. This is achieved by leveraging intermediate semantic descriptions, such as attributes, shared between seen and unseen classes. You'll also learn about the importance of regularization in preventing overfitting and improving generalization, as well as how generative models like GANs and VAEs enhance ZSL by synthesizing unseen class data. Additionally, we'll examine Generalized Zero-Shot Learning (GZSL), which tests models on both seen and unseen classes, making the task more challenging and realistic. By the end of this module, you'll have a solid understanding of how ZSL and its extensions can be applied to various machine learning tasks.
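The attribute-based idea behind ZSL can be sketched concretely: semantic descriptions exist for both seen and unseen classes, so an attribute predictor trained only on seen classes can still score unseen ones. All classes, attributes, and scores below are invented for illustration:

```python
# Assume an attribute predictor, trained on seen classes only, has
# estimated these scores (striped, four-legged, flies) for a test image.
predicted_attrs = [0.9, 0.8, 0.1]

# Semantic descriptions are available for BOTH seen and unseen classes.
class_attrs = {
    "horse": [0, 1, 0],   # seen during training
    "eagle": [0, 0, 1],   # seen during training
    "zebra": [1, 1, 0],   # unseen: no training images at all
}

def score(pred, attrs):
    # compatibility = dot product of predicted and class attributes
    return sum(p * a for p, a in zip(pred, attrs))

best = max(class_attrs, key=lambda c: score(predicted_attrs, class_attrs[c]))
print(best)  # "zebra": recognized without any zebra training images
```

Methods like ESZSL learn this compatibility function with regularization, and generative approaches go further by synthesizing features for the unseen classes.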
What's included
1 video • 9 readings • 1 assignment
1 video•Total 5 minutes
- Generalized Zero-Shot Learning•5 minutes
9 readings•Total 71 minutes
- Introduction to Zero-Shot Learning•3 minutes
- ZSL: Notation and Problem Setup•3 minutes
- Learning a Linear Predictor for Seen Classes•10 minutes
- Problem Extension for ZSL: From Seen to Unseen Classes•15 minutes
- An Embarrassingly Simple Approach to ZSL•10 minutes
- ZSL with Generative Models•10 minutes
- Generalized Zero-Shot Learning (GZSL)•10 minutes
- Zero-Shot Learning: Semantic Autoencoders•5 minutes
- Generalized ZSL With Generative Models•5 minutes
1 assignment•Total 30 minutes
- Module 6 Quiz•30 minutes
This module focuses on Few-Shot Learning (FSL), a critical paradigm in machine learning that enables models to classify new examples from only a small number of labeled instances. Unlike traditional deep learning models that require vast amounts of labeled data, FSL mimics the human ability to generalize from limited examples, making it highly useful for tasks like image classification, object detection, and natural language processing (NLP). The module introduces Matching Networks, a metric-based learning approach designed to solve one-shot learning problems by learning a similarity function that maps new examples to previously seen labeled instances. Students will gain an in-depth understanding of how nearest-neighbor approaches, differentiable embedding functions, and attention mechanisms help optimize few-shot learning models. Through discussions, theoretical formulations, and real-world applications, this module equips students with practical insights into how AI can function effectively in data-scarce environments.
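Matching Networks classify a query by softmax attention over its similarities to a labeled support set. A stripped-down sketch with hand-picked 2-D embeddings (the real method learns the embedding function end to end; the data here is invented):

```python
import math

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def matching_predict(query, support):
    """One-shot prediction in the spirit of Matching Networks:
    softmax attention over cosine similarities to the support set,
    then a similarity-weighted vote over the support labels."""
    sims = [cosine(query, x) for x, _ in support]
    m = max(sims)
    weights = [math.exp(s - m) for s in sims]   # numerically stable softmax
    z = sum(weights)
    votes = {}
    for (x, y), w in zip(support, weights):
        votes[y] = votes.get(y, 0.0) + w / z
    return max(votes, key=votes.get)

# Hypothetical embedded support set: one labeled example per class.
support = [([1.0, 0.1], "cat"), ([0.1, 1.0], "dog")]
print(matching_predict([0.9, 0.2], support))  # "cat"
```

Because the attention weights are differentiable, training can push the embedding so that queries attend to support examples of the correct class, which is the core idea the module develops.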
What's included
1 video • 7 readings • 1 assignment
1 video•Total 6 minutes
- Introduction to Few-Shot Learning•6 minutes
7 readings•Total 46 minutes
- What is Few-Shot Learning?•10 minutes
- Introduction to One-Shot Learning•2 minutes
- Matching Networks: An Approach to One-Shot Learning•10 minutes
- Training Matching Networks•3 minutes
- Improving Few-Shot Visual Classification•10 minutes
- Enhancing Few-Shot Image Classification With Unlabeled Examples•10 minutes
- Congratulations•1 minute
1 assignment•Total 30 minutes
- Module 7 Quiz•30 minutes
Offered by

Founded in 1898, Northeastern is a global research university with a distinctive, experience-driven approach to education and discovery. The university is a leader in experiential learning, powered by the world’s most far-reaching cooperative education program. The spirit of collaboration guides a use-inspired research enterprise focused on solving global challenges in health, security, and sustainability.