Introduction to Deep Learning

This course is part of the Machine Learning: Theory and Hands-on Practice with Python Specialization

Instructor: Daniel E. Acuna

5 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

2 weeks to complete at 10 hours a week

Flexible schedule

Learn at your own pace

What you'll learn

Explain the mathematical foundations of neural networks and how they learn from data.
Train and regularize deep neural networks for effective generalization.
Design and apply specialized neural network architectures for images and sequences.
Apply transformer-based and multimodal models to real-world scenarios.

Skills you'll gain

Network Model
PyTorch (Machine Learning Library)
Natural Language Processing
Network Architecture
Recurrent Neural Networks (RNNs)
Artificial Intelligence and Machine Learning (AI/ML)
Keras (Neural Network Library)
Large Language Modeling
Vision Transformer (ViT)
Embeddings

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

6 assignments

Taught in English

Build your subject-matter expertise

This course is part of the Machine Learning: Theory and Hands-on Practice with Python Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There are 5 modules in this course

Introduction to Deep Learning provides a rigorous, concept-driven introduction to the models that power modern AI systems—from image recognition to large language models. You’ll build neural networks from first principles, understanding how forward passes, loss functions, and backpropagation enable learning. As the course progresses, you’ll train and regularize deep models, design convolutional networks for vision, model sequences with RNNs, LSTMs, and attention, and apply transformer-based architectures such as BERT, GPT, and Vision Transformers. You will also look at the latest trends in contrastive learning and CLIP. By combining mathematical foundations with practical application, this course equips you to understand, train, and use deep learning models with confidence.

This course can be taken for academic credit as part of CU Boulder’s Master of Science in Computer Science (MS-CS), Master of Science in Artificial Intelligence (MS-AI), and Master of Science in Data Science (MS-DS) degrees offered on the Coursera platform. These fully accredited graduate degrees offer targeted courses, short 8-week sessions, and pay-as-you-go tuition. Admission is based on performance in three preliminary courses, not academic history. CU degrees on Coursera are ideal for recent graduates or working professionals.

Learn more:

MS in Artificial Intelligence: https://www.coursera.org/degrees/ms-artificial-intelligence-boulder
MS in Computer Science: https://coursera.org/degrees/ms-computer-science-boulder
MS in Data Science: https://www.coursera.org/degrees/master-of-science-data-science-boulder

Welcome to Introduction to Deep Learning. This module builds the mathematical foundations of neural networks. Starting from linear models, you will learn about the artificial neuron and develop the mathematics of gradient descent and backpropagation. The focus is on understanding how and why neural networks work through the underlying math—covering the forward pass, loss functions, and the chain rule to show how information flows through networks and how they learn from data.
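
As a rough illustration of the ideas this module names (forward pass, loss function, chain rule, gradient descent), the sketch below trains a single sigmoid neuron in plain Python. It is not taken from the course materials. The toy OR dataset, the squared-error loss, the learning rate, and all function names are assumptions chosen for brevity.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x):
    """Forward pass of one neuron: weighted sum, then activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


def train_step(w, b, x, y, lr):
    """One gradient-descent update on the squared error 0.5 * (a - y)**2."""
    a = forward(w, b, x)
    # Chain rule: dL/dz = dL/da * da/dz = (a - y) * a * (1 - a)
    dz = (a - y) * a * (1.0 - a)
    # dz/dw_i = x_i and dz/db = 1, so the parameter gradients follow directly.
    new_w = [wi - lr * dz * xi for wi, xi in zip(w, x)]
    new_b = b - lr * dz
    return new_w, new_b


def mean_loss(w, b, data):
    """Average squared-error loss over the dataset."""
    return sum(0.5 * (forward(w, b, x) - y) ** 2 for x, y in data) / len(data)


# Hypothetical toy data (an assumption): the logical OR of two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w, b = [0.0, 0.0], 0.0

loss_before = mean_loss(w, b, data)
for _ in range(2000):
    for x, y in data:
        w, b = train_step(w, b, x, y, lr=0.5)
loss_after = mean_loss(w, b, data)
```

The loss should fall as training proceeds, because each update moves the weights a small step against the gradient computed by the chain rule.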

What's included

15 videos, 5 readings, 2 assignments, 1 programming assignment

15 videos Total 104 minutes

Machine Learning Introduction 2 minutes
Deep Learning Introduction 2 minutes
Academic Integrity and AI Use Policy for the Machine Learning Specialization 9 minutes
From Linear Regression to the Artificial Neuron 9 minutes
Activation Functions and Non-Linearity: The Mathematical Notation and Problem Setup 5 minutes
Activation Functions and Non-Linearity: Why Non-Linearity is Important 6 minutes
Activation Functions and Non-Linearity: Sigmoid Activation and its Gradient 10 minutes
Activation Functions and Non-Linearity: Rectified Linear Unit Activation and its Gradient 4 minutes
Activation Functions and Non-Linearity: Other Activations and How to Choose Among Them 4 minutes
Layers, Depth, and Forward Propagation 10 minutes
Matrix Notation and Dimensions 9 minutes
Loss Functions: MSE and Cross-Entropy 7 minutes
Gradient Descent: The Math of Optimization 8 minutes
The Chain Rule and Backpropagation 9 minutes
Backpropagation Through a Network 10 minutes

5 readings Total 95 minutes

Earn Academic Credit for Your Work! 10 minutes
Course Support 10 minutes
Assessment Expectations 5 minutes
Download the Recommended Reading for This Course 10 minutes
From Linear Models to Neural Networks - Recommended Reading 60 minutes

2 assignments Total 35 minutes

AI Policy Quiz 5 minutes
Neural Network Foundations 30 minutes

1 programming assignment Total 60 minutes

Lab 1: Building and Training Your First Neural Network in Keras 60 minutes

This module focuses on training neural networks effectively. Topics include optimization algorithms, hyperparameter tuning, and regularization techniques to prevent overfitting and achieve good generalization. You will compare different optimizers like SGD, momentum, and Adam, understand how learning rate and batch size affect training dynamics, and apply weight decay, dropout, early stopping, and batch normalization.
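
To make the optimizer comparison concrete, here is a minimal sketch of the standard momentum and Adam update rules applied to a one-dimensional toy loss. The loss f(w) = (w - 3)**2, the hyperparameter values, and the function names are assumptions for illustration, not the course's own code.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2, which is minimized at w = 3."""
    return 2.0 * (w - 3.0)


def sgd_momentum(w, steps, lr=0.1, beta=0.9):
    """Gradient descent with a velocity term that accumulates past gradients."""
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)
        w = w - lr * v
    return w


def adam(w, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: step size scaled by running estimates of the gradient moments."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first moment (running mean)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (running mean of squares)
        m_hat = m / (1 - beta1 ** t)           # bias correction for zero initialization
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


w_momentum = sgd_momentum(0.0, steps=300)
w_adam = adam(0.0, steps=1000)
```

Both runs start at w = 0 and should end close to the minimizer w = 3. On a real network the same rules are applied to every parameter, with gradients estimated from mini-batches.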

What's included

7 videos, 2 readings, 1 assignment, 1 programming assignment

7 videos Total 43 minutes

SGD, Momentum, and Adam 8 minutes
Learning Rate and Batch Size 7 minutes
Epochs and Monitoring Training 6 minutes
Understanding Overfitting 6 minutes
L2 Regularization (Weight Decay) 6 minutes
Dropout 5 minutes
Early Stopping and Batch Normalization 7 minutes

2 readings Total 90 minutes

Optimization Algorithms - Recommended Reading 60 minutes
Regularization Techniques - Recommended Reading 30 minutes

1 assignment Total 30 minutes

Training and Regularizing Neural Networks 30 minutes

1 programming assignment Total 60 minutes

Lab 2: Applying Regularization to Improve Model Generalization 60 minutes

This module introduces you to convolutional neural networks (CNNs), the foundation of modern computer vision. Topics include how convolutional and pooling layers work, CNN architecture design, and practical techniques like data augmentation and transfer learning. The module covers classic architectures like VGG and ResNet and explains why CNNs outperform fully-connected networks on image data.
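
The convolution and pooling operations mentioned above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the 5x5 image, the 2x2 edge-detecting kernel, and the function names are hypothetical, and the code uses no padding and a stride of 1.

```python
def conv2d(image, kernel):
    """Valid cross-correlation, the operation deep learning libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; trailing rows or columns that do not fit are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5x5 image: dark left columns (0), bright right columns (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Vertical-edge detector: responds where brightness increases from left to right.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, kernel)   # 4x4, every row is [0, 2, 0, 0]
pooled = max_pool2d(feature_map)      # 2x2, equals [[2, 0], [2, 0]]
```

The same four kernel weights are reused at every image position. This weight sharing is one reason a convolutional layer needs far fewer parameters than a fully-connected layer on the same input.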

What's included

7 videos, 2 readings, 1 assignment, 1 programming assignment

7 videos Total 51 minutes

Why CNNs for Images? 8 minutes
The Convolution Operation 8 minutes
Pooling Layers 5 minutes
CNN Architecture: Conv → Pool → Dense 7 minutes
VGG, ResNet, and Skip Connections 9 minutes
Data Augmentation 6 minutes
Transfer Learning 7 minutes

2 readings Total 75 minutes

Introduction to CNNs - Recommended Reading 45 minutes
Training CNNs in Practice - Recommended Reading 30 minutes

1 assignment Total 30 minutes

Convolutional Neural Networks for Image Recognition 30 minutes

1 programming assignment Total 60 minutes

Lab 3: Training a CNN for Image Classification with Augmentation 60 minutes

This module covers sequence modeling, starting with recurrent neural networks (RNNs) and long short-term memory networks (LSTMs), then progressing to the attention mechanism—the key innovation that led to transformers. Topics include how RNNs maintain hidden states across time steps, why the vanishing gradient problem motivated LSTMs, and how attention allows models to focus on relevant parts of their input.
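
As one way to picture how attention lets a model focus on relevant parts of its input, the sketch below implements scaled dot-product attention in plain Python. The toy queries, keys, and values and the function names are assumptions for illustration. A real model would learn these vectors as projections of its inputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, weights)."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy input: three positions with 2-dimensional keys and values.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
queries = [[4.0, 0.0]]

outputs, weights = attention(queries, keys, values)
```

The query points along the first axis, so the first and third keys receive equal, large weights and the second key receives very little. Each row of weights sums to 1.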

What's included

7 videos, 1 reading, 1 assignment, 1 programming assignment

7 videos Total 48 minutes

Sequential Data and RNN Architecture 7 minutes
Hidden State and Backprop Through Time 7 minutes
LSTM Architecture 7 minutes
The Attention Mechanism 8 minutes
Self-Attention and QKV 8 minutes
Multi-Head Attention 4 minutes
Positional Encoding 6 minutes

1 reading Total 60 minutes

Recurrent Neural Networks - Recommended Reading 60 minutes

1 assignment Total 30 minutes

Sequence Modeling – RNNs, LSTMs, and the Attention Mechanism 30 minutes

1 programming assignment Total 60 minutes

Lab 4: Building a Sequence Model with Attention 60 minutes

This final module covers the transformer architecture, which has revolutionized deep learning across domains. Topics include BERT and GPT as encoder-only and decoder-only variants, Vision Transformers (ViT) that apply attention to images, and CLIP for multimodal learning connecting vision and language. The module emphasizes applying pre-trained models to real tasks.
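
One practical difference between the encoder-only and decoder-only variants is the attention mask. BERT-style encoders let every token attend to the whole sequence, while GPT-style decoders apply a causal mask so that each token sees only itself and earlier tokens. The sketch below illustrates just that masking step in plain Python. The uniform score matrix and the function name are assumptions, and nothing here reproduces either model's actual implementation.

```python
import math


def masked_attention_weights(scores, causal):
    """Row-wise softmax over a square score matrix, optionally with a causal mask."""
    n = len(scores)
    weights = []
    for i in range(n):
        # With a causal mask, position i may only attend to positions 0..i.
        allowed = [j for j in range(n) if not causal or j <= i]
        m = max(scores[i][j] for j in allowed)
        exps = {j: math.exp(scores[i][j] - m) for j in allowed}
        total = sum(exps.values())
        weights.append([exps[j] / total if j in exps else 0.0 for j in range(n)])
    return weights


# Hypothetical uniform scores for a 3-token sequence.
scores = [[0.0] * 3 for _ in range(3)]

bidirectional_w = masked_attention_weights(scores, causal=False)  # encoder-style
causal_w = masked_attention_weights(scores, causal=True)          # decoder-style
```

With uniform scores, the bidirectional weights are 1/3 everywhere, while the causal rows are [1, 0, 0], [0.5, 0.5, 0], and [1/3, 1/3, 1/3]. Masked positions get exactly zero weight.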