Deep Learning for AI Part 1

Instructor: Xuemin Jin

7 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

Some related experience required

2 weeks to complete

at 10 hours a week

Flexible schedule

Learn at your own pace

Details to know

Shareable certificate

Add to your LinkedIn profile

There are 7 modules in this course

This is Part 1 of a two-part graduate sequence in deep learning. It establishes the foundations of modern deep learning and the core neural architectures behind today's AI systems. You will build from how neural networks learn—through forward propagation and backpropagation—to convolutional networks for computer vision, recurrent networks for sequence data, and the first generative architectures: variational autoencoders, generative adversarial networks, and Transformers. The course emphasizes both conceptual understanding and hands-on implementation in TensorFlow/Keras and PyTorch. Part 2 continues with advanced generative modeling.

Deep learning has transformed artificial intelligence by enabling models to learn hierarchical representations directly from raw data—dramatically outperforming traditional hand-engineered approaches across vision, language, and scientific domains. You will build the conceptual and practical vocabulary the entire course depends on: how neural networks are constructed, how training proceeds through forward and backward passes, and why deep learning is particularly suited to unstructured, high-dimensional data.
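To make the forward and backward passes concrete, here is a minimal sketch in plain Python of a single sigmoid neuron trained with binary cross-entropy. It is not taken from the course materials: the function names, the toy input, and the learning rate are all illustrative assumptions, and real coursework would use TensorFlow/Keras or PyTorch rather than hand-written gradients.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """Forward pass: weighted sum of inputs plus bias, then a sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def bce_loss(y_hat, y):
    """Binary cross-entropy loss for a single example."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def backward(w, b, x, y):
    """Backward pass via the chain rule.

    For a sigmoid output with cross-entropy loss, dL/dz simplifies to
    (y_hat - y), so dL/dw_i = (y_hat - y) * x_i and dL/db = (y_hat - y).
    """
    dz = forward(w, b, x) - y
    grad_w = [dz * xi for xi in x]
    grad_b = dz
    return grad_w, grad_b

def sgd_step(w, b, grad_w, grad_b, lr=0.1):
    """One step of plain gradient descent (lr is an assumed value)."""
    new_w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
    new_b = b - lr * grad_b
    return new_w, new_b

# Toy example (hypothetical data): one input with two features, label 1.
w, b = [0.0, 0.0], 0.0
x, y = [1.0, 2.0], 1.0

loss_before = bce_loss(forward(w, b, x), y)
grad_w, grad_b = backward(w, b, x, y)
w, b = sgd_step(w, b, grad_w, grad_b)
loss_after = bce_loss(forward(w, b, x), y)

print(loss_before, loss_after)  # the loss drops after one update
```

A deep network repeats the same pattern layer by layer: the forward pass caches intermediate values, and the backward pass multiplies local derivatives from the loss back to each weight.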

What's included

2 videos, 15 readings, 3 assignments

2 videos (total 3 minutes)

Why Deep Learning? Modern AI Applications (2 minutes)
Neural Networks (2 minutes)

15 readings (total 155 minutes)

Course Introduction (2 minutes)
Syllabus - Deep Learning for AI Part 1 (10 minutes)
Meet Your Faculty (1 minute)
Academic Integrity (1 minute)
Deep Learning Overview and Motivation (15 minutes)
Real-World Applications Across Vision, Language, and Science (10 minutes)
Neurons, Layers, and the Network Structure (5 minutes)
Weights, Biases, and Learned Parameters (10 minutes)
The Forward Pass: Computing Predictions (1 minute)
Loss Functions for Classification and Reconstruction (10 minutes)
Backpropagation and the Chain Rule (30 minutes)
Optimization Algorithms: SGD, Momentum, AdaGrad, and Adam (30 minutes)
Choosing a Framework: TensorFlow vs. PyTorch (10 minutes)
Tensor Fundamentals: Scalars Through 3D+ Tensors and Tensor Attributes (10 minutes)
Discriminative vs. Generative Models: A Course Preview (10 minutes)

3 assignments (total 90 minutes)

Assess Your Learning: Why Deep Learning and Neural Network Architecture (30 minutes)
Assess Your Learning: Forward Propagation and Backpropagation (30 minutes)
Assess Your Learning: Frameworks, Tensors, and Discriminative vs. Generative Models (30 minutes)

Convolutional Neural Networks are the architectural backbone of modern computer vision and a component you will encounter repeatedly throughout this course—inside autoencoders, GANs, and diffusion model U-Nets. You will develop the ability to read, design, and reason about CNN architectures from filter-level convolution operations through landmark designs like VGG and ResNet, and learn how pretrained models can be adapted to new tasks through transfer learning.
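As a rough illustration of a filter-level convolution operation, the sketch below slides a small kernel over a 2D image in plain Python, with optional zero padding and stride. The function name `conv2d`, the toy image, and the edge-detecting kernel are assumptions made for this example, not course code. Like most deep learning frameworks, it computes cross-correlation (the kernel is not flipped).

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Slide `kernel` over `image` and return the resulting feature map.

    Output size per dimension: floor((n + 2 * padding - k) / stride) + 1,
    where n is the input size and k is the kernel size.
    """
    if padding > 0:
        width = len(image[0]) + 2 * padding
        zero_rows = [[0.0] * width for _ in range(padding)]
        padded = [[0.0] * padding + list(row) + [0.0] * padding for row in image]
        image = zero_rows + padded + [row[:] for row in zero_rows]

    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1

    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Elementwise product of the kernel with the current window.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map

# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A simple vertical-edge filter (illustrative choice).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, kernel)            # 2x2 output, no padding
same_size = conv2d(image, kernel, padding=1)   # 4x4 output with zero padding
downsampled = conv2d(image, kernel, stride=2, padding=1)  # 2x2 output
```

With no padding, every window straddles the edge, so each entry of `feature_map` is 3.0. In a CNN the kernel values are learned rather than hand-picked, and many kernels run in parallel to produce multiple feature maps.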

What's included

1 video, 9 readings, 3 assignments

1 video (total 3 minutes)

Batch Normalization, Dropout, and Activation Functions (3 minutes)

9 readings (total 105 minutes)

Why Convolutional Neural Networks? (10 minutes)
Convolution Mechanics and Filter Visualization (15 minutes)
Padding Modes and Stride (5 minutes)
Max Pooling and Average Pooling (10 minutes)
Strided Convolution and Feature Map Interpretation (5 minutes)
Batch Normalization and Internal Covariate Shift (10 minutes)
Dropout Ratios and Activation Function Choices (10 minutes)
CNN Layer Flow and Architecture Patterns (10 minutes)
A Simple CNN Example: MNIST and CIFAR-10 (30 minutes)

3 assignments (total 90 minutes)

Assess Your Learning: Why CNNs, Convolution, and Pooling (30 minutes)
Assess Your Learning: Batch Normalization, Dropout, and CNN Layer Flow (30 minutes)
Assess Your Learning: CNN Worked Example (30 minutes)

Computer vision is the field that enables machines to perceive and interpret visual information—the domain where deep learning first achieved superhuman performance. You will survey its core tasks, from image classification and object detection to semantic segmentation, then work through the full detection pipeline from the R-CNN family to YOLOv8, gaining enough architectural depth to understand how these systems are extended and fine-tuned for new domains.

What's included

10 readings, 3 assignments

10 readings (total 132 minutes)

What Is Computer Vision? Goals, Scope, and Task Taxonomy (10 minutes)
Image and Video Data Types and Applications (30 minutes)
R-CNN and the Region Proposal Approach (10 minutes)
Fast R-CNN, Faster R-CNN, and Two-Stage Detection (30 minutes)
The YOLO Concept and Architecture Overview (10 minutes)
YOLOv8: Backbone, FPN Neck, and Detection Head (10 minutes)
YOLOv8: Loss Function and Non-Maximum Suppression (10 minutes)
Data Preparation and Training Walkthrough (2 minutes)
Feature Extraction vs. Fine-Tuning for Vision (10 minutes)
The Keras Pretrained Model API for Vision (10 minutes)

3 assignments (total 90 minutes)

Assess Your Learning: Computer Vision Tasks and R-CNN (30 minutes)
Assess Your Learning: YOLOv8 Architecture and Training (30 minutes)
Assess Your Learning: Transfer Learning for Vision (30 minutes)

The models you studied in earlier modules treat inputs as fixed-size, spatially arranged structures. Many real-world problems involve sequences where order matters and context accumulates over time: text, speech, time-series data, financial signals. You will learn how RNNs process sequences through a hidden state, how LSTMs and GRUs address the vanishing gradient problem, and why these architectures—and their failure modes—directly motivated the attention mechanism covered in the Transformer module.
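The hidden-state idea can be sketched in a few lines of plain Python. The example below assumes the standard vanilla RNN update, h_t = tanh(W_x x_t + W_h h_{t-1} + b); the weight values and the two toy sequences are made up for illustration and do not come from the course.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One time step: combine the current input with the previous hidden state."""
    h_new = []
    for i in range(len(h_prev)):
        pre = b[i]
        pre += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        pre += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(pre))
    return h_new

def rnn_forward(sequence, W_x, W_h, b):
    """Unroll the RNN over a sequence, starting from a zero hidden state."""
    h = [0.0] * len(b)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states

# Hypothetical parameters: 1 input feature, 2 hidden units.
W_x = [[0.5], [-0.5]]
W_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

# Same values, different order: the signal arrives first vs. last.
seq_a = [[1.0], [0.0], [0.0]]
seq_b = [[0.0], [0.0], [1.0]]

final_a = rnn_forward(seq_a, W_x, W_h, b)[-1]
final_b = rnn_forward(seq_b, W_x, W_h, b)[-1]
```

The two final states differ, which shows that order matters. With these small recurrent weights, the early input in `seq_a` has almost faded by the last step, a simple forward-pass analogue of the long-range memory problem that LSTMs and GRUs are designed to ease.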

What's included

12 readings, 3 assignments

12 readings (total 64 minutes)

Why Recurrent Networks? Sequence Modeling Applications (5 minutes)
RNN vs. CNN: Handling Temporal Data (10 minutes)
The Hidden State Update and RNN Unrolling (10 minutes)
Backprop Through Time and Vanishing/Exploding Gradients (5 minutes)
LSTM Architecture: Forget, Input, and Output Gates (10 minutes)
The Cell State and Long-Range Memory in LSTMs (2 minutes)
GRU Architecture: Reset and Update Gates (1 minute)
GRU vs. LSTM: Trade-offs and Selection Criteria (1 minute)
The IMDB Dataset and One-Hot Encoding (5 minutes)
Word Embeddings and Embedding Layers (5 minutes)
Building the LSTM Model for Sentiment Analysis (5 minutes)
Training, Evaluation, and Results (5 minutes)

3 assignments (total 90 minutes)

Assess Your Learning: Introduction to RNNs and Backprop Through Time (30 minutes)
Assess Your Learning: LSTM and GRU (30 minutes)
Assess Your Learning: Text Data Handling and LSTM Sentiment Classification (30 minutes)

This module marks the course's inflection point: the shift from discriminative models that learn decision boundaries to generative models that learn to synthesize new data. You will survey the full generative landscape—VAEs, GANs, autoregressive models, normalizing flows, diffusion models, and energy-based models—before diving into the autoencoder and its probabilistic extension, the Variational Autoencoder.

What's included

1 video, 14 readings, 4 assignments

1 video (total 3 minutes)

Transposed Convolution (3 minutes)

14 readings (total 85 minutes)

Generative vs. Discriminative Models (2 minutes)
Challenges in Generative Modeling and a Toy Generative Model (2 minutes)
Representation Learning and Probability Theory Review (2 minutes)
Generative Model Taxonomy: VAEs, GANs, Flows, Diffusion, EBMs (5 minutes)
Autoencoder Motivation and Architecture Overview (2 minutes)
Building the Encoder and Decoder (10 minutes)
Transposed Convolution for Decoding (10 minutes)
The Probabilistic Extension: From AE to VAE (5 minutes)
The Reparameterization Trick (10 minutes)
The ELBO Loss Function (1 minute)
KL Divergence and Regularizing the Latent Space (2 minutes)
VAE vs. Autoencoder: Key Differences (2 minutes)
Face Generation and Latent Space Arithmetic (30 minutes)
Interpolating and Morphing in Latent Space (2 minutes)

4 assignments (total 120 minutes)

Assess Your Learning: Introduction to Generative Modeling and Representation Learning (30 minutes)
Assess Your Learning: Autoencoders and Latent Space Exploration (30 minutes)
Assess Your Learning: VAE Probabilistic Framework and Loss (30 minutes)
Assess Your Learning: VAE vs. Autoencoder and Worked Example (30 minutes)

Generative Adversarial Networks take a fundamentally different approach to generative modeling: rather than maximizing a likelihood objective, two networks train in competition. You will work through the full GAN toolkit—from Deep Convolutional GANs and training stabilization techniques to Wasserstein distance, gradient penalty, conditional generation, and cycle-consistent domain translation.

What's included

10 readings, 3 assignments

10 readings (total 52 minutes)

The Adversarial Framework: Generator and Discriminator (5 minutes)
GAN Types, Applications, and Ethical Considerations (5 minutes)
DCGAN Architecture and Design Principles (10 minutes)
DCGAN Training: Fashion MNIST and Lego Bricks Examples (5 minutes)
GAN Training Instability and Mode Collapse (5 minutes)
Stabilization Techniques: Normalization, Learning Rate, Label Smoothing (2 minutes)
Wasserstein Distance and the WGAN Objective (5 minutes)
Gradient Penalty and WGAN-GP Training Results (5 minutes)
Conditional GAN Architecture and Class Conditioning (5 minutes)
CycleGAN and Unpaired Domain Translation (5 minutes)

3 assignments (total 90 minutes)

Assess Your Learning: What Are GANs and Deep Convolutional GANs (30 minutes)
Assess Your Learning: GAN Training Tips and WGAN-GP (30 minutes)
Assess Your Learning: Conditional GANs and CycleGAN (30 minutes)

Introduced in "Attention Is All You Need" (Vaswani et al., 2017), the Transformer is arguably the most consequential architectural development in deep learning since the CNN. You will derive the attention mechanism from first principles—Query, Key, Value, scaled dot-product, multi-head attention—assemble the full architecture with positional encoding and causal masking, and see it applied in a GPT-style language model.
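For orientation, scaled dot-product attention computes softmax(Q K^T / sqrt(d_k)) V, and causal masking prevents each position from attending to later ones. The plain-Python sketch below follows that formula from the Vaswani et al. paper; the function names and the toy Q, K, V values are illustrative assumptions, and a real model would use learned projections and batched tensor operations.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V, causal=False):
    """Return (outputs, weights) for attention over lists of vectors.

    Each query is scored against every key, scores are divided by
    sqrt(d_k), and the softmax weights mix the value vectors.
    With causal=True, position i may only attend to positions j <= i.
    """
    d_k = len(K[0])
    outputs, all_weights = [], []
    for i, q in enumerate(Q):
        scores = []
        for j, k in enumerate(K):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: weight becomes 0
            else:
                dot = sum(qa * ka for qa, ka in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        weights = softmax(scores)
        out = [
            sum(w * v[c] for w, v in zip(weights, V))
            for c in range(len(V[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy self-attention: three tokens, two dimensions, Q = K = V.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens, causal=True)
```

Under the causal mask the first token can only attend to itself, so its attention weights are `[1.0, 0.0, 0.0]` and its output equals its own value vector. Multi-head attention runs several such computations in parallel on different learned projections and concatenates the results.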

What's included

1 video, 11 readings, 3 assignments

1 video (total 2 minutes)

The Attention Mechanism (2 minutes)

11 readings (total 102 minutes)

Why Transformers? Advantages Over RNNs and CNNs (5 minutes)
GPT Overview and Key Transformer Applications (2 minutes)
What Is Attention? The Concept and Intuition (5 minutes)
Attention Continued (5 minutes)
Self-Attention and Network Parameters (10 minutes)
Multi-Head Attention and Parallel Representation Learning (10 minutes)
Positional Encoding: Sinusoidal and Learned (10 minutes)
Causal Masking for Autoregressive Generation (10 minutes)
Building a GPT-Style Language Model (10 minutes)
Training, Generating, and Evaluating Text (30 minutes)
Congratulations! (5 minutes)

3 assignments (total 90 minutes)

Assess Your Learning: What Is a Transformer and the Attention Mechanism (30 minutes)
Assess Your Learning: Multi-Head Attention and Positional Encoding (30 minutes)
Assess Your Learning: GPT-Style Application (30 minutes)

Instructor

Xuemin Jin

Northeastern University

8 courses, 1,273 learners

Offered by

Northeastern University

Frequently asked questions

To access course materials, assignments, and earn a Certificate, you'll need to purchase the Certificate experience when you enroll in a course. Eligible learners may also have the option to start with a Free Trial. Some courses may also offer a Full Course, No Certificate option. This lets you access course materials, submit required assessments, and receive a final grade, but you won't be able to earn or purchase a Certificate.

When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your selected learning program, you’ll find a link to apply on the description page.