This is Part 1 of a two-part graduate sequence in deep learning. It establishes the foundations of modern deep learning and the core neural architectures behind today's AI systems. You will build from how neural networks learn—through forward propagation and backpropagation—to convolutional networks for computer vision, recurrent networks for sequence data, and the first generative architectures: variational autoencoders, generative adversarial networks, and Transformers. The course emphasizes both conceptual understanding and hands-on implementation in TensorFlow/Keras and PyTorch. Part 2 continues with advanced generative modeling.

Deep Learning for AI Part 1

Details to know

June 2026
22 assignments

There are 7 modules in this course
Deep learning has transformed artificial intelligence by enabling models to learn hierarchical representations directly from raw data—dramatically outperforming traditional hand-engineered approaches across vision, language, and scientific domains. You will build the conceptual and practical vocabulary the entire course depends on: how neural networks are constructed, how training proceeds through forward and backward passes, and why deep learning is particularly suited to unstructured, high-dimensional data.
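As a rough illustration of the forward and backward passes described above, here is a minimal sketch for a single sigmoid neuron trained with binary cross-entropy. It is not taken from the course materials: the function names, the toy input, and the learning rate are all assumptions, and plain Python is used instead of TensorFlow or PyTorch to keep the example dependency-free.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """Forward pass: weighted sum of inputs followed by a sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def bce(p, y):
    """Binary cross-entropy loss for one example."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def backward(w, b, x, y):
    """Backward pass via the chain rule.

    For a sigmoid output with cross-entropy loss, dL/dz simplifies to
    (prediction - target); the weight gradients follow from dz/dw_i = x_i.
    """
    dz = forward(w, b, x) - y
    return [dz * xi for xi in x], dz

def train_step(w, b, x, y, lr=0.5):
    """One gradient-descent update (lr is an illustrative value)."""
    grad_w, grad_b = backward(w, b, x, y)
    return [wi - lr * gi for wi, gi in zip(w, grad_w)], b - lr * grad_b

# Toy data (hypothetical): one input vector with target label 1.
w, b = [0.0, 0.0], 0.0
x, y = [1.0, 2.0], 1.0
loss_before = bce(forward(w, b, x), y)
for _ in range(20):
    w, b = train_step(w, b, x, y)
loss_after = bce(forward(w, b, x), y)
print(loss_before, loss_after)
```

Repeating the forward pass, the gradient computation, and the parameter update is the training loop in miniature; frameworks automate the gradient step through automatic differentiation.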
What's included
2 videos, 15 readings, 3 assignments
2 videos•Total 3 minutes
- Why Deep Learning? Modern AI Applications•2 minutes
- Neural Networks•2 minutes
15 readings•Total 155 minutes
- Course Introduction•2 minutes
- Syllabus - Deep Learning for AI Part 1•10 minutes
- Meet Your Faculty•1 minute
- Academic Integrity•1 minute
- Deep Learning Overview and Motivation•15 minutes
- Real-World Applications Across Vision, Language, and Science•10 minutes
- Neurons, Layers, and the Network Structure•5 minutes
- Weights, Biases, and Learned Parameters•10 minutes
- The Forward Pass: Computing Predictions•1 minute
- Loss Functions for Classification and Reconstruction•10 minutes
- Backpropagation and the Chain Rule•30 minutes
- Optimization Algorithms: SGD, Momentum, AdaGrad, and Adam•30 minutes
- Choosing a Framework: TensorFlow vs. PyTorch•10 minutes
- Tensor Fundamentals: Scalars Through 3D+ Tensors and Tensor Attributes•10 minutes
- Discriminative vs. Generative Models: A Course Preview•10 minutes
3 assignments•Total 90 minutes
- Assess Your Learning: Why Deep Learning and Neural Network Architecture•30 minutes
- Assess Your Learning: Forward Propagation and Backpropagation•30 minutes
- Assess Your Learning: Frameworks, Tensors, and Discriminative vs. Generative Models•30 minutes
Convolutional Neural Networks are the architectural backbone of modern computer vision and a component you will encounter repeatedly throughout this course—inside autoencoders, GANs, and diffusion model U-Nets. You will develop the ability to read, design, and reason about CNN architectures from filter-level convolution operations through landmark designs like VGG and ResNet, and learn how pretrained models can be adapted to new tasks through transfer learning.
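To make the filter-level convolution operation concrete, the sketch below slides a small kernel over an image in plain Python and computes the output size from the usual formula floor((n + 2p - k) / s) + 1. The helper names, the square-kernel restriction, and the toy image are assumptions for illustration, not the course's own code.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1

def conv2d(image, kernel, stride=1):
    """Unpadded 2D cross-correlation with a square kernel.

    This is the operation deep learning frameworks call "convolution".
    """
    k = len(kernel)
    out_h = conv_output_size(len(image), k, stride)
    out_w = conv_output_size(len(image[0]), k, stride)
    result = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        result.append(row)
    return result

# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
# Hypothetical 2x2 filter that responds to left-to-right intensity increases.
kernel = [[-1, 1], [-1, 1]]
feature_map = conv2d(image, kernel)
print(feature_map)  # 3x3 map, nonzero only in the middle column
```

The feature map is nonzero only where the window straddles the edge, which is the sense in which a filter "detects" a pattern.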
What's included
1 video, 9 readings, 3 assignments
1 video•Total 3 minutes
- Batch Normalization, Dropout, and Activation Functions•3 minutes
9 readings•Total 105 minutes
- Why Convolutional Neural Networks?•10 minutes
- Convolution Mechanics and Filter Visualization•15 minutes
- Padding Modes and Stride•5 minutes
- Max Pooling and Average Pooling•10 minutes
- Strided Convolution and Feature Map Interpretation•5 minutes
- Batch Normalization and Internal Covariate Shift•10 minutes
- Dropout Ratios and Activation Function Choices•10 minutes
- CNN Layer Flow and Architecture Patterns•10 minutes
- A Simple CNN Example: MNIST and CIFAR-10•30 minutes
3 assignments•Total 90 minutes
- Assess Your Learning: Why CNNs, Convolution, and Pooling•30 minutes
- Assess Your Learning: Batch Normalization, Dropout, and CNN Layer Flow•30 minutes
- Assess Your Learning: CNN Worked Example•30 minutes
Computer vision is the field that enables machines to perceive and interpret visual information—the domain where deep learning first achieved superhuman performance. You will survey its core tasks, from image classification and object detection to semantic segmentation, then work through the full detection pipeline from the R-CNN family to YOLOv8, gaining enough architectural depth to understand how these systems are extended and fine-tuned for new domains.
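Detection pipelines typically end with non-maximum suppression, which discards overlapping boxes that describe the same object. The following is a generic greedy sketch, not the YOLOv8 implementation; the box format (x1, y1, x2, y2), the 0.5 threshold, and the sample boxes are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes.

    Boxes are visited from highest to lowest score, and a box is kept only
    if it does not overlap any already-kept box beyond the threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one separate box (made-up values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # [0, 2]
```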
What's included
10 readings, 3 assignments
10 readings•Total 132 minutes
- What Is Computer Vision? Goals, Scope, and Task Taxonomy•10 minutes
- Image and Video Data Types and Applications•30 minutes
- R-CNN and the Region Proposal Approach•10 minutes
- Fast R-CNN, Faster R-CNN, and Two-Stage Detection•30 minutes
- The YOLO Concept and Architecture Overview•10 minutes
- YOLOv8: Backbone, FPN Neck, and Detection Head•10 minutes
- YOLOv8: Loss Function and Non-Maximum Suppression•10 minutes
- Data Preparation and Training Walkthrough•2 minutes
- Feature Extraction vs. Fine-Tuning for Vision•10 minutes
- The Keras Pretrained Model API for Vision•10 minutes
3 assignments•Total 90 minutes
- Assess Your Learning: Computer Vision Tasks and R-CNN•30 minutes
- Assess Your Learning: YOLOv8 Architecture and Training•30 minutes
- Assess Your Learning: Transfer Learning for Vision•30 minutes
The models you studied in earlier modules treat inputs as fixed-size, spatially arranged structures. Many real-world problems involve sequences where order matters and context accumulates over time: text, speech, time-series data, financial signals. You will learn how RNNs process sequences through a hidden state, how LSTMs and GRUs address the vanishing gradient problem, and why these architectures—and their failure modes—directly motivated the attention mechanism covered in the Transformer module.
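The hidden-state idea can be sketched as a vanilla RNN step, h_t = tanh(W_x x_t + W_h h_{t-1} + b), applied with the same weights at every time step. The weights and the input sequence below are made-up values chosen only to show the mechanics; this is an illustration under those assumptions rather than the course's implementation.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla RNN step: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        z = b[i]
        z += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        z += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(z))
    return h_t

def rnn_forward(sequence, W_x, W_h, b):
    """Unroll the same weights over every time step of the sequence."""
    h = [0.0] * len(b)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states

# Hypothetical weights: 1 input feature, 2 hidden units.
W_x = [[0.5], [-0.5]]
W_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

# A single nonzero input followed by zeros.
states = rnn_forward([[1.0], [0.0], [0.0]], W_x, W_h, b)
for t, h in enumerate(states):
    print(t, h)
```

With small recurrent weights, the trace of the first input shrinks at every step. This is a forward-pass analogue of the repeated multiplication that makes gradients vanish during backpropagation through time, and it is the effect the gating in LSTMs and GRUs is designed to counter.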
What's included
12 readings, 3 assignments
12 readings•Total 64 minutes
- Why Recurrent Networks? Sequence Modeling Applications•5 minutes
- RNN vs. CNN: Handling Temporal Data•10 minutes
- The Hidden State Update and RNN Unrolling•10 minutes
- Backprop Through Time and Vanishing/Exploding Gradients•5 minutes
- LSTM Architecture: Forget, Input, and Output Gates•10 minutes
- The Cell State and Long-Range Memory in LSTMs•2 minutes
- GRU Architecture: Reset and Update Gates•1 minute
- GRU vs. LSTM: Trade-offs and Selection Criteria•1 minute
- The IMDB Dataset and One-Hot Encoding•5 minutes
- Word Embeddings and Embedding Layers•5 minutes
- Building the LSTM Model for Sentiment Analysis•5 minutes
- Training, Evaluation, and Results•5 minutes
3 assignments•Total 90 minutes
- Assess Your Learning: Introduction to RNNs and Backprop Through Time•30 minutes
- Assess Your Learning: LSTM and GRU•30 minutes
- Assess Your Learning: Text Data Handling and LSTM Sentiment Classification•30 minutes
This module marks the course's inflection point: the shift from discriminative models that learn decision boundaries to generative models that learn to synthesize new data. You will survey the full generative landscape—VAEs, GANs, autoregressive models, normalizing flows, diffusion models, and energy-based models—before diving into the autoencoder and its probabilistic extension, the Variational Autoencoder.
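Two pieces of the Variational Autoencoder lend themselves to a short sketch: the reparameterization trick, which samples z = mu + sigma * eps with eps drawn from a standard normal, and the closed-form KL divergence between a diagonal Gaussian and the standard normal prior. The function names and sample values are assumptions, and the encoder is assumed to output the log-variance, a common convention that the course may or may not follow.

```python
import math
import random

def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1).

    The randomness is isolated in eps, so z remains a differentiable
    function of mu and log_var.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)  # seeded generator, for repeatability only
z = reparameterize([0.0, 1.0], [0.0, -2.0], rng)

# The KL term is zero when the encoder output matches the prior exactly
# and grows as the mean moves away from zero.
kl_zero = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
kl_shifted = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])
print(z, kl_zero, kl_shifted)
```

In the ELBO loss, this KL term is added to the reconstruction loss, which is how the latent space is regularized toward the prior.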
What's included
1 video, 14 readings, 4 assignments
1 video•Total 3 minutes
- Transposed Convolution•3 minutes
14 readings•Total 85 minutes
- Generative vs. Discriminative Models•2 minutes
- Challenges in Generative Modeling and a Toy Generative Model•2 minutes
- Representation Learning and Probability Theory Review•2 minutes
- Generative Model Taxonomy: VAEs, GANs, Flows, Diffusion, EBMs•5 minutes
- Autoencoder Motivation and Architecture Overview•2 minutes
- Building the Encoder and Decoder•10 minutes
- Transposed Convolution for Decoding•10 minutes
- The Probabilistic Extension: From AE to VAE•5 minutes
- The Reparameterization Trick•10 minutes
- The ELBO Loss Function•1 minute
- KL Divergence and Regularizing the Latent Space•2 minutes
- VAE vs. Autoencoder: Key Differences•2 minutes
- Face Generation and Latent Space Arithmetic•30 minutes
- Interpolating and Morphing in Latent Space•2 minutes
4 assignments•Total 120 minutes
- Assess Your Learning: Introduction to Generative Modeling and Representation Learning•30 minutes
- Assess Your Learning: Autoencoders and Latent Space Exploration•30 minutes
- Assess Your Learning: VAE Probabilistic Framework and Loss•30 minutes
- Assess Your Learning: VAE vs. Autoencoder and Worked Example•30 minutes
Generative Adversarial Networks take a fundamentally different approach to generative modeling: rather than maximizing a likelihood objective, two networks train in competition. You will work through the full GAN toolkit—from Deep Convolutional GANs and training stabilization techniques to Wasserstein distance, gradient penalty, conditional generation, and cycle-consistent domain translation.
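As a heavily simplified illustration of the Wasserstein objective with gradient penalty, the sketch below uses a hypothetical linear critic f(x) = w . x in place of a neural network. For a linear critic the gradient with respect to the input is w everywhere, so the penalty can be written analytically; a real WGAN-GP evaluates the gradient at points interpolated between real and generated samples using automatic differentiation. The data and the penalty weight are illustrative assumptions.

```python
import math

def critic(w, x):
    """Hypothetical linear critic f(x) = w . x (stand-in for a network)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def gradient_penalty(w, lam=10.0):
    """Penalty lam * (||grad_x f|| - 1)^2.

    For a linear critic the input gradient equals w at every point, so
    the penalty does not depend on where it is evaluated.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return lam * (norm - 1.0) ** 2

def critic_loss(w, real, fake, lam=10.0):
    """Critic loss: mean f(fake) - mean f(real) + gradient penalty."""
    mean_real = sum(critic(w, x) for x in real) / len(real)
    mean_fake = sum(critic(w, x) for x in fake) / len(fake)
    return mean_fake - mean_real + gradient_penalty(w, lam)

# Made-up "real" and "generated" samples.
real = [[1.0, 0.0], [1.0, 0.2]]
fake = [[0.0, 0.0], [0.2, 0.0]]

loss_unit = critic_loss([1.0, 0.0], real, fake)   # gradient norm 1: no penalty
loss_large = critic_loss([3.0, 0.0], real, fake)  # gradient norm 3: penalized
print(loss_unit, loss_large)
```

Scaling the critic up widens the gap between real and generated scores, but the penalty outweighs that gain, which is how the gradient penalty keeps the critic's slope close to 1.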
What's included
10 readings, 3 assignments
10 readings•Total 52 minutes
- The Adversarial Framework: Generator and Discriminator•5 minutes
- GAN Types, Applications, and Ethical Considerations•5 minutes
- DCGAN Architecture and Design Principles•10 minutes
- DCGAN Training: Fashion MNIST and Lego Bricks Examples•5 minutes
- GAN Training Instability and Mode Collapse•5 minutes
- Stabilization Techniques: Normalization, Learning Rate, Label Smoothing•2 minutes
- Wasserstein Distance and the WGAN Objective•5 minutes
- Gradient Penalty and WGAN-GP Training Results•5 minutes
- Conditional GAN Architecture and Class Conditioning•5 minutes
- CycleGAN and Unpaired Domain Translation•5 minutes
3 assignments•Total 90 minutes
- Assess Your Learning: What Are GANs and Deep Convolutional GANs•30 minutes
- Assess Your Learning: GAN Training Tips and WGAN-GP•30 minutes
- Assess Your Learning: Conditional GANs and CycleGAN•30 minutes
Introduced in "Attention Is All You Need" (Vaswani et al., 2017), the Transformer is arguably the most consequential architectural development in deep learning since the CNN. You will derive the attention mechanism from first principles—Query, Key, Value, scaled dot-product, multi-head attention—assemble the full architecture with positional encoding and causal masking, and see it applied in a GPT-style language model.
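Scaled dot-product attention with a causal mask can be sketched in a few lines: each query is scored against every key, the scores are divided by the square root of the key dimension, future positions are masked out, and a softmax turns the remaining scores into weights over the values. This single-head, plain-Python version is an illustration only; the names and the toy matrices are assumptions, and it omits the learned Query, Key, and Value projections.

```python
import math

def softmax(scores):
    """Numerically stable softmax; exp(-inf) evaluates to 0.0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(Q, K, V):
    """Scaled dot-product attention where position i attends only to
    positions 0..i, as required for autoregressive generation."""
    d_k = len(K[0])
    outputs, weights = [], []
    for i, q in enumerate(Q):
        scores = []
        for j, k in enumerate(K):
            if j > i:
                scores.append(float("-inf"))  # mask future positions
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * V[j][c] for j in range(len(V)))
                        for c in range(len(V[0]))])
    return outputs, weights

# Toy sequence of three 2-dimensional token vectors, used as Q, K, and V.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(X, X, X)
for row in weights:
    print(row)
```

Each row of weights sums to 1 and has zeros above the diagonal, so the first position can only attend to itself. Multi-head attention runs several such computations in parallel on different learned projections.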
What's included
1 video, 11 readings, 3 assignments
1 video•Total 2 minutes
- The Attention Mechanism•2 minutes
11 readings•Total 102 minutes
- Why Transformers? Advantages Over RNNs and CNNs•5 minutes
- GPT Overview and Key Transformer Applications•2 minutes
- What Is Attention? The Concept and Intuition•5 minutes
- Attention Continued•5 minutes
- Self-Attention and Network Parameters•10 minutes
- Multi-Head Attention and Parallel Representation Learning•10 minutes
- Positional Encoding: Sinusoidal and Learned•10 minutes
- Causal Masking for Autoregressive Generation•10 minutes
- Building a GPT-Style Language Model•10 minutes
- Training, Generating, and Evaluating Text•30 minutes
- Congratulations!•5 minutes
3 assignments•Total 90 minutes
- Assess Your Learning: What Is a Transformer and the Attention Mechanism•30 minutes
- Assess Your Learning: Multi-Head Attention and Positional Encoding•30 minutes
- Assess Your Learning: GPT-Style Application•30 minutes