What Are Diffusion Models?

Written by Coursera Staff

Explore diffusion models, a type of generative artificial intelligence model that continues to find new applications across professional fields. Learn what these models are, how they're commonly applied, and some advantages and limitations of this approach.


Diffusion models are a class of generative models in artificial intelligence that have revolutionized how we create and manipulate digital content, such as generated images and audio. At their core, diffusion models work by gradually adding random noise to existing data and then learning to reverse that process, transforming random noise back into structured output. Through this process, the model learns to create synthetic data. 
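As a rough illustration, the forward (noising) half of this process can be sketched in a few lines of NumPy. The function name and schedule values below are illustrative choices for this sketch, not taken from any particular library:

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Jump directly from clean data x0 to its noised version x_t.

    Uses the closed-form shortcut common in DDPM-style models:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise.
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]  # cumulative signal retention at step t
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # a simple linear noise schedule
x0 = np.ones(4)                        # stand-in for "clean data"

early = forward_diffuse(x0, t=10, betas=betas, rng=rng)   # still mostly signal
late = forward_diffuse(x0, t=999, betas=betas, rng=rng)   # almost pure noise
```

Running the noising step at a small `t` barely disturbs the data, while at the final step the original signal is almost entirely replaced by noise; the model's job during training is to learn to undo each of these small steps.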

Use this guide to explore different applications of this type of model, innovations in this space, the advantages and disadvantages of this process, and more.

Applications of diffusion models

Diffusion models have found their way into several types of applications, transforming how we create and interact with digital content. While new applications continue to emerge, you might see this technology used for functions such as:

  • Media generation: Diffusion models are widely used to generate complex data that mimics the structure of training inputs. Professionals can apply this technology in many ways, including generating artificial pictures and synthetic biological structures. 

  • Text-to-image generation: These models can take written descriptors, such as “small dog” or “woman eating an apple,” and create lifelike pictures that capture the textual information. 

  • Language models: Researchers are also exploring diffusion-style denoising for text generation, in which a model iteratively refines a noisy or masked draft into a coherent response rather than producing it one token at a time. 

New innovations with diffusion models

Diffusion models are commonly used to generate images from text, but recent innovations have expanded their use in deep learning and generative AI for applications like developing drugs, using natural language processing to create more complex images, and predicting human choices based on eye tracking. One of the most notable creations in this space is DALL-E, an image-generation artificial intelligence model that bases its algorithm on diffusion model principles. 

DALL-E, named after the artist Salvador Dalí and the robot WALL-E, is a powerful generative AI model developed by OpenAI that can create novel images from textual descriptions, even for scenes absent from its training images. For example, you could ask it to create an image of “a rainbow stream with unicorns drinking from it” or “a sparkling elephant with two heads.” This is a relatively new capability in artificial intelligence, and researchers are still finding novel ways to use this technology and make it accessible to users. 

Who benefits from diffusion models?

Diffusion models have been a revolution in the AI world, with their applications benefiting various fields like medicine, art, marketing, and psychology. While applications continue to expand, you’ve likely already seen these models in play.

For example, in health care, diffusion models enable more accurate medical imaging, such as denoising images and increasing their quality, which can aid in early diagnosis and treatment planning. In addition, their application in drug discovery accelerates the development of new medications by predicting molecular structures and interactions, potentially saving lives by bringing treatments to market faster.

In the creative space, professionals can use diffusion models as tools for creativity and innovation. Artists and designers use them to create intricate digital artwork, interior design mockups, and generated soundscapes, opening new options for artistic expression. In psychology and marketing, professionals use diffusion models to understand neural networks, cognitive processes, social media activity, and consumer adoption of new products. By analyzing patterns and predicting choices, these models help marketers tailor strategies designed for their target audiences, considering product demand and consumer characteristics.

Pros and cons of using diffusion models

Diffusion models are a powerful tool, but as with any type of artificial intelligence model, they have their own set of limitations. Awareness of the advantages and disadvantages can help inform your design decisions, help you avoid pitfalls, and increase your confidence that you are using your model for the right types of data and applications.

Advantages of diffusion models
  • Strategic insights: Diffusion models offer insights into product adoption rates and the spread of innovation. This helps organizations refine their market strategies, identify influential stakeholders, and improve product development processes.

  • Behavioral understanding: Diffusion models help decode complex human behaviors and choices, which can give marketers and psychologists a deeper understanding of why people make the decisions they do.

  • Novel images: Where earlier generative models largely produced pictures similar to their training data, modern diffusion models can combine learned concepts to produce genuinely novel outputs that extend beyond their training data.

Limitations of diffusion models
  • Difficulty with complex prompts: Models may struggle with inputs that have numerical or spatial components, such as counting objects or placing them in precise positions. 

  • May have limited scope: Depending on the design of your algorithm, the diffusion model may have limits to the patterns it can identify and the types of images it can generate.

  • Privacy concerns with training data: Because of the high volume of data needed for training, you might struggle to source data that isn’t protected, licensed, or copyrighted, raising privacy and intellectual property concerns. 

How to start learning diffusion models

Before exploring the complexities of diffusion models, you’ll want to build a strong foundation in the basics of artificial intelligence. This includes understanding how AI systems learn from data, make decisions, and solve real-world problems.

Since diffusion models are particularly powerful in generating and manipulating images, you might start with a thorough understanding of image processing. To do this, familiarize yourself with concepts such as computer vision, image recognition, and image restoration, which should help deepen your understanding of how diffusion models can create detailed and realistic images from noise.
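To make the “detailed images from noise” idea concrete, here is a heavily simplified sketch of the reverse (denoising) loop in a DDPM-style model. In a real system, the noise predictor is a trained neural network; the zero-returning stand-in below is a placeholder that only demonstrates the control flow, so this snippet does not generate meaningful images:

```python
import numpy as np

def predict_noise(x, t):
    # Placeholder for a trained noise-prediction network
    return np.zeros_like(x)

def reverse_sample(shape, betas, rng):
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in range(len(betas) - 1, -1, -1):
        eps = predict_noise(x, t)
        # Remove the predicted noise component (DDPM mean update)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Add a little fresh noise at every step except the last
            x = x + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # same linear schedule used in training
sample = reverse_sample((4,), betas, rng)
```

The key idea is that generation runs the noising process backward: starting from random noise, the model repeatedly subtracts the noise it predicts, and after many small steps a structured output emerges.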

Diffusion models fall under the category of generative AI, which focuses on creating new data instances that resemble the training data. To expand your knowledge, you can explore other generative models like generative adversarial networks (GANs) to learn about the broader context and capabilities of this type of AI technology.

Get started with Coursera

Learn more about artificial intelligence algorithms by taking specialized courses and Specializations on Coursera. You can prepare yourself for a career in AI with the Machine Learning Specialization from Stanford University and DeepLearning.AI. This is a beginner-level Specialization that guides you through machine learning basics all the way to applied practice with unsupervised models.


This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.