Overfitting vs. Underfitting: What’s the Difference?

Written by Coursera Staff • Updated on Jan 23, 2026

Are you interested in working with machine learning (ML) models one day? Discover the distinct implications of overfitting and underfitting in ML models.


Key takeaways

Overfitting occurs when the model fits the training data too closely, including its noise, while underfitting occurs when the model is too simple or too lightly trained to capture the underlying patterns.

High-bias models oversimplify data, and high-variance models over-adapt to data.

Maximizing training time, optimizing model complexity, and minimizing regularization are three techniques you can employ to prevent underfitting.

You can prevent overfitting by increasing training data volume, introducing data augmentation, and halting training when necessary.

Discover what causes overfitting and underfitting, how they differ, and strategies to improve ML model performance.

ML overfitting vs. underfitting

Overfitting and underfitting are two of the most common causes of poor results in machine learning. Overfitting occurs when an ML model aces the training data but performs poorly when presented with new data. The model has adapted so closely to the training data, including its noise and quirks, that it can’t analyze fresh data successfully.

Underfitting, by contrast, occurs when the ML model fails to make accurate predictions even on its training data. Because the model has not captured the underlying patterns or trends, it will also be incapable of identifying them when analyzing new data.

For instance, suppose you’re using a machine learning model to predict stock prices. Trained on historical stock data and various market indicators, the model learns to identify patterns in stock price variations. Once trained, the model can analyze current market conditions to make predictions about future stock prices. However, if the model is overfitted or underfitted, you must consider these predictions unreliable.

What is overfitting in machine learning?

A model is overfitted if it makes near-perfect predictions on its training data but performs poorly on new, unseen (validation) data. This scenario can arise when:

The model is highly complex or convoluted.

The model overtrains on a single or specific data set, warping its ability to analyze new data.

The training data contains irrelevant information or noise, which the model can mistake for real patterns.

For example, consider an ML model that trains a robot to play basketball. You encode the robot with detailed moves, dribbling patterns, and shooting forms, closely imitating the play tactics of LeBron James, a professional basketball player. Consequently, the robot excels at replicating these scripted sequences. However, if your model is overfitted, the robot will falter when faced with novel game scenarios, such as one in which the team needs a smaller player to beat the defense.
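The gap between training and new-data performance can be reproduced on a small scale. The sketch below is illustrative only: it assumes synthetic data (a straight line plus random noise) and uses a hypothetical "memorizer" model that simply returns the label of the closest training point. Because it memorizes every noisy training label, its training error is zero, while its error on fresh data is not.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Synthetic data (an assumption for this sketch): y = 2x plus noise."""
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + random.gauss(0.0, 2.0) for x in xs]
    return xs, ys


def memorizer_predict(train_xs, train_ys, x):
    """Return the label of the nearest training point (1-nearest-neighbor)."""
    nearest = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
    return train_ys[nearest]


def mse(predictions, targets):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


train_xs, train_ys = make_data(30)
test_xs, test_ys = make_data(30)

train_error = mse(
    [memorizer_predict(train_xs, train_ys, x) for x in train_xs], train_ys
)
test_error = mse(
    [memorizer_predict(train_xs, train_ys, x) for x in test_xs], test_ys
)

print(f"training MSE: {train_error:.3f}")
print(f"test MSE:     {test_error:.3f}")
```

A perfect training score paired with a clearly worse test score is the typical signature of overfitting.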

What is underfitting in machine learning?

An underfit model performs poorly on both training and new (validation) data. Underfitting in machine learning occurs when a model is too simplistic to capture or learn the underlying patterns in the training data. Other underlying reasons for underfitting may include:

Limited training data
Inadequate model training time

Here’s an example. Suppose you’re using a weather forecasting model with only one input variable, such as temperature, to predict rainfall. Without crucial input features like humidity, wind speed, or atmospheric pressure, the model will likely forecast rain erroneously whenever the temperature drops.
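A model that is too simple for its data shows the same symptom in miniature. The following sketch is a toy illustration rather than a weather model: it assumes a curved relationship (y = x²) and fits a straight line to it with ordinary least squares. The line cannot bend, so the error stays high on the training points and on new points alike.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mse(slope, intercept, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Assumed ground truth for this sketch: a curve, y = x^2, on [-3, 3].
train_xs = [-3.0 + 0.2 * i for i in range(31)]
test_xs = [-2.9 + 0.2 * i for i in range(30)]
train_ys = [x ** 2 for x in train_xs]
test_ys = [x ** 2 for x in test_xs]

slope, intercept = fit_line(train_xs, train_ys)
train_error = mse(slope, intercept, train_xs, train_ys)
test_error = mse(slope, intercept, test_xs, test_ys)

print(f"training MSE: {train_error:.2f}")
print(f"test MSE:     {test_error:.2f}")
```

Unlike an overfitted model, the two errors here are similar. Both are simply too high, which is why underfitting is usually visible from the training results alone.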

Did you know? Detecting overfitting is trickier than spotting underfitting because overfitted models show impressive accuracy on their training data.

How to tell if a model is overfitting or underfitting? Bias and variance in data

Being aware of bias and variance can help you assess the reliability of a machine learning model. Here’s what they mean:

Bias represents how far off, on average, the model's predictions are from the real outcomes. A high bias suggests that the model may be too simplistic, missing out on essential patterns in the data.

Variance, on the other hand, describes how much a model's predictions change when it is trained on different samples of the training data. A high-variance model can adapt closely to whatever data it sees, but small changes in the training set can produce very dissimilar models. Complex models tend to exhibit high variance.

A more succinct comparison of bias and variance:

High-bias models oversimplify data.
High-variance models over-adapt to data.

High bias and low variance signify underfitting, while low bias and high variance indicate overfitting. Note that bias and variance tend to trade off against each other: as you make a model more complex or continue training it, bias typically decreases while variance grows. Your goal is to find the balance that minimizes error on new data. A model with somewhat higher variance can still function properly, as long as its error on new data remains acceptable.
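In practice, you can often read these two conditions off a model's training and validation errors. The helper below is a rough heuristic sketch, not a standard function: the name `diagnose_fit` and both thresholds are assumptions chosen for illustration, and suitable values depend on your task and error metric.

```python
def diagnose_fit(train_error, validation_error,
                 max_acceptable_error=0.10, max_gap=0.05):
    """Classify a model from its training and validation error rates.

    Both thresholds are illustrative assumptions, not universal constants.
    """
    # High error even on the training data points to high bias.
    if train_error > max_acceptable_error:
        return "underfitting (high bias)"
    # Low training error with a much higher validation error points to
    # high variance.
    if validation_error - train_error > max_gap:
        return "overfitting (high variance)"
    return "reasonable fit"


print(diagnose_fit(train_error=0.30, validation_error=0.32))
print(diagnose_fit(train_error=0.01, validation_error=0.25))
print(diagnose_fit(train_error=0.05, validation_error=0.07))
```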

Now that you understand the bias-variance trade-off, let's explore the steps to adjust an ML model so that it is neither overfitted nor underfitted.


How to prevent overfitting?

Overfitting implies that a model fits the training data too closely. To prevent this problem, consider the following measures: increasing data volume, introducing data augmentation, and halting training.

1. Increase the volume of training data.

Using a larger training data set can boost model accuracy by revealing more diverse patterns between input and output variables. However, it's crucial that you use accurate and clean data. Noisy or mislabeled examples can increase the variance in your model to the point where it can no longer accurately identify patterns and trends in new data.
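One way to see why more data helps is to measure variance directly: train the same model on many different random samples and check how much the result moves. The sketch below assumes synthetic data (a line with slope 2 plus noise) and a simple least-squares slope estimate. It compares training sets of 5 points with training sets of 200 points.

```python
import random
import statistics


def fit_slope(xs, ys):
    """Least-squares slope of y against x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov_xy / var_x


def slope_variance(n_points, n_trials=200, seed=0):
    """Variance of the fitted slope across many random training sets."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_trials):
        # Assumed data-generating process: y = 2x + 1 plus Gaussian noise.
        xs = [rng.uniform(0.0, 1.0) for _ in range(n_points)]
        ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
        slopes.append(fit_slope(xs, ys))
    return statistics.variance(slopes)


small_data_variance = slope_variance(n_points=5)
large_data_variance = slope_variance(n_points=200)

print(f"slope variance with 5 points:   {small_data_variance:.3f}")
print(f"slope variance with 200 points: {large_data_variance:.3f}")
```

With the larger training sets, the fitted slope varies far less from one sample to the next, which is the behavior you want from a model that must generalize.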

2. Introduce data augmentation.

You could also use data augmentation to prevent overfitting. Data augmentation tools help tweak training data in minor yet strategic ways. By continually presenting the model with slightly modified versions of the training data, data augmentation discourages your model from latching on to specific patterns or characteristics.

This approach is particularly useful in image classification, where techniques such as flipping (mirroring the image) or rotation (turning the image at various angles) ensure the model doesn't fixate on an image's orientation.
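As a minimal sketch of these two transformations, the code below treats an image as a plain list of pixel rows. The function names are hypothetical, and real projects typically rely on an image library instead, but the underlying operations are the same.

```python
def flip_horizontal(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    # Reversing the row order and then transposing yields a clockwise turn.
    return [list(column) for column in zip(*image[::-1])]


def augment(image):
    """Return the original image plus two modified copies."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]


# A tiny 2x2 "image" whose pixel values make the moves easy to follow.
image = [
    [1, 2],
    [3, 4],
]

for variant in augment(image):
    print(variant)
```

Each training image now yields three examples that share one label, so the model sees the same content in several orientations.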

3. Halt training when necessary.

Finally, you can stop the training process before a model becomes too focused on minor details or noise in the training data, a technique commonly known as early stopping. Getting the timing just right requires careful monitoring, typically by tracking the model's error on a validation set and stopping once that error no longer improves. If you halt training prematurely, the model will fail to capture both the core patterns and the nuances of the data (underfitting).
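A common way to automate this decision is to watch the validation loss after every epoch and stop once it has failed to improve for a set number of epochs. The sketch below assumes you already have a list of per-epoch validation losses. The function name and the `patience` parameter are illustrative choices, not part of any particular library.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs. Epochs are counted from zero.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch

    # The loss never stalled for long enough, so training ran to the end.
    return best_epoch, len(val_losses) - 1


# Hypothetical losses: improving at first, then rising as overfitting sets in.
losses = [1.00, 0.80, 0.70, 0.72, 0.75, 0.90]
best_epoch, stop_epoch = early_stopping_epoch(losses, patience=2)
print(f"best epoch: {best_epoch}, stopped at epoch: {stop_epoch}")
```

A larger `patience` value tolerates more noise in the validation loss at the cost of extra training time. In practice, you would also keep a copy of the model from the best epoch.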

How to prevent underfitting?

An underfitted model lacks the capacity or the training it needs to capture the patterns in its data, resulting in faulty machine learning outcomes. To avert this problem, you can employ three techniques: maximizing training time, optimizing model complexity, and minimizing regularization.

1. Maximize training time.

To avoid underfitting, train your model long enough for it to learn the intricacies of the training data, which improves its overall performance. Then again, it's essential to tread carefully. Training a model for too long can lead to overtraining, also known as overfitting, where the model becomes too tailored to the training data and performs poorly on new data.
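The effect of training duration is easy to observe with gradient descent. The sketch below is a toy setup with assumed values: a one-parameter model, noise-free data following y = 3x, and a learning rate of 0.1. It compares a run of 2 update steps with a run of 200.

```python
def train(xs, ys, steps, learning_rate=0.1):
    """Fit y = weight * x by gradient descent on mean squared error."""
    weight = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to the weight.
        gradient = 2.0 * sum(x * (weight * x - y) for x, y in zip(xs, ys)) / n
        weight -= learning_rate * gradient
    return weight


def mse(weight, xs, ys):
    """Mean squared error of the one-parameter model."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Assumed toy data: a noise-free line with slope 3.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [3.0 * x for x in xs]

short_run_error = mse(train(xs, ys, steps=2), xs, ys)
long_run_error = mse(train(xs, ys, steps=200), xs, ys)

print(f"error after 2 steps:   {short_run_error:.4f}")
print(f"error after 200 steps: {long_run_error:.2e}")
```

The short run stops while the weight is still far from 3, so the model underfits even data it could represent perfectly. Because this toy data has no noise, the long run cannot overfit. With real, noisy data, you would track validation error to decide when to stop.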

2. Optimize model complexity.

Making your ML model more complex can prevent underfitting. Simple models often miss patterns in training data. Adding complexity, such as more parameters or input features, can improve your model's ability to represent nonlinear relationships in the data. Additionally, by capturing more of the underlying patterns, a more complex model can make more accurate predictions when presented with new data points. However, striking a balance is essential, as overly complex models can lead to overfitting.
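One practical way to strike that balance is to try several complexity levels and keep the one with the lowest validation error. The sketch below uses k-nearest-neighbor regression on assumed synthetic data (a sine curve plus noise). Here, the number of neighbors `k` acts as the complexity control: k = 1 is the most flexible setting, while setting k to the full training set size collapses the model to a single average.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Synthetic data (an assumption for this sketch): sin(x) plus noise."""
    xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


def knn_predict(train_xs, train_ys, x, k):
    """Average the labels of the k training points closest to x."""
    nearest = sorted(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))[:k]
    return sum(train_ys[i] for i in nearest) / k


def knn_mse(train_xs, train_ys, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_xs, train_ys, x, k) - y) ** 2
              for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


train_xs, train_ys = make_data(60)
val_xs, val_ys = make_data(60)

# Map each k to its (training error, validation error).
results = {}
for k in (1, 5, 15, 60):
    results[k] = (
        knn_mse(train_xs, train_ys, train_xs, train_ys, k),
        knn_mse(train_xs, train_ys, val_xs, val_ys, k),
    )
    print(f"k={k:2d}  train MSE={results[k][0]:.3f}  val MSE={results[k][1]:.3f}")

best_k = min(results, key=lambda k: results[k][1])
print(f"selected k: {best_k}")
```

The simplest setting (k = 60) has a high training error, which signals underfitting, and the most flexible setting (k = 1) has a training error of zero. The selection is made on validation error rather than training error, so the choice reflects performance on data the model has not seen.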

3. Minimize regularization.

Lowering the degree of regularization in your model can prevent underfitting. Regularization reduces a model’s variance by penalizing large parameter values, which discourages the model from fitting noise. Remember, low variance combined with high bias is an indicator of underfitting. Dialing back on regularization lets the model become more complex, potentially enhancing its training outcomes.
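The effect of the penalty can be seen in a one-parameter ridge (L2-regularized) regression, sketched below under simplifying assumptions: a single input, no intercept, and noise-free data with a true slope of 2. For this case, the weight that minimizes the squared error plus `penalty` times the squared weight has a closed form.

```python
def ridge_weight(xs, ys, penalty):
    """Weight minimizing sum((y - w*x)^2) + penalty * w^2 (no intercept)."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)


def mse(weight, xs, ys):
    """Mean squared error of the one-parameter model."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Assumed toy data: a noise-free line with slope 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for penalty in (0.0, 14.0):
    weight = ridge_weight(xs, ys, penalty)
    print(f"penalty={penalty:4.1f}  weight={weight:.2f}  "
          f"training MSE={mse(weight, xs, ys):.3f}")
```

With no penalty, the model recovers the true slope of 2. With the heavy penalty, the weight shrinks to 1 and the training error rises, which is underfitting caused purely by regularization. Reducing the penalty restores the fit.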

