Overfitting vs. Underfitting: What’s the Difference?

Written by Coursera Staff • Updated on

Are you interested in working with machine learning (ML) models one day? Discover the distinct implications of overfitting and underfitting in ML models.

[Featured Image] A woman sits at a laptop, taking an online course in which she learns the difference between overfitting and underfitting.

A machine learning model is an algorithm trained to recognize patterns or trends in data, including data it has not seen before. That said, machine learning models aren’t error-free. Overfitting and underfitting are two of the most common causes of poor results in machine learning. Overfitting occurs when an ML model performs very well on its training data but poorly on new data. The model has fit the training data so closely, noise included, that it cannot generalize to fresh data.

Underfitting, by contrast, occurs when the ML model cannot make accurate predictions even on its training data. Because it has not learned the underlying patterns or trends, it also performs poorly on new data.

For instance, suppose you’re using a machine learning model to predict stock prices. Trained on historical stock data and various market indicators, the model learns to identify patterns in stock price movements. Once trained, it can analyze current market conditions to predict future stock prices, but if the model is overfitted or underfitted, you should treat those predictions as unreliable. Read on to understand the causes of overfitting and underfitting, their differences, and strategies to improve ML model performance.

What is overfitting?

A model is overfitted if it makes near-perfect predictions on its training data but performs poorly on new, unseen (validation) data. This commonly happens when:

  • The model is highly complex relative to the amount of training data.

  • The model trains for too long on a single data set, which weakens its ability to analyze new data.

  • The training data contains irrelevant information or noise, which the model learns as if it were a real pattern.

For example, consider an ML model that trains a robot to play basketball. You program the robot with detailed moves, dribbling patterns, and shooting forms that closely imitate the play of LeBron James, a professional basketball player. As a result, the robot excels at replicating these scripted sequences. However, if the model is overfitted, the robot will falter when it faces unfamiliar game scenarios, such as one in which the team needs a smaller player to beat the defense.
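
To make the symptom concrete, here is a minimal sketch in plain Python. It assumes a made-up data set (a straight-line signal plus noise) and uses a deliberately extreme "memorizer" model that copies the label of the closest training point. The function names and the data are illustrative assumptions, not part of any particular library.

```python
import random

random.seed(0)

def true_signal(x):
    # Hypothetical underlying pattern the model should learn.
    return 2.0 * x

def make_data(n):
    # Noisy observations of the signal on x in [0, 1).
    xs = [random.random() for _ in range(n)]
    ys = [true_signal(x) + random.gauss(0, 0.5) for x in xs]
    return xs, ys

def memorizer_predict(train_x, train_y, x):
    # 1-nearest-neighbor: return the label of the closest training point.
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def mse(predictions, targets):
    # Mean squared error between predictions and true labels.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

train_x, train_y = make_data(50)
val_x, val_y = make_data(50)

train_mse = mse([memorizer_predict(train_x, train_y, x) for x in train_x], train_y)
val_mse = mse([memorizer_predict(train_x, train_y, x) for x in val_x], val_y)

print(f"training MSE:   {train_mse:.3f}")
print(f"validation MSE: {val_mse:.3f}")
```

Because the memorizer reproduces every training label exactly, noise included, its training error is zero, while its error on the fresh validation sample is noticeably higher. That gap between the two scores is the typical signature of overfitting.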

What is underfitting?

An underfit model performs poorly on both training data and new (validation) data. Underfitting in machine learning occurs when a model is too simplistic to capture or learn the underlying patterns in the training data. Other causes of underfitting may include:

  • Limited training data

  • Insufficient model training time

Here’s an example. Suppose you’re using a weather forecasting model that predicts rainfall from only one variable, such as temperature. Without other crucial inputs like humidity, wind speed, or atmospheric pressure, the model will likely forecast rain incorrectly whenever the temperature merely drops.
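
The following sketch shows the same idea with numbers. It assumes a hypothetical curved relationship (y = x² plus noise) and fits a straight line to it, a model that is too simple for the data. All names and values are illustrative.

```python
import random

random.seed(1)

def make_data(n):
    # Hypothetical curved relationship: y = x^2 plus a little noise.
    xs = [random.uniform(-3, 3) for _ in range(n)]
    ys = [x * x + random.gauss(0, 0.5) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(slope, intercept, xs, ys):
    # Mean squared error of the fitted line on a data set.
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = make_data(200)
val_x, val_y = make_data(200)
slope, intercept = fit_line(train_x, train_y)

train_mse = mse(slope, intercept, train_x, train_y)
val_mse = mse(slope, intercept, val_x, val_y)
noise_floor = 0.5 ** 2  # variance of the noise added in make_data

print(f"training MSE:   {train_mse:.2f}")
print(f"validation MSE: {val_mse:.2f}")
print(f"noise floor:    {noise_floor:.2f}")
```

Unlike an overfitted model, the straight line does badly on both sets: its training and validation errors are similar to each other and both sit far above the noise floor, because no straight line can follow a curve.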

Did you know? Detecting overfitting is trickier than spotting underfitting because overfitted models show impressive accuracy on their training data.
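
One rough way to check for both problems is to compare training error with validation error. The helper below is a hypothetical rule of thumb, not a standard function, and the two thresholds are assumptions you would tune for your own task.

```python
def diagnose_fit(train_error, val_error, acceptable_error=0.10, max_gap=0.05):
    # Thresholds are illustrative assumptions; tune them for your task.
    if train_error > acceptable_error:
        # The model struggles even on data it has already seen.
        return "possible underfitting"
    if val_error - train_error > max_gap:
        # Good on training data, clearly worse on new data.
        return "possible overfitting"
    return "reasonable fit"

print(diagnose_fit(train_error=0.30, val_error=0.32))  # possible underfitting
print(diagnose_fit(train_error=0.01, val_error=0.25))  # possible overfitting
print(diagnose_fit(train_error=0.06, val_error=0.08))  # reasonable fit
```

The second call shows why overfitting is easy to miss: the training error alone looks excellent, and only the validation error reveals the problem.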


Indicators of overfitting and underfitting: Bias and variance

Being aware of bias and variance can help you assess the reliability of a machine learning model. Here’s what they mean:

Bias represents how far off, on average, the model's predictions are from the real outcomes. A high bias suggests that the model may be too simplistic, missing out on essential patterns in the data.

Variance, on the other hand, measures how much a model's predictions change when it is trained on different subsets of the training data. A high-variance model adapts closely to whatever data it sees, so each subset can produce a very different model. Complex models tend to exhibit high variance.

A more succinct comparison of bias and variance:

  • High-bias models oversimplify data.

  • High-variance models over-adapt to data.

High bias and low variance signify underfitting, while low bias and high variance indicate overfitting. Note that bias and variance tend to move in opposite directions: as you make a model more complex or continue training it, bias decreases while variance grows. The goal is to balance the two so that the model's error on new data is as low as possible. A model can still perform well with somewhat higher variance, provided its error on new data remains acceptable.
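
Bias and variance can be estimated empirically by training the same kind of model on many different training sets and comparing its predictions. The sketch below does this for two deliberately extreme models on made-up data (a sine curve plus noise): a constant model that always predicts the mean label, and a nearest-neighbor model that copies the closest training label. The data, the models, and the number of rounds are all illustrative assumptions.

```python
import math
import random

random.seed(2)

def true_fn(x):
    # Hypothetical underlying pattern.
    return math.sin(2 * math.pi * x)

def sample_training_set(n=30, noise=0.3):
    xs = [random.random() for _ in range(n)]
    ys = [true_fn(x) + random.gauss(0, noise) for x in xs]
    return xs, ys

def fit_constant(xs, ys):
    # Very simple model: always predict the mean training label.
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_nearest_neighbor(xs, ys):
    # Very flexible model: copy the label of the closest training point.
    return lambda x: ys[min(range(len(xs)), key=lambda i: abs(xs[i] - x))]

def bias_and_variance(fit, test_xs, rounds=200):
    # Train on many independent training sets and collect predictions.
    preds = [[] for _ in test_xs]
    for _ in range(rounds):
        model = fit(*sample_training_set())
        for j, x in enumerate(test_xs):
            preds[j].append(model(x))
    bias_sq = variance = 0.0
    for x, p in zip(test_xs, preds):
        mean_p = sum(p) / len(p)
        bias_sq += (mean_p - true_fn(x)) ** 2               # squared average miss
        variance += sum((v - mean_p) ** 2 for v in p) / len(p)  # spread across rounds
    return bias_sq / len(test_xs), variance / len(test_xs)

test_xs = [i / 20 for i in range(1, 20)]
const_bias_sq, const_var = bias_and_variance(fit_constant, test_xs)
nn_bias_sq, nn_var = bias_and_variance(fit_nearest_neighbor, test_xs)

print(f"constant model:   bias^2={const_bias_sq:.3f}, variance={const_var:.3f}")
print(f"nearest neighbor: bias^2={nn_bias_sq:.3f}, variance={nn_var:.3f}")
```

The constant model shows the underfitting pattern (high bias, low variance), while the nearest-neighbor model shows the overfitting pattern (low bias, high variance).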

Now that you understand the bias-variance trade-off, let's explore the steps to adjust an ML model so that it is neither overfitted nor underfitted.

How to prevent overfitting?

Overfitting means a model fits the training data too closely. Here are three measures you can take to prevent it: increasing the volume of training data, introducing data augmentation, and halting training early.

1. Increase the volume of training data.

Using a larger training data set can boost model accuracy by exposing the model to more diverse patterns between input and output variables. However, it's crucial that you use accurate and clean data. Adding noisy or mislabeled data can instead increase variance in your model to the point where it can no longer accurately identify patterns and trends in new data.
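
The effect of data volume can be sketched with a simple learning curve. The example below assumes made-up data (a sine curve plus a little noise) and reuses a flexible nearest-neighbor model, which is prone to overfitting, then measures validation error for three training-set sizes. Names and sizes are illustrative.

```python
import math
import random

random.seed(3)

def make_data(n, noise=0.1):
    # Hypothetical pattern: a sine curve with small random noise.
    xs = [random.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + random.gauss(0, noise) for x in xs]
    return xs, ys

def nearest_neighbor_predict(train_x, train_y, x):
    # Copy the label of the closest training point.
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def validation_mse(train_size, val_x, val_y):
    # Draw a fresh training set of the given size and score it on validation data.
    train_x, train_y = make_data(train_size)
    errors = [(nearest_neighbor_predict(train_x, train_y, x) - y) ** 2
              for x, y in zip(val_x, val_y)]
    return sum(errors) / len(errors)

val_x, val_y = make_data(300)
results = {n: validation_mse(n, val_x, val_y) for n in (5, 50, 500)}

for n, err in results.items():
    print(f"{n:>4} training points -> validation MSE {err:.3f}")
```

With only five training points the model has seen too little of the curve to generalize. With hundreds of points its validation error drops considerably, even though the model itself has not changed.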

2. Introduce data augmentation.

You could also use data augmentation to prevent overfitting. Data augmentation tools help tweak training data in minor yet strategic ways. By continually presenting the model with slightly modified versions of the training data, data augmentation discourages your model from latching on to specific patterns or characteristics.

This approach is particularly useful in image classification, where techniques such as flipping (mirroring the image) or rotation (turning the image at various angles) help ensure the model doesn't fixate on an image's orientation.
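
As a minimal sketch of these two transformations, the code below treats an image as a nested list of pixel values. Real projects typically rely on an image library for this; the tiny 2×3 "image" and the function names here are illustrative assumptions.

```python
def flip_horizontal(image):
    # Mirror each row left to right.
    return [row[::-1] for row in image]

def rotate_90_clockwise(image):
    # Reverse the row order, then transpose rows and columns.
    return [list(column) for column in zip(*image[::-1])]

def augment(image):
    # Return the original plus two modified copies; the label stays the same.
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]

image = [[1, 2, 3],
         [4, 5, 6]]  # a tiny 2x3 grayscale "image" (hypothetical data)

for variant in augment(image):
    print(variant)
# [[1, 2, 3], [4, 5, 6]]
# [[3, 2, 1], [6, 5, 4]]
# [[4, 1], [5, 2], [6, 3]]
```

Each training image now yields three examples with the same label, so the model sees the same content in several orientations.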

3. Halt training when necessary.

Finally, you can stop the training process before a model becomes too focused on minor details or noise in the training data, a technique commonly called early stopping. Achieving this requires careful monitoring and adjustment to get the timing just right. If you halt training prematurely, the model will fail to capture both the core patterns and the nuances of the data (underfitting).
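
A common way to time the stop is to watch the validation loss and halt once it has failed to improve for a set number of epochs, often called the patience. The sketch below applies that rule to a made-up list of validation losses; the function name, the loss values, and the patience settings are assumptions for illustration.

```python
def early_stopping(val_losses, patience=2):
    # Returns (best_epoch, stopped_epoch) for a sequence of validation losses.
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Training ran to the end without triggering a stop.
    return best_epoch, len(val_losses) - 1

val_losses = [0.90, 0.70, 0.60, 0.55, 0.58, 0.60, 0.65, 0.50]
best_epoch, stopped_epoch = early_stopping(val_losses, patience=2)
print(best_epoch, stopped_epoch)  # 3 5
```

With a patience of 2, training stops at epoch 5 and keeps the model from epoch 3. In this made-up sequence, a patience of 4 would have continued to epoch 7, where the loss is lower still, which illustrates how much the timing depends on the setting you choose.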

How to prevent underfitting?

An underfitted model has not learned enough from its training data, resulting in faulty machine learning outcomes. To avert this problem, you can employ three techniques: maximizing training time, optimizing model complexity, and minimizing regularization.

1. Maximize training time.

A sufficiently long training duration allows your model to learn the intricacies of the training data, improving its overall performance. Then again, it's essential to tread carefully. Training a model for too long can lead to overtraining, which results in overfitting: the model becomes too tailored to the training data and performs poorly on new data.
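
The sketch below illustrates the effect of training time with a small gradient descent loop. It assumes made-up data (y = 2x + 1 plus noise), a fixed learning rate, and three arbitrary step counts; none of these values come from a specific library or recommendation.

```python
import random

random.seed(4)

# Hypothetical data: y = 2x + 1 plus a little noise.
xs = [random.random() for _ in range(100)]
ys = [2 * x + 1 + random.gauss(0, 0.1) for x in xs]

def train(steps, learning_rate=0.1):
    # Batch gradient descent on mean squared error for y = w * x + b.
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    loss = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return w, b, loss

for steps in (5, 50, 2000):
    w, b, loss = train(steps)
    print(f"{steps:>4} steps: w={w:.2f}, b={b:.2f}, training MSE={loss:.4f}")
```

After only five steps the parameters are still far from the values that generated the data, and the training error is high. Given enough steps, the same model recovers a slope near 2 and an intercept near 1.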

2. Optimize model complexity.

Making your ML model more complex can prevent underfitting. Simple models often miss patterns in training data. Adding complexity, such as more features or parameters, lets the model represent relationships that a simpler model cannot. By capturing more of the underlying patterns, a more complex model can make more accurate predictions when presented with new data points. However, striking a balance is essential, as overly complex models can lead to overfitting.
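
To see the balance in numbers, the sketch below uses a piecewise-constant model whose complexity is simply the number of bins it splits the input range into. The data (a sine curve plus noise) and the bin counts 1, 8, and 200 are illustrative assumptions.

```python
import math
import random

random.seed(5)

def make_data(n, noise=0.2):
    xs = [random.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + random.gauss(0, noise) for x in xs]
    return xs, ys

def fit_binned_means(xs, ys, n_bins):
    # Piecewise-constant model: one predicted value per equal-width bin.
    global_mean = sum(ys) / len(ys)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    # Bins with no training points fall back to the global mean.
    return [s / c if c else global_mean for s, c in zip(sums, counts)]

def mse(bin_means, xs, ys):
    n_bins = len(bin_means)
    return sum((bin_means[min(int(x * n_bins), n_bins - 1)] - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = make_data(200)
val_x, val_y = make_data(200)

scores = {}
for n_bins in (1, 8, 200):
    model = fit_binned_means(train_x, train_y, n_bins)
    scores[n_bins] = (mse(model, train_x, train_y), mse(model, val_x, val_y))
    print(f"{n_bins:>3} bins: training MSE={scores[n_bins][0]:.3f}, "
          f"validation MSE={scores[n_bins][1]:.3f}")
```

One bin is too simple and scores poorly on both sets. Eight bins capture the shape of the curve and improve both scores. With 200 bins the training error falls further, but that gain comes from fitting individual noisy points, so the validation error is the number to watch.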

3. Minimize regularization.

Lowering the degree of regularization in your model can prevent underfitting. Regularization reduces a model’s variance by penalizing large parameter values, which discourages the model from fitting noise. Too much regularization, however, can leave the model too simple to capture real patterns. Remember, low variance combined with high bias is an indicator of underfitting. Dialing back on regularization allows the model more complexity, potentially enhancing its training outcomes.
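
The sketch below shows this with one-dimensional ridge regression, a common form of regularization that penalizes the squared size of the weight. The data (y = 3x plus noise, no intercept) and the three penalty values are assumptions chosen for illustration.

```python
import random

random.seed(6)

# Hypothetical data: y = 3x plus noise, with no intercept for simplicity.
xs = [random.uniform(-1, 1) for _ in range(100)]
ys = [3 * x + random.gauss(0, 0.3) for x in xs]

def fit_ridge(penalty):
    # Closed-form 1-D ridge regression: w = sum(x*y) / (sum(x^2) + penalty).
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)

def training_mse(w):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

for penalty in (1000.0, 10.0, 0.0):
    w = fit_ridge(penalty)
    print(f"penalty={penalty:>6}: w={w:.2f}, training MSE={training_mse(w):.3f}")
```

A very large penalty shrinks the weight toward zero, so the model barely responds to the input and underfits. As the penalty is reduced, the weight moves toward the value of 3 used to generate the data and the training error drops.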

Learn more with Coursera.

Get a head start in machine learning with the Introduction to Machine Learning course, available on Coursera. Offered by Duke University, this course includes practice exercises in which you will implement data science models, gaining actual experience. You will need approximately 21 hours to complete this course.

You may choose to complement that course with the Mathematics for Machine Learning Specialization, also available on Coursera. Intended for beginners, this Specialization focuses on linear algebra, principal component analysis, multivariate calculus, and more.

Written by:

Editorial Team

Coursera’s editorial team is comprised of highly experienced professional editors, writers, and fact...

This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.