What Is a Confusion Matrix?

Written by Coursera Staff • Updated on

Learn what a confusion matrix is and why professionals across industries value this tool. Plus, discover how to calculate and interpret key performance indicators from your confusion matrix.


A confusion matrix is a table that shows the number of correct and incorrect predictions your classification algorithm made within each category. For a classifier with two categories, it is a two-by-two matrix. In this article, we will explore the basics of classification in machine learning, how to interpret a confusion matrix, its advantages and limitations, and what type of career might use this tool.

What is classification in machine learning?

Classification in machine learning is like sorting things into different groups based on their features. For example, imagine you have a selection of photos that include either cats or dogs. Classification algorithms help the machine learn the differences between cat and dog images based on characteristics like color, size, or shape. In machine learning, you might use this concept for more complex tasks, such as recognizing spam emails, diagnosing diseases from medical images, or categorizing products.

Classification algorithms consider several independent variables before generating the probability of something being in each possible category. For example, let’s say you are trying to provide a medical diagnosis for a patient. In this case, the patient’s characteristics and symptoms might be the independent variables. If the patient is over 60 years old and is experiencing joint pain and stiffness, your classification algorithm might give a high percentage likelihood that the patient has arthritis. 

Depending on your algorithm and potential categories, you might find likelihoods associated with other conditions, such as joint fractures, cancer, or infections.
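As a rough illustration of how a classifier might turn per-category scores into likelihoods, here is a minimal sketch using the softmax function. The category names and scores are invented for this example, and softmax is only one common way to produce probabilities; the article does not say which method any particular algorithm uses.

```python
import math

def softmax(scores):
    """Convert raw per-category scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - highest) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical scores for one patient (assumed values, for illustration only)
scores = {"arthritis": 2.0, "fracture": 0.5, "infection": -1.0}

probabilities = softmax(scores)
predicted = max(probabilities, key=probabilities.get)

for label, p in probabilities.items():
    print(f"{label}: {p:.2f}")
print("Most likely category:", predicted)
```

The category with the highest probability becomes the prediction, while the remaining probabilities show how much weight the model gave to the alternatives.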

Detecting errors in classification

After you develop your classification algorithm, you will want to measure how accurate your model is. When a two-category algorithm labels something incorrectly, the mistake is either a false positive or a false negative. A false positive occurs when the algorithm flags something that isn’t there, such as marking a safe email as spam. A false negative occurs when the algorithm fails to identify what it’s supposed to find, such as missing a spam email and letting it into the inbox. Ideally, you want your classification algorithm to have much higher rates of true positives and true negatives than false positives and false negatives.
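To make the four outcomes concrete, the following sketch counts them for a small spam filter example. The labels and the `count_outcomes` helper are hypothetical and exist only for illustration.

```python
def count_outcomes(actual, predicted, positive="spam"):
    """Count true/false positives and negatives for one positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # flagged as spam, and it was spam
        elif p == positive:
            fp += 1  # flagged as spam, but it was a safe email
        elif a == positive:
            fn += 1  # spam that slipped into the inbox
        else:
            tn += 1  # safe email correctly left alone
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

# Hypothetical true labels and model predictions for six emails
actual    = ["spam", "ham", "spam", "ham", "ham", "spam"]
predicted = ["spam", "spam", "ham", "ham", "ham", "spam"]

counts = count_outcomes(actual, predicted)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```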

What is a confusion matrix?

A confusion matrix is a convenient way of representing your true positives, true negatives, false positives, and false negatives. For a two-category problem, a confusion matrix is typically shown as a 2x2 table:

                | Predicted negative | Predicted positive
Actual negative | A                  | B
Actual positive | C                  | D

In this confusion matrix, you have four cells:

  • “D” or True positives (TP): The model correctly predicts the positive class in these cases. 

  • “A” or True negatives (TN): The model correctly predicts the negative class in these cases. 

  • “B” or False positives (FP): In these instances, the model indicated “positive” but the true value was “negative.” In statistics, you might refer to this as a Type I error.

  • “C” or False negatives (FN): These are the cases in which the model indicated “negative,” but the true value was “positive.” In statistics, you might refer to this as a Type II error.
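The layout above (rows for actual values, columns for predicted values) can be built directly from two lists of labels. This is a minimal sketch that assumes the labels are encoded as 0 for negative and 1 for positive; the data and the `confusion_matrix_2x2` helper are made up for illustration.

```python
def confusion_matrix_2x2(actual, predicted):
    """Return [[A, B], [C, D]] = [[TN, FP], [FN, TP]] for 0/1 labels."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1  # row = actual class, column = predicted class
    return matrix

# Hypothetical labels: 0 = negative, 1 = positive
actual    = [0, 0, 0, 0, 1, 1, 1, 0]
predicted = [0, 1, 0, 0, 1, 0, 1, 0]

matrix = confusion_matrix_2x2(actual, predicted)
(tn, fp), (fn, tp) = matrix

print(matrix)  # [[4, 1], [1, 2]]
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

Libraries may order the rows and columns differently, so it is worth checking which cell holds which count before reading values from a matrix you did not build yourself.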

Types of performance measures 

From the four basic elements of a confusion matrix (true positives, true negatives, false positives, false negatives), you can calculate several key performance metrics:


Accuracy:

Accuracy is the overall correctness of the model, or the share of all predictions that are correct. You can calculate this as (TP + TN) / (TP + TN + FP + FN).


Precision:

Precision is the accuracy of positive predictions, calculated as TP / (TP + FP). If your precision is 0.4, then the model is correct in its positive predictions 40 percent of the time.
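The accuracy and precision formulas translate directly into code. The sketch below uses made-up counts (TP = 40, TN = 45, FP = 10, FN = 5); returning 0.0 when a denominator is zero is a convention chosen for this example, not a rule from the article.

```python
def accuracy(tp, tn, fp, fn):
    """(TP + TN) / (TP + TN + FP + FN)"""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0  # assumed convention for empty input

def precision(tp, fp):
    """TP / (TP + FP)"""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0

# Hypothetical counts read off a confusion matrix
tp, tn, fp, fn = 40, 45, 10, 5

print(accuracy(tp, tn, fp, fn))  # 0.85
print(precision(tp, fp))         # 0.8
```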

Sensitivity or recall: 

Sensitivity, also called recall, represents the model’s ability to find all the positive cases. When this measure is high, the model is more likely to identify positive cases. However, a model tuned for high sensitivity also tends to produce more false positives. You can calculate this as TP / (TP + FN).


Specificity:

Specificity represents the model’s ability to classify negative instances correctly. This is the counterpart of sensitivity: a higher value of this measure means the model is more likely to classify negative cases correctly. However, a model tuned for high specificity also tends to produce more false negatives. You can calculate this as TN / (TN + FP).
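Sensitivity and specificity can be computed the same way from the four cells. The counts below are invented for illustration, and returning 0.0 for an empty denominator is an assumed convention.

```python
def sensitivity(tp, fn):
    """Recall: TP / (TP + FN), the share of actual positives that were found."""
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0

def specificity(tn, fp):
    """TN / (TN + FP), the share of actual negatives that were correctly rejected."""
    actual_negative = tn + fp
    return tn / actual_negative if actual_negative else 0.0

# Hypothetical counts read off a confusion matrix
tp, tn, fp, fn = 40, 45, 10, 5

print(f"Sensitivity: {sensitivity(tp, fn):.3f}")  # 40 / 45
print(f"Specificity: {specificity(tn, fp):.3f}")  # 45 / 55
```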

F1 score: 

The F1 score, or F-measure, is a single value that summarizes how well a classification algorithm performs by balancing precision and recall. It is the harmonic mean of the two, calculated as 2 * (Precision * Recall) / (Precision + Recall).
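Here is a short sketch of the F1 calculation, starting from the raw counts so that precision and recall are derived in the same place. The counts are hypothetical, and returning 0.0 when both precision and recall are zero is an assumed convention.

```python
def f1_score(tp, fp, fn):
    """2 * (precision * recall) / (precision + recall)"""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0  # assumed convention when the model finds no true positives
    return 2 * (precision * recall) / (precision + recall)

# Hypothetical counts: precision = 40/50, recall = 40/45
print(f"F1: {f1_score(tp=40, fp=10, fn=5):.3f}")
```

Because it is a harmonic mean, the F1 score stays low whenever either precision or recall is low, so a model cannot score well by excelling at only one of them.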

Interpreting a confusion matrix involves more than just looking at the numbers. It’s about understanding the context of the problem you’re solving. If you are developing a screening algorithm for a medical diagnosis, you might want to avoid a false negative. For example, imagine predicting no disease when someone actually has one. Avoiding false negatives in this context would be a top priority. 

In contrast, in email spam detection, a false positive (marking a good email as spam) might be more problematic. By examining the matrix, you can identify if your model needs to be more sensitive (increasing true positives) or more specific (reducing false positives) and adjust your approach accordingly.
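One common way to make a model more sensitive or more specific is to move the decision threshold applied to its predicted probabilities. The sketch below assumes the model outputs a score between 0 and 1 for each case; the scores, labels, and `counts_at_threshold` helper are invented for illustration.

```python
def counts_at_threshold(labels, scores, threshold):
    """Classify as positive when score >= threshold, then count the four outcomes."""
    tp = tn = fp = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

# Hypothetical true labels (1 = positive) and model scores
labels = [0,   0,   0,   0,   1,    0,   1,    1,   1,   1]
scores = [0.1, 0.2, 0.3, 0.4, 0.45, 0.6, 0.55, 0.7, 0.8, 0.9]

for threshold in (0.35, 0.5, 0.65):
    tp, tn, fp, fn = counts_at_threshold(labels, scores, threshold)
    print(f"threshold={threshold}: TP={tp} TN={tn} FP={fp} FN={fn}")
```

In this toy data, lowering the threshold to 0.35 removes every false negative at the cost of an extra false positive, while raising it to 0.65 removes every false positive but misses two positive cases.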

Advantages and disadvantages of using a confusion matrix

When choosing to use a confusion matrix, you should consider whether it is appropriate for your type of data and what performance measures are important to you. The primary advantages and limitations you might experience include the following.


Advantages

By using a confusion matrix with binary data, you can determine several different performance measures. With binary classification, you can determine the model’s accuracy, precision, recall, Matthews correlation coefficient, ROC curve, and area under the curve. Each of these measures represents a different aspect of your model’s performance, and you can use them to decide how your model needs to be altered. When designing for a specific data type, you might find it important to have control over what your model prioritizes.
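Of the measures listed here, the Matthews correlation coefficient (MCC) uses all four cells of the matrix at once. A minimal sketch with made-up counts follows; returning 0.0 when the denominator is zero is a commonly used convention that this example assumes.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))"""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

# Hypothetical counts read off a confusion matrix
mcc = matthews_corrcoef(tp=40, tn=45, fp=10, fn=5)
print(f"MCC: {mcc:.3f}")
```

MCC ranges from -1 to 1, where 1 means every prediction is correct, 0 means the predictions are no better than chance, and -1 means every prediction is wrong.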


Limitations

When using a confusion matrix, you should consider the type of data you are working with. Confusion matrices can become complex when you have multiclass classification because you have more than two classes to predict. As the number of classes increases, reading the matrix and judging your model’s accuracy and performance from it becomes increasingly difficult. Fewer performance measures also apply directly to multiclass classification than to binary classification.
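To show how the matrix grows with the number of classes, here is a sketch that builds a three-class confusion matrix and computes recall for each class. The class names, data, and `multiclass_confusion` helper are hypothetical.

```python
from collections import Counter

def multiclass_confusion(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes, in `labels` order."""
    pair_counts = Counter(zip(actual, predicted))
    return [[pair_counts[(a, p)] for p in labels] for a in labels]

labels = ["cat", "dog", "bird"]
actual    = ["cat", "cat", "dog", "dog", "bird", "bird", "cat", "dog"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat",  "cat", "bird"]

matrix = multiclass_confusion(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>5}: {row}")

# Per-class recall: correct predictions for a class / all actual cases of that class
per_class_recall = {
    label: (row[i] / sum(row) if sum(row) else 0.0)
    for i, (label, row) in enumerate(zip(labels, matrix))
}
print(per_class_recall)
```

With three classes there are already nine cells to read, and each class has its own set of true and false positives, which is why multiclass results are often summarized one class at a time.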

Confusion matrices can also be misleading if you have class imbalances. For example, suppose you had a data set with 1,000 values and only three positive values. In that case, your classification system could reach 99.7 percent accuracy just by predicting everything as negative when, in reality, it cannot detect positives at all.
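The imbalance example can be reproduced in a few lines. This sketch builds a data set of 1,000 values with three positives and a hypothetical "model" that always predicts negative.

```python
# 3 positive cases (1) and 997 negative cases (0)
actual = [1] * 3 + [0] * 997
# A trivial model that predicts negative for everything
predicted = [0] * len(actual)

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

accuracy = (tp + tn) / len(actual)
recall = tp / (tp + fn)

print(f"Accuracy: {accuracy:.3f}")  # 0.997
print(f"Recall:   {recall:.3f}")    # 0.000
```

Accuracy looks excellent, yet recall shows that the model never finds a single positive case, which is why accuracy alone is a poor guide on imbalanced data.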

Professions that use confusion matrices

Many different careers you might choose to explore use confusion matrices when dealing with classification tasks. You might use classification tasks in your primary job role, such as designing fraud detection software, or you might use classification as a tool to help you perform more effectively, such as using an email spam filter. 

As a data scientist, you might use a confusion matrix to understand the accuracy and precision of your models. This type of data science applies to many fields you can specialize in. For example, as an environmental data scientist, you might use a confusion matrix when studying how accurately your classification model detected a genetic variant in your sample. As a data scientist, you can expect to earn between $112,000 and $194,000 per year, including base pay and additional benefits, as of January 2024 [1].

Keep learning on Coursera.

You can continue learning about machine learning and classification with exciting courses on Coursera offered by top universities and leading organizations. The Machine Learning Specialization offered by Stanford University is a great way to build your knowledge at a flexible pace over the course of two or more months. This beginner-level Specialization will guide you through concepts such as building machine learning models, training neural networks, and building recommender systems.

Article sources

  1. Glassdoor. “How much does a Data Scientist make?” https://www.glassdoor.com/Salaries/us-data-scientist-salary-SRCH_IL.0,2_IN1_KO3,17.htm. Accessed March 20, 2024.
