In this lesson, I'll talk about message authentication. By the end of the lesson, you should understand how message authentication works and how it is used for integrity. You'll be able to discuss what a hash function is and understand why some hash functions are weaker or stronger than others.
Let's go back to the CIA triad: confidentiality, integrity, and availability. Integrity, if you remember, means that data is accurate, complete, and unaltered. Integrity matters because we need to know when something has been altered. So, if I have a transaction from my bank, for example, I want to make sure that it is accurate. How can we verify that something is accurate? Can we call and check with the bank, or is there a different way to verify who sent that financial transaction and that it hasn't changed?
In the CIA triad, encryption protects against passive attacks; this is confidentiality. So, encryption maintains confidentiality. Let's talk about an example here: someone quietly listening in on a network connection. If it's an unencrypted connection, for example over HTTP, they can eavesdrop on that connection and read the communication that's going on. If we use HTTPS, which encrypts the traffic between the two ends, then eavesdropping doesn't reveal the message contents.
Message authentication deals with active attacks; this maintains integrity. So think about someone pretending to be something that they are not, which is also called spoofing. Message authentication verifies that the received message is authentic: the contents have not been altered, the identity of the sender has been validated, and we can verify that the timing and sequence of messages are correct. We can also use traditional encryption for message authentication, in which case the sender and receiver must both know the key. However, message authentication doesn't require encryption. We could use just the authentication portion. In message authentication, we really have three steps.
First, the sender and the receiver must share a secret key that they have exchanged ahead of time. Second, the sender generates the message authentication code, known as a MAC, via a MAC algorithm, and appends it to the message. Third, when that message is received, the receiver verifies the MAC by recalculating it with the same algorithm and key that the sender used and comparing the result with the MAC that arrived.
Let's discuss hash functions. A hash function produces a fingerprint that identifies the file or block of data that is run through it. To be considered a hash function for this purpose, the function must meet a few requirements. Number one, we can apply it to a block of data of any size. Number two, it produces a fixed-length output that represents that block of data. Number three, it needs to be easy to compute for both the sender and the receiver. Number four, it cannot be reversed to find the original block of data. Lastly, it has to be collision resistant. A few algorithms have had collisions found in the past, so we need to take that into account when choosing an appropriate hash function to represent our block of data.
There are a few real-world examples to help you understand how hash functions are used. Let's say a user wants to send a signed message so that the receiver knows the sender is authentic. This can be done a number of different ways, but hashing is part of how it's done. Password checking is another real-world use of hash functions. Anytime you log in to a computer, a hash function is involved; let's talk about Windows specifically for a second. Windows uses a hash function every time you log in. Windows does not actually store your password. It stores a hash of the password, which is a representation of that password.
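To make the hash function requirements and the password check a bit more concrete, here is a minimal Python sketch. It is not how Windows does it. The function names are hypothetical, and the use of a random salt with PBKDF2 is an assumption on my part that the lesson doesn't cover; it is simply a reasonable way to store a password hash using Python's standard library.

```python
import hashlib
import hmac
import os
from typing import Tuple


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Any input size maps to a fixed-length output (64 hex characters = 256 bits).
short_digest = fingerprint(b"hi")
long_digest = fingerprint(b"x" * 1_000_000)


def store_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, derived_hash). The password itself is never stored."""
    # Assumption: a random per-user salt and PBKDF2, neither of which
    # is described in the lesson.
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, derived


def check_password(attempt: str, salt: bytes, stored: bytes) -> bool:
    """Hash the login attempt the same way and compare it to the stored hash."""
    derived = hashlib.pbkdf2_hmac("sha256", attempt.encode(), salt, 100_000)
    return hmac.compare_digest(derived, stored)
```

At login, the system would call `check_password` with whatever the user typed. Only the salt and the derived hash ever sit on disk.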
If we stored passwords themselves inside an operating system and a vulnerability were discovered in that operating system, an attacker might get those passwords. That's why we store the hash of the password instead. If an attacker steals the hash and types it in as the password, it won't work, because whatever is entered gets put through the hashing algorithm again, and the result no longer matches the stored hash. We also use hashes in intrusion detection and antivirus signatures.
Common algorithms and techniques. MD5 is one of the simplest algorithms in existence. It produces a 128-bit hash and can have collisions, so we don't use it for highly sensitive signatures. MD5 hashes are typically used to verify that software has been downloaded correctly. So if you see an MD5 hash next to a download, usually accompanied by a SHA-1 or other SHA hash, that is a representation of the file that you just downloaded. If we run the downloaded file through MD5 ourselves, we should see the same bit sequence.
SHA was originally developed back in 1993 by NIST and the NSA. It produces a 160-bit hash, but it can also have collisions. Now, I didn't explain collisions when we talked about MD5, so here is what that means. Let's take a file, say an executable, and run it through MD5. Then we take something completely different and run it through the same MD5 or SHA algorithm. There is a collision if both of those hashes match, and we don't want that. We want every distinct input to produce its own distinct hash.
SHA-1 was a 1995 revision of the original SHA that addressed some of its inconsistencies and vulnerabilities. However, that one has also been shown to have collisions, so it has been deprecated as well. SHA-256 and SHA-512 were both published in 2001, and they produce a 256-bit and a 512-bit hash respectively. SHA-256 processes its input in 512-bit blocks, and SHA-512 in 1024-bit blocks.
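If you want to see these sizes for yourself, here is a small Python sketch using the standard `hashlib` module. The helper names (`describe`, `file_digest`, `verify_download`) are hypothetical, made up for this illustration. It reports the digest size and internal block size of each algorithm mentioned, and shows the download check as a simple comparison against a published hash.

```python
import hashlib

# Algorithms named in the lesson, using their hashlib names.
ALGORITHMS = ["md5", "sha1", "sha256", "sha512"]


def describe(name: str) -> dict:
    """Report digest size and internal block size, both in bits."""
    h = hashlib.new(name)
    return {"digest_bits": h.digest_size * 8, "block_bits": h.block_size * 8}


def file_digest(data: bytes, name: str = "sha256") -> str:
    """Hash the given bytes and return the digest as a hex string."""
    return hashlib.new(name, data).hexdigest()


def verify_download(data: bytes, published_hex: str, name: str = "sha256") -> bool:
    """Recompute the hash of what we downloaded and compare it to the published one."""
    return file_digest(data, name) == published_hex.lower()


if __name__ == "__main__":
    for name in ALGORITHMS:
        print(name, describe(name))
```

In practice you would read the downloaded file's bytes from disk and pass them to `verify_download` together with the hash shown on the download page.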
Each one of these algorithms, as we're talking about them, goes through a certain number of cycles, usually called rounds. So we reduce a very large document or file to a very small value: that 160-bit, 256-bit, or 512-bit hash.
Okay, HMAC is another technique that we have. It is a message authentication code built on top of a hash function. This is a little bit different from the plain hashes that we've talked about before, because a secret key shared by the sender and receiver is mixed into the hash calculation. The HMAC value is sent along with the message, and since someone without the key can't recompute it, the message can't be altered without the receiver noticing.
One of the biggest places we see hash functions and message authentication is X.509, and these are certificates. This is the certificate on web servers. So if we connect to Google, for example, and go to https://google.com, the X.509 certificate names the algorithms in use and tells us how the information is going to be exchanged and how secure that connection is.
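To tie the HMAC idea back to the message authentication steps from earlier in the lesson, here is a minimal Python sketch using the standard `hmac` module. The key, the message, and the function names are all hypothetical, and I'm assuming SHA-256 as the underlying hash. The sender computes a tag over the message with the shared key, and the receiver recomputes it and compares.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical values for illustration only.
shared_key = b"hypothetical-shared-secret"
message = b"transfer 100 to account 42"
tag = make_tag(shared_key, message)
```

If an attacker changes the message in transit, `verify` returns `False`, because they can't produce a matching tag without the shared key.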