Welcome to Lecture 19 of Networks, Friends, Money and Bytes. We're getting towards the end of the course, and I have two more lectures left. In this one we're going to talk about something very practical, something we have not been able to talk about much in technological networks, and that is the overhead associated with managing the Internet or wireless networks. The motivating question is: why am I only getting, say, 3 or 4% of the cellular speed that I sometimes read about in advertisements and commercials? Here's an experiment that you can try right now if you have any 3G or 4G access. Suppose you try to download a large e-mail attachment, for example a PowerPoint file of, say, ten megabytes. You can try it on your 3G data network and use a stopwatch to clock how long it takes. Then you take ten mega times eight, which is the number of bits, and divide it by the number of seconds that it took, and that gives you the speed of downloading this PowerPoint. And then you divide that number by the advertised 3G speed. I don't know what number you are quoted in the commercials in your country, but I'd say something around hundreds of megabits per second for 4G, and at least in the tens of megabits per second range for 3G. I tried this the other day. I was actually standing very close to a tower, in a very high-tech region of the country, and I got only 3.7% of the advertised speed. So, the question is, who ate my 96%? I paid for it, so why am I getting only 3 or 4%? Before we answer this question, I just want to clarify the terminology a little bit. Back in Lecture 1, a long time ago at the very beginning of this course, we talked about the evolution of cellular network standards from the first generation to the second generation, and in most parts of the world today we're talking about 2.5G or 3G.
4G, though, is being rolled out as we speak, and will be over the next three to five years around the world. So, 2.5G technologies include those such as EDGE or EVDO. And 3G has two tracks. The first track is called UMTS or WCDMA, and it is standardized by an industry group called 3GPP. The other is called cdma2000, and that system is standardized by 3GPP2. And 4G is actually even more confusing terminology. To most people, 4G means something called Long Term Evolution, okay, LTE. And there's something called LTE Lite, LTE Advanced, and different releases of LTE. But there are also others who call advanced versions of 3G technologies, such as the so-called HSPA+, 4G as well. So, at this point the terminology is getting a little confusing, and the boundary between 3G and 4G is getting a little blurred, because different companies may choose to call different developments of the standards whatever they would like to call them. So, whichever of 3G, 3.5G, or 4G we are talking about, you will never get the best-case physical layer speed that is quoted. Okay, best-case physical layer speed. So, it's not that you're being cheated by your carrier, okay? It's just that the terminology they sometimes use in quoting the speed refers to the best-case physical layer speed, and what you actually experience is the useful throughput in your application. Now, there are two main root causes for the discrepancy between the two. Number one is non-ideal network conditions. And this can be associated with things such as the air interface, okay, that is, the radio link over the air, as well as with the wired network, which could be the backhaul, which could be the data center on the other end, like Google's data center, or it could be somewhere in between over the public IP network. We'll show a picture of that momentarily. The other is overhead, okay? Just like in our lives, overheads take up a big chunk. Sometimes the overhead may take up more than the actual payload. And overheads come in different forms.
One obvious one is the headers in front of each layer's packets, or frames, or segments. Part of the overhead comes from protocol semantics. We'll see a few examples, and then a few more in the advanced material. Part of it is due to the control plane, and we'll explain this in more detail soon. These are the signalling messages that control and manage the network. And part of it comes from the need to enable advanced technologies, okay? For example, as we mentioned in the advanced material part of the last lecture, there is something called multiple-input multiple-output, MIMO, and something called orthogonal frequency-division multiplexing, OFDM. These two are powerful physical layer innovations using signal processing and communication techniques for wireless. And they are used in both 4G LTE and WiFi 802.11n. And there are a number of overheads associated with enabling these kinds of physical layer technologies as well. So, non-ideal network conditions together with overheads. Now, you may wonder, how much can they take away from my best-case physical layer speed? Well, quite a lot. In fact, you should count yourself lucky if you can still get 3.7%. Often, these two things can add up to eat 99% of the best-case physical layer speed. So, when we talk about the speed, the speed that matters to you, we're really referring not to the best-case physical layer speed, but to the useful throughput that you observe at the application layer. In the physical layer, you can try a variety of techniques, for example MIMO and OFDM, to enhance what's called the spectral efficiency, that is, how many bits you can send per second over each hertz of the frequency spectrum. And that translates to some degree into the speed that we care about. But it has to go through many layers and many kinds of overheads and non-ideal network conditions. And in the end, by the time it gets to the application that you care about, it may have been reduced substantially.
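The stopwatch experiment from the start of the lecture boils down to two divisions, and a few lines of Python can sketch it. This is only an illustration: the ten-megabyte file size comes from the example, but the function names, the 160-second download time, and the 20 Mbps advertised speed are made-up assumptions, not measurements.

```python
def useful_throughput_bps(useful_bytes, seconds):
    """Useful throughput in bits per second: useful bits divided by elapsed time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return useful_bytes * 8 / seconds


def fraction_of_advertised(useful_bytes, seconds, advertised_bps):
    """Measured useful throughput as a fraction of the advertised speed."""
    return useful_throughput_bps(useful_bytes, seconds) / advertised_bps


if __name__ == "__main__":
    attachment_bytes = 10_000_000   # the ten-megabyte attachment from the example
    elapsed_seconds = 160           # hypothetical stopwatch reading
    advertised_bps = 20_000_000     # hypothetical advertised speed (20 Mbps)

    speed = useful_throughput_bps(attachment_bytes, elapsed_seconds)
    share = fraction_of_advertised(attachment_bytes, elapsed_seconds, advertised_bps)
    print(f"useful throughput: {speed / 1e6:.2f} Mbps")
    print(f"share of advertised speed: {share:.1%}")
```

With these made-up numbers the download runs at half a megabit per second, a small percentage of the advertised speed. Plug in your own stopwatch reading and your carrier's quoted number to repeat the experiment.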
Now, of course, when we say throughput, we mean the number of bits directly used, okay? Monitoring, management, and control overhead don't count in the useful throughput that we are defining right now. Okay. So, it is the number of bits that are directly used in your application, for example, the number of bits that are actually the YouTube video, divided by the time that it takes to obtain them. So, if this numerator becomes smaller, the useful throughput drops. Or if the time that it takes becomes longer, then the useful throughput drops. And we will see both kinds of factors at play. So, first of all, let's look at the air interface, okay? What kind of air interface issues do we face? There are quite a few. For example, just look at the propagation channel, okay? Assume there are no other users sharing the air with you. You are the only one. Still, you have to face path loss, that is, the signal strength drops as the distance of propagation increases. You have to face shadowing, that is, obstruction of the signal by different objects like buildings. And then there is multipath fading, which says that each signal will bounce off many different objects and is collected at the receiver from multiple paths, okay? So, here is a tower and you are here. Just because of the pure distance, there's path loss. Because of the building, there's shadowing. And then, because of different objects, buildings and trees, the signals actually bounce along different paths, okay? All three will degrade the signal quality that you can receive. So, especially if you are standing at the cell edge, away from the base station and blocked by many buildings, you're going to receive a lower rate. Then, of course, you are not the only one in the cellular network, okay? Unless you're in the middle of nowhere and in the middle of the night, and you just so happen to still have access to a cell, okay.
Most of the time, there are other users, and they cause interference as a result. Interference will reduce the received signal-to-interference ratio, the SIR, and as we saw, you can do power control to help with that. But still, it is not perfect. In the end, when the SIR drops below a certain threshold, you have to lower the bitrate. Because the channel is so poor and the interference is so strong, you have to talk slower in order for the intended recipient to correctly decode or understand you. So that's just the air interface, and then there's also the backhaul. We'll see very soon in a slide that the backhaul consists of quite a few components as well, okay? Another way to look at the backhaul is to say that it consists of, obviously, links and nodes, okay? And the links can introduce non-ideal network conditions. For example, congestion, okay? It happens on the links, and the resulting queueing delay reduces throughput. Longer delay will increase the denominator in the expression for useful throughput, and therefore reduces the useful throughput. There's also propagation delay. You may think this can't hurt too much. Well, we'll see in an example towards the end of this lecture that it can also add up to reduce your throughput. And now, of course, there are nodes, okay? There are all kinds of nodes, for example, switches and routers. They can introduce additional delay in the queues at their router interfaces, for example. And then there are servers. And these servers have their own processing power limitation, okay? Especially if a server is not sized properly, then when there are many people demanding that the same server respond, the server simply cannot respond fast enough because of this computational power limitation. This happens a lot with a so-called flash crowd at certain popular website servers. It has really nothing to do with the network per se, but nonetheless it will reduce the useful throughput.
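To make the point about the denominator concrete, here is a rough Python sketch. It is a simplification under stated assumptions: all the extra delays (queueing, propagation, slow servers) are lumped into a single number, the link rate is treated as constant, and the function name and the figures are made up for illustration.

```python
def throughput_with_delay(useful_bits, link_rate_bps, extra_delay_s):
    """Useful bits divided by (time spent on the link + lumped extra delay)."""
    if link_rate_bps <= 0 or extra_delay_s < 0:
        raise ValueError("link rate must be positive and delay non-negative")
    transmission_time_s = useful_bits / link_rate_bps
    return useful_bits / (transmission_time_s + extra_delay_s)


if __name__ == "__main__":
    bits = 80_000_000        # ten megabytes of useful data
    rate = 10_000_000        # hypothetical 10 Mbps link
    for delay in (0, 2, 8):  # hypothetical lumped delays in seconds
        mbps = throughput_with_delay(bits, rate, delay) / 1e6
        print(f"extra delay {delay:>2} s -> useful throughput {mbps:.2f} Mbps")
```

The numerator never changes in this sketch. Only the time grows, and that alone is enough to pull the useful throughput down.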
So, in summary, there are non-ideal conditions in the air interface and in the backhaul. As you can see, there are many places where things can go wrong. And then there is also overhead. For example, the overhead associated with protocols. The semantics of protocols actually require a certain sequence of exchanges of control signals. And the sequence of message passing takes time, and therefore adds to the time that it takes to finish getting the useful bits. Okay. Again, increasing the denominator. And sometimes they also take an extra number of bits, and that further reduces the throughput. Let's take a look at a very simple example in TCP. Recall that TCP is a connection-oriented transport layer, or layer 4, protocol. It's the dominant one for connection-oriented transport service. Now, when we establish an end-to-end connection through TCP, we actually go through a handshake procedure. For example, one end host, A, says, connect. So, this is a control packet, okay? It goes through a certain distance, horizontally represented, and it takes some time to travel through that distance, where time is vertically represented in this space-time diagram. And then the other end, B, will say, alright, I got it, and send an acknowledgment back. So, that's a two-way handshake, but TCP actually requires a three-way handshake. Then A sends back to B: I acknowledge your acknowledgment. I hear that you hear me. Why would you want to do that? Well, in this case, not only does A know that B is ready to connect, B also knows that A knows that B is ready to connect. Now, of course, if you want to be really reliable, you can send yet another acknowledgment back. And this can keep on going. But they say, alright, a three-way handshake is a good compromise. So, this three-way handshake doesn't carry any useful bits. It's purely setting up the end-to-end connection. What about at the end of the session? You have to tear down the session. So, A may say, "I have nothing more to say to you."
Okay, finish. And then B acknowledges that and says, "I know that you are done." Then B sends its own finish signal that says, "I'm done, too." And then A says, "Alright, I acknowledge that." This is called a four-way session tear-down. You may wonder, why do I have to do this four-way? First of all, can it just be one-way? A says, I'm done, and then you just close it. Well, what if B does not receive this finish control packet, right? You want to know that B knows that you are done. Then, what about the second pair? Well, that's because the fact that A has nothing more to say does not mean B has nothing more to say. This is a duplex session, meaning that A can talk to B and B can talk to A, or in other words, it is a bidirectional link. So, A being done doesn't mean B is done. You have to hear that explicitly from B. And therefore, another two-way handshake. So, altogether, you've got a four-way tear-down and a three-way session establishment. And if this session is a short session, say, I only have two packets to send to you, you still have to go through this set of overheads. Now, later in the advanced material, we'll see quite a few more overheads, especially on mobility support as well as on a local area network. Getting the correct translation between MAC and IP addresses, and getting the correct URL on the web, both carry a lot of overhead. And then, in a [UNKNOWN] problem, we also look at an open-ended question on the security overhead associated with ensuring the confidentiality and integrity of the data. And then, obviously, there are also headers. It's kind of boring to show you all the details of the different fields in the header. It is essential if you want to become a computer engineer in charge of administering a network, but that's not the goal of this course.
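For the short-session case just mentioned, you can count the packets. The Python sketch below assumes the three-way establishment and four-way tear-down exactly as described, and it deliberately ignores everything else (acknowledgments of data packets, headers, retransmissions, and the fact that real implementations may combine some control signals). The constant and function names are made up for illustration.

```python
HANDSHAKE_PACKETS = 3  # connect, acknowledgment, acknowledgment of the acknowledgment
TEARDOWN_PACKETS = 4   # a finish plus an acknowledgment in each direction


def control_fraction(data_packets):
    """Fraction of all packets in a session that carry no useful data."""
    if data_packets < 0:
        raise ValueError("data_packets must be non-negative")
    control = HANDSHAKE_PACKETS + TEARDOWN_PACKETS
    return control / (control + data_packets)


if __name__ == "__main__":
    for n in (2, 20, 1000):  # hypothetical session lengths, in data packets
        print(f"{n:>4} data packets -> {control_fraction(n):.1%} of packets are control")
```

In the two-packet session, seven of the nine packets are pure control. In a long session the same seven packets are spread over many data packets, so the fixed setup and tear-down cost matters much less.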
Suffice it to say that there are many essential fields in the header. For example, if you're talking about layer 3, in the IP packet, the header includes things such as the source IP address, the destination IP address, and the version of IP, v4 or v6. All these are essential fields, and they carry anywhere from a couple of bits of overhead to, say, 32 bits of overhead. So, you've got tens of bytes of overhead just on the header for one layer. And then you have to go through all the other layers and their own headers, okay? And then there are optional fields, okay? Fields that are actually often just left empty. But it is important to have a consistent format just in case they're useful in the future. So, they also add to the overhead in the header. And sometimes a protocol demands that a packet be fragmented. Meaning, you've got too long a packet for a variety of reasons, for example, if a long packet is lost, then you have to retransmit a lot. So, they say, well, instead of adding just one header for this long packet, I'm going to break it into, say, three packets. Now, each one would need a header. So, packet fragmentation increases the proportion of headers relative to the actual payload, the useful information in the packet. And beyond protocol semantics and headers, there is also the control plane signalling. What is the control plane? We've been mostly talking about the so-called data plane. The visualization is that there's a plane with a lot of pipes and channels, okay, end-to-end, for example, and they transmit the actual data. And then, beneath that, okay, vertically, there's another plane called the control plane. There are also a lot of bits flying around there, okay? Hop by hop, end-to-end. But they are not carrying the actual payload at all, okay? It's not just adding a header to the payload; there's no payload at all. What you are sending are the control signals. And so, here are two analogies.
One analogy is Netflix. I think we briefly mentioned this in an earlier lecture. When you go online, okay, you put some DVDs in your queue. That is the control plane, okay? And it's done over the Internet. Then, when you actually get the DVD, that is the data plane. And that goes through, say, the US Postal Service. It's a completely different network. Another analogy is airport flight control, okay? The controller will be calling different people, for example, calling the pilot, and that is the control plane. It's done through some kind of telephone system. And the controller may say, please take off, please land. And the actual payload, which is, say, cargo or human beings, is what flies through the data plane. And the data plane is actually carried over the air by an airplane. So, clearly, the telephone network and the airplane network are very different. Similarly, in communication networks there's a control plane and there's a data plane. And the control plane has many different kinds of channels. If you care to read, say, the 3GPP standardization specs, you will see three layers in the taxonomy of channels, okay? Not the protocol layers that we've been talking about. And you will see that there are tons, okay, more than 30 different kinds of channels. Quite a few of them are actually control channels, for example, carrying the control message that says what transmit power should be used. It's actually very tricky to size these control channels. If they're sized too big, then you are wasting your pipe, right? You are devoting a big chunk of your pipe to sending these control signals, and it turns out you don't need that much. But if they are sized too small, then it would take too long to finish transmitting these control signals. And that will add delay in your throughput formula, and that's not good either. So, you face a very tricky balance between too big and too small control channels. Now, what do these control plane channels do?
They actually have many important functionalities, usually classified into five groups. One is performance monitoring, okay? Just to check what throughput you're getting, for example. One is configuration. Again, in the advanced material, we'll talk about mobility support, and we will talk about address configuration, okay? Then there's also billing and charging. A lot of what we talked about in Lectures 11 and 12, smart data pricing and usage pricing, is traditionally done inside the network, with particular servers and software running the billing and charging services, okay? In fact, there is something called the CDR, which is a record of the billing and charging, and it costs an ISP a tremendous amount of money to maintain and operate. And then there is also billing and charging that does not face the consumers, the customers, but goes across and between the ISPs. Then there is also fault management, okay? To make sure that if something goes wrong, you can quickly detect, contain, and then repair it. For example, if one IP router breaks down, you have to be able to find an alternative path to go around it, and that requires monitoring and management. And then, of course, security and privacy, things such as authentication and logins, all of that. So, as you can see, the functionalities involved here are actually very important. Without these, you can't just have the data plane. So now, let's get back to our question and try to answer it. The question again is: what is the speed, the useful throughput at the application layer, that I can expect? And the answer actually depends on four different factors. So, in other words, you've got one question, but four kinds of answers.