[MUSIC] With the consolidation of compute and storage in massive data center facilities, we also need to talk about how to provide connectivity over the internet to these facilities. In this lesson we'll discuss end user connectivity to these facilities over the internet. This topic sits at the edge of our course's material, so we've only scratched the surface here, focusing more on the problem than on the solutions. But it's important enough to deserve an entire course on its own.

Different internet applications place different requirements on the network. Applications like file sharing and video streaming are not particularly latency-sensitive, but benefit from high bandwidth. For web browsing, the benefits of increasing bandwidth diminish after hitting a few megabits per second, depending on the particular webpage in question, and latency quickly becomes the bottleneck. For several other applications, like multiplayer gaming, remote surgery, remote virtual reality, and online music collaboration, quality depends on extremely low latency, but also high bandwidth.

We often hear about the majority of internet traffic being video and how that dominance is only growing. Cisco estimates that 80% of all consumer internet traffic will be video in 2019. Now, given that video streaming can tolerate several seconds of delay, is it all about the bandwidth then? 80% of traffic, that's most of it. Does it all just come down to providing high bandwidth connectivity? Not quite. For one, 80% of traffic is not the same as 80% of value. Value across different applications is really hard to quantify, but there are certainly a lot of high-value internet businesses whose value does not stem from video streaming. And a lot of these depend heavily on low latency.

How much do they depend on low latency? One experiment for putting a value on latency comes from Google. The experiment entailed adding artificial delays on the server side to Google web search results for a certain set of selected clients. These were compared against a control group in terms of the volume of searches they made on the Google search engine. The results from that experiment are shown here in this plot. On the x-axis is time, week by week, from the start of the experiment. On the y-axis is the reduction in searches for the test group compared to the control group of users for whom no delay was added. The two trend lines here, in red and black, are different delay values added to the search results for these users: 200 milliseconds for some users and 400 milliseconds for others. With 400 milliseconds of delay added, by week 6 users see a drop of about 0.74% in their search volume. This directly translates to revenue for Google, and for a company Google's size, this is a significant chunk of value. More importantly, it also illustrates how user experience depends crucially on low latencies. So, just 400 milliseconds translated to a measurable drop in user search traffic.

To drive home further the value of latency, and to draw its contrast with bandwidth, let's do a somewhat extreme but fun exercise. The distance from New York to San Francisco is roughly 4,000 km one way. At a carrier pigeon's breakneck speed, well, for the pigeon's sake, let's say it's not breakneck. At a carrier pigeon's speed of 80 km/h, it would take 50 hours to cover this distance. Let's strap today's highest capacity USB thumb drive, 1 TB, to our pigeon. This translates to a bandwidth of 44 Mbps. That's pretty good. That's actually better than the average US consumer bandwidth. In fact, several times better than that. But the round trip latency is about 100 hours. So, clearly this is not acceptable for most internet applications. Bandwidth is great, but we also need low latency. It's worth pausing here to consider for a moment that just about a century back, pigeons were among the fastest means of communication. We have come a long way since.
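If you'd like to sanity-check the pigeon arithmetic, here is a minimal back-of-the-envelope calculation in Python, using the round numbers assumed in the exercise above (4,000 km, 80 km/h, a 1 TB drive):

```python
# Back-of-the-envelope "pigeon sneakernet" numbers from the exercise above.
distance_km = 4000          # New York to San Francisco, one way (approx.)
pigeon_speed_kmph = 80      # assumed carrier pigeon speed
payload_bits = 1e12 * 8     # 1 TB thumb drive, in bits

one_way_hours = distance_km / pigeon_speed_kmph              # 50 hours
bandwidth_mbps = payload_bits / (one_way_hours * 3600) / 1e6

print(f"One-way 'latency': {one_way_hours:.0f} h "
      f"(round trip ~{2 * one_way_hours:.0f} h)")
print(f"Effective bandwidth: {bandwidth_mbps:.0f} Mbps")
# -> roughly 44 Mbps of bandwidth, but a ~100 hour round trip.
```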
So, bandwidth does matter. But even for applications like Ultra HD video, we'll only need something like 15 Mbps. Also, ISPs are already competing heavily on bandwidth. Latency, meanwhile, also matters for many applications, and it is quickly on the way to being the bottleneck for these, including web browsing. And latency is typically a harder problem to solve. This has been noted in a variety of contexts. The key point is that for bandwidth, let's say you have technology A, and you're getting bandwidth X. You can just replicate technology A, deploy a parallel infrastructure of the same technology, and you'll get 2X bandwidth. The same is not true for latency: you need genuinely new techniques to get lower latency.

Latency is also fundamentally bounded by the speed of light. A New York to Delhi round trip is going to take at least 80 milliseconds, no matter how good we make our networking technology. Now, one way of getting around this problem is to use content distribution networks. Content distribution networks work by moving content closer to users, thereby sidestepping some of the issue of distance. This helps in many scenarios, and another lesson in this course discusses the solution in detail. But not every service is hosted on a CDN. And for things like multiplayer gaming, where users may be separated geographically, a CDN will not help very much. So, we need the network itself to provide low latency. This is quite complementary to CDNs; CDNs themselves will also benefit from a low latency internet.

So, let's look at a toy experiment to see what latencies on the internet today look like. I used the Linux wget command to fetch this popular website, just the index.html of the landing page. So, it's a very small amount of data, about 20 KB being fetched here. This took 430 ms. I've obscured the URL so you all don't hammer their servers. Next, I pinged the server hosting this website, which took 55 ms. I also geolocated the server to Portland in the US, making for a round trip of less than 6,000 km from my location in Champaign. That round trip would take merely 19 ms at the speed of light in vacuum, c. So in this case, the internet is 23 times slower than it could be. And clearly, for 20 KB, you are not bandwidth constrained. It's worth noting that the speed of light in fiber is actually closer to two thirds of c than to c; that's just a physical property of glass as a medium. But that's not fundamental. In fact, today there are niche networks operating in certain markets that do work at nearly c. So, we really should be using the c-latency as the baseline for comparing how well today's internet is doing.
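If you want to repeat this toy experiment yourself, here's a minimal sketch in Python. It times a small HTTP fetch and compares it against the round-trip time that light in vacuum would need over the great-circle distance to the server. The URL and the two coordinate pairs are placeholders you would fill in for your own client and (geolocated) server locations, and it assumes the common third-party `requests` package is installed:

```python
import math
import time
import requests  # third-party HTTP library: pip install requests

# Placeholders: substitute the page you want to fetch and the (lat, lon)
# of your client and the geolocated server.
URL = "http://example.com/index.html"
CLIENT = (40.11, -88.24)   # e.g., Champaign, IL
SERVER = (45.52, -122.68)  # e.g., Portland, OR

def great_circle_km(a, b):
    """Haversine distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

C_KM_PER_MS = 299_792.458 / 1000  # speed of light in vacuum, km per millisecond

start = time.perf_counter()
requests.get(URL, timeout=10)                      # fetch just this one page
fetch_ms = (time.perf_counter() - start) * 1000

c_rtt_ms = 2 * great_circle_km(CLIENT, SERVER) / C_KM_PER_MS
print(f"Fetch took {fetch_ms:.0f} ms; c-latency RTT is {c_rtt_ms:.1f} ms")
print(f"Inflation over the speed of light: {fetch_ms / c_rtt_ms:.1f}x")
```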
Now, let's broaden this experiment a little bit. Here, I'll show you results from nearly 3 million fetches of popular web pages from clients distributed across the globe. In each case, we are fetching only the HTML of the landing page, so again, usually just tens of kilobytes of data. On the x-axis of this plot is the latency inflation compared to the speed-of-light latency to each web server's location; that's the 23x number in the previous example. On the y-axis is the CDF across the web fetches.

Across these nearly 3 million web fetches, the median inflation in latency comes to about 35 times. That is, the internet is about 35 times slower than a hypothetical speed-of-light internet. Note the logarithmic x-axis here. In a significant fraction of the cases, the inflation is over 100 times. Now, this is a large slowdown. Where does this latency come from?

Let me tell you a bit more about the experimental setup used to gather these measurements, so we can find out where this latency comes from. I used 186 clients located at universities across the globe, part of a global test bed called PlanetLab. From each client I fetched thousands of webpages using the cURL command, and again, we are only fetching the index.html of the landing page. From cURL, I also gathered data about the time to resolve the DNS name, the time to establish the TCP connection, and the time for TCP to transfer the data for that page. Let's look at how latency breaks down across these different components. We'll focus here on the median numbers, although the tails are also interesting.

In the median, DNS resolution took 7.4 times longer than the c-latency. Note that for some fraction of pages, around 10%, DNS was faster than the c-latency. This is simply because the DNS server may be located closer than the actual web server, and we're using the latter as the baseline. The TCP handshake, considering only the time between the client sending the SYN and receiving the SYN-ACK back, is 3.4 times inflated in the median. This can also be thought of as an estimate for the round trip time from the client to the server. The time between sending the ACK, together with the client's GET request, and receiving the first byte from the server, is labeled here as the request-response time, and is 6.6 times inflated in the median. This time includes the server's processing time as well. Using the TCP handshake as an estimate of the RTT, this means that the server's processing time is quite similar to the RTT, which is about a tenth of the total fetch time in the median. For these fetches, the server's processing time is thus only a small fraction of the total page fetch time. Lastly, TCP, using its multiple round trips, moves the data to the client, and this transfer is 10.2 times inflated in the median. Note that the medians of these latency numbers will not add up; that is to be expected. Because all of these are heavy-tailed distributions, we cannot expect the medians to add up. The median total time will be larger than the sum of the medians of these individual components.
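To see a breakdown like this for a single fetch yourself, you can ask cURL to report its timing milestones. Here is a small sketch that shells out to the curl command line tool (assumed to be installed) and prints per-component times; the URL is a placeholder, and this is not the exact PlanetLab tooling used for the study:

```python
import subprocess

URL = "http://example.com/"  # placeholder; substitute the page you want to measure

# curl's --write-out variables report when each milestone was reached,
# in seconds from the start of the transfer.
fmt = "%{time_namelookup} %{time_connect} %{time_starttransfer} %{time_total}"
out = subprocess.run(
    ["curl", "-s", "-o", "/dev/null", "-w", fmt, URL],
    capture_output=True, text=True, check=True,
).stdout

dns, connect, first_byte, total = (float(x) * 1000 for x in out.split())
print(f"DNS resolution:        {dns:7.1f} ms")
print(f"TCP handshake:         {connect - dns:7.1f} ms")
print(f"Request-response time: {first_byte - connect:7.1f} ms")
print(f"TCP data transfer:     {total - first_byte:7.1f} ms")
print(f"Total fetch time:      {total:7.1f} ms")
```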
Given the value of the problem, a lot of approaches are being explored to reduce latency on the internet, both in academia and in industry. One such project is Google's QUIC, which aims to get rid of the TCP handshake overhead, as well as the TLS setup overhead, among other things. Now, how do we get rid of the TCP handshake overhead? To do so, we first need to understand why we use the handshake at all. What's its purpose? Then, perhaps, we can achieve the same function in some other way. The handshake helps prevent denial of service attacks of a certain kind. Before the handshake is complete, the server does not expend many resources on the client's request. It requires the client to receive its SYN-ACK and to respond with an ACK before it will process the GET request. So, the TCP handshake prevents attacks based on address spoofing. If a malicious client spoofs an IP address and sends a server a request, the server will not spend resources on that request before requiring the client to acknowledge its SYN-ACK, as the TCP protocol does. The client has to be able to receive the server's message and send a reply back together with the GET request. This ensures that spoofing the address will not succeed.

So now, how do we get rid of the handshake without breaking this useful property? QUIC's way of getting rid of the TCP handshake involves the use of a cookie. The first time a client connects to a server, the interaction requires a handshake, much as in TCP. But in this interaction, the server also gives the client a cryptographic cookie certifying that the client does, in fact, control this address. In future connections to the server, the client can present this cookie and the server will process its requests. So, it can just send this cookie together with its GET request, and the server will process that request and send back data. QUIC is a work in progress, but Google is already testing it live on users. In fact, chances are, if you're using the Chrome browser, you have already used QUIC.
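To make the cookie idea concrete, here is a toy sketch in Python of the general mechanism: an address-bound token that lets a returning client skip the extra handshake round trip. This is only a conceptual illustration, not QUIC's actual wire format or cryptography; the cookie here is just an HMAC over the client's IP address under a hypothetical server-side key:

```python
import hmac
import hashlib
import os

# Toy illustration of handshake-skipping with an address-bound cookie.
# NOT QUIC's real protocol; it only sketches the idea of proving address
# ownership without paying for a fresh handshake round trip.

SERVER_KEY = os.urandom(32)  # secret known only to the server

def issue_cookie(client_ip: str) -> bytes:
    """During the first (full-handshake) connection, the server hands the
    client a cookie bound to the address that handshake verified."""
    return hmac.new(SERVER_KEY, client_ip.encode(), hashlib.sha256).digest()

def accept_request(client_ip: str, cookie: bytes, request: str) -> str:
    """On a later connection, the server processes the request immediately
    if the cookie proves this address was verified before."""
    expected = hmac.new(SERVER_KEY, client_ip.encode(), hashlib.sha256).digest()
    if hmac.compare_digest(cookie, expected):
        return f"200 OK: serving {request}"   # 0-RTT: no new handshake needed
    return "retry: full handshake required"   # unknown or spoofed address

# First visit: handshake happens, cookie is issued.
cookie = issue_cookie("203.0.113.7")
# Later visit: the GET request and cookie are sent together, saving a round trip.
print(accept_request("203.0.113.7", cookie, "GET /index.html"))
# A spoofer who never completed a handshake holds no valid cookie:
print(accept_request("198.51.100.9", cookie, "GET /index.html"))
```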
There is, of course, lots of other work on improving protocols for lower latency on the internet. But let's look back at the breakdown of latency on the internet across these different protocol factors. Again, let me remind you that the medians here will not add up. It is tempting to conclude from this breakdown that the TCP transfer time and DNS resolution are the biggest problems. But it is important to remember that latency at the lowest layers affects all of these protocols; here, we're talking about the network's round trip itself. What would the effect be if all the lower-layer latencies on the internet were at the speed of light? To get one estimate, we can transform all these numbers by dividing them by 3.4, which is the inflation in the handshake here. Doing so gives us numbers that are significantly smaller. In fact, the total time here is smaller than what we would have achieved by making any single one of the other protocol factors optimal. For example, if the TCP transfer time were brought down to 1x, you would not have reduced the total time by as much as in this scenario. So really, we have to be careful when assigning blame for high latency on the internet. It's both the protocols, as well as the physical infrastructure and routing.

Now, why are the lower-layer latencies on the internet this large? For one, the internet is not using shortest-path routing. The internet's routing protocols depend on policies between independent players; these are economic considerations, not just shortest-path routing. Further, even if the internet were using shortest-path routing, the shortest path between pairs of routers is not along the shortest path on the Earth's surface. The fiber does not necessarily run along the shortest possible path, because of geographical constraints: lakes, oceans, rights of way, and so on. An additional factor is that the speed of light in fiber is two thirds the speed of light in vacuum. All of these factors together mean that the lower-layer latencies are significantly larger than what you would see on a hypothetical speed-of-light internet.

One caveat about these measurements: I collected them from nodes located at universities across the globe, and these are usually better connected than end user locations. This means that these measurements are not necessarily representative of typical end user performance. One way you can help us correct this is by running the measurement tools that are packaged with the programming assignment setup we've given you. The data you collect will be shared with the class, and it'll make an interesting latency atlas of the internet for all of us to see.

The summary here is that both the internet's protocols and its physical infrastructure are responsible for high latencies on the internet today. Latency is of high value, and progress is being made on all fronts to tackle these problems. Some of the work on the subject is linked on the course web page. [MUSIC]