As we've discussed earlier, the cloud computing model depends not only on big data centers, but also on Internet connectivity between these data centers, and between the end users and the data centers. In this lesson we'll focus on one aspect of the latter: making cloud services more reliable and faster for end users using Content Distribution Networks, that is, CDNs.

Let's say you're running a cloud service out of just one data center. This may be good enough for a small population of users located nearby. But what if clients all over the globe were to access this service? This approach provides no opportunity for load balancing or for tolerating failures at the site. Further, the faraway clients experience very high latencies. If a single round trip takes 300 milliseconds, a rich service making several of these round trips will provide a very poor user experience. And users care about low latencies. Google, for example, has noted that an additional delay of 400 milliseconds in search page load time translates to fewer searches by users: 0.4 seconds translates to 0.74% fewer searches. Now, that number might seem small, but for a company the size of Google, that is a sizable chunk of revenue. It also reflects that users, consciously or subconsciously, care about small latencies. Besides the latency aspect, providing service from one site also means moving huge volumes of data across the globe, particularly for applications like streaming movies. This can create huge demand for long-haul bandwidth, which is expensive.

But doesn't caching solve these problems? Let's review how caching works. A client makes a service request. The response may be cached at a cache middlebox that the ISP deploys. If another client requests the same data and it's present in the cache, this client can simply get the data from the cache. There are several reasons this approach does not go far enough. First, the volume and diversity of content fetched, together with the long-tail distribution of content popularity, make it difficult to benefit from caching beyond a certain point. But even more crucially, significant amounts of content really are dynamic, either changing over time, like the news, or personalized based on user interests. The increasing trend toward encryption also reduces the usefulness of caches: data encrypted with session keys specific to individual users cannot be cached.

CDNs address these problems, and as a result are on track to carry a majority of the Internet's traffic soon. Cisco, for example, estimates that 62% of all traffic over the Internet will cross CDNs by 2019, up from 39% in 2014. So in this lesson we'll examine how this important piece of Internet infrastructure works.

How do CDNs work? I'll first provide a brief overview of the pieces that comprise this solution, and then delve into the details. First, the content servers are spread across several locations. These servers actually run the service, which means they can serve dynamic data over encrypted connections, unlike a cache, which is limited to serving static content. Second, the content or service needs to be replicated at these locations. The data needs to be kept fresh, and the service provider may want to gather analytics. Further, client interactions may require getting data from other CDN nodes or from the origin server, as is often the case. All of this requires high-capacity networking between the various CDN nodes and the origin.
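To see why a plain cache falls short of the content servers just described, here is a minimal sketch of the ISP cache-middlebox lookup reviewed earlier, written in Python. It is an illustration, not any particular ISP's implementation; the URL-keyed dictionary, the TTL value, and the `fetch_from_origin` helper are all hypothetical.

```python
import time

CACHE = {}          # url -> (response_body, expiry_time); a shared, URL-keyed store
TTL_SECONDS = 300   # hypothetical freshness window for cached objects

def fetch_from_origin(url):
    # Placeholder for the real fetch over the long-haul path to the origin.
    return "<response for %s>" % url

def handle_request(url, is_encrypted=False, is_personalized=False):
    # Per-user encrypted or personalized responses cannot be served from a shared cache.
    if is_encrypted or is_personalized:
        return fetch_from_origin(url)

    entry = CACHE.get(url)
    if entry is not None and entry[1] > time.time():
        return entry[0]                               # cache hit: served locally by the ISP

    body = fetch_from_origin(url)                     # cache miss: full trip to the origin
    CACHE[url] = (body, time.time() + TTL_SECONDS)
    return body
```

Only repeat requests for identical, static, unencrypted objects take the hit path; with a long-tailed popularity mix, and with dynamic, personalized, or encrypted responses, most requests still go all the way to the origin.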
For this inter-node networking, one may simply use standard Internet transit through an ISP, or private connectivity over fiber that one leases or owns. Some of what we discussed in our WAN connectivity lesson applies here, but in this lesson the detail I provide will be more focused on latency.

Lastly, once a service is replicated, how do clients learn where to get the data from? Some mechanism is needed to direct clients to an appropriate server or CDN site, based on metrics like latency, to improve client performance, or perhaps load at different locations, to balance load across the infrastructure. It's worth mentioning that while large providers like Google might operate their own CDN infrastructure, most web services will depend on a CDN provider like Akamai, Limelight, or Level 3.

Let's look at each of the three aspects we listed here in more depth. So, where does one put the content servers? The placement decision is not necessarily about geographic distance, but about network distance, which is partly why colocation facilities and Internet exchange points are great choices. An Internet exchange point, or IXP, is essentially a data center where ISPs connect to each other. This, for example, is the London Internet Exchange, one of the largest IXPs in the world. It's much like the Heathrow Airport of the Internet, moving around two terabits per second on average. Such facilities hold lots of network equipment like this: these are switch racks inside the world's largest IXP by traffic volume, located in Frankfurt, Germany. Housing your servers inside such a facility, if you can afford it and server space is available, or in a data center hosted by an ISP connected directly to such a facility, provides good connectivity to every part of the globe.

But why would an ISP host CDN infrastructure for a content provider, or for a CDN provider? Well, for the ISP this provides two benefits. One, they do not have to carry as much long-haul traffic anymore, because a lot of content will be served from the CDN nodes inside the ISP's own infrastructure. Second, it also improves the quality of service for the ISP's own users, although at times I have great difficulty believing that this is a concern for ISPs.

Here you're looking at one such CDN location. I pulled these images from a NANOG talk by Dave Temkin of Netflix. NANOG is a network operators' group which often gives a cool peek into network operations. Netflix equipment is connected to tens of large facilities, including both the London and Frankfurt ones we just looked at, and the numbers will only increase as the service expands. These servers replicate Netflix content and stream video to end users. Netflix uses custom-built boxes optimized to deliver video to clients with small power and space footprints. That said, different applications have different priorities. For Netflix, tens of locations that deliver video streams are fine, because latency is not the biggest determinant of performance: a small startup delay in video streaming is typically deemed acceptable because of buffering. For latency-sensitive web applications, however, you might need a much larger set of locations to be able to reach clients at lower latencies. In fact, Akamai claims to run 170,000-plus servers in over 100 countries inside more than 1,300 networks around the world.
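Earlier I said some mechanism is needed to direct clients to an appropriate site based on latency and load. The following is a small, hypothetical sketch of such a selection policy: the site names, the measured round-trip times, and the MAX_LOAD threshold are invented for illustration, and real CDNs typically realize this decision through DNS responses or anycast rather than a single function like this.

```python
# Hypothetical per-site state: RTT from the site to the client (milliseconds)
# and the site's current load as a fraction of its capacity.
SITES = {
    "london":    {"rtt_ms": 12,  "load": 0.92},
    "frankfurt": {"rtt_ms": 18,  "load": 0.40},
    "san_jose":  {"rtt_ms": 145, "load": 0.10},
}

MAX_LOAD = 0.85  # assumed threshold beyond which a site is considered overloaded

def pick_site(sites):
    """Return the lowest-latency site that still has capacity headroom."""
    with_headroom = {name: s for name, s in sites.items() if s["load"] < MAX_LOAD}
    pool = with_headroom or sites     # if every site is overloaded, fall back to all of them
    return min(pool, key=lambda name: pool[name]["rtt_ms"])

print(pick_site(SITES))  # -> "frankfurt": London is closer, but it is over the load threshold
```

Returning the chosen site in a DNS response with a short TTL lets the CDN re-steer clients as latencies and load change.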
As we noted earlier, CDNs move around a lot of data, and need to do it at low latency and high reliability. How does one network these locations to each other? We've already discussed the physical interconnection aspects briefly. What I want to focus on here is the interesting opportunities this setting provides at the routing and transport layers. At both layers, I'll discuss one technique that's in use in such networks today.

Let's start with routing. Unlike routing inside data centers, Internet routing is driven by lots of independent players motivated by economics. This can lead to routes which are suboptimal in terms of performance: an ISP might pick a next hop based not on which path has lower latency, but on which path is cheaper. Sometimes Internet routing can also suffer from configuration errors, faults, and anomalies, which might take a while to be corrected. Here you are seeing a real traceroute from Taipei to Putian in China. The IPs are obscured here so you don't hammer the same addresses with your probes. The end-to-end latency here is 290 milliseconds round trip, while the shortest round trip on Earth's surface between these locations is around 500 kilometers, which would be just two and a half milliseconds at the speed of light in fiber. Sure, fiber isn't laid along the shortest path between these two locations, but that does not explain the hundred-times-larger latency. Examining this trace reveals that the path runs through San Jose in the US; that's 21,000 kilometers. Now, this instance is particularly extreme and probably a transient anomaly, but Internet routes can be suboptimal quite often.

Situations of the type on this slide present themselves frequently. Here, the latencies are such that the direct route's latency from A to B, that is, l_AB, is larger than the sum of latencies on an indirect path through X, that is, l_AX + l_XB. This is called a triangle inequality violation. If the Internet were doing shortest-path routing, you would not see such situations, at least when the routes aren't in a transient state. The name comes from the property of a triangle, where the sum of the lengths of any two sides must exceed the length of the third side.

So, how do we obtain more control over routing to avoid such scenarios and obtain better performance? The answer is overlay routing. If we have network equipment inside these networks, we can simply have packets at A, destined for B, go to X, and have X forward them onwards. At its simplest, these could just be servers moving packets via tunneling, as shown here: packets from A travel to X, X decapsulates the packets, and forwards them on via standard routing to B. Thus, even if the A-X-B path is not available through the Internet's BGP routing, nothing stops us from doing this at a higher layer.
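Here is a minimal sketch of what a relay like X does, assuming a simple UDP encapsulation of my own invention (real overlay networks define their own packet formats): node A prepends the final destination's address to the payload and sends the result to X, and X strips that header and forwards the payload onward over ordinary Internet routing.

```python
import socket

OVERLAY_PORT = 4000  # hypothetical port the overlay nodes listen on

def encapsulate(payload, final_dst_ip):
    # Prepend the 4-byte IPv4 address of the final destination B to the payload.
    return socket.inet_aton(final_dst_ip) + payload

def send_via_relay(payload, relay_ip, final_dst_ip):
    # Node A: instead of sending directly to B, send the encapsulated packet to relay X.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(encapsulate(payload, final_dst_ip), (relay_ip, OVERLAY_PORT))

def relay_loop():
    # Node X: decapsulate each packet and forward it to its final destination
    # over ordinary Internet routing.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", OVERLAY_PORT))
    while True:
        data, _sender = sock.recvfrom(65535)
        final_dst_ip = socket.inet_ntoa(data[:4])
        sock.sendto(data[4:], (final_dst_ip, OVERLAY_PORT))
```

Even though BGP never offers the A-X-B route as a single path, the overlay realizes it one hop at a time, each hop riding whatever route the underlying Internet provides.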
A 1999 study evaluated how much such routing could potentially improve latency. It is based on traceroute measurements between end hosts: measurements were collected over default routes on the Internet and then compared against the best alternate routes that could be stitched together through other hosts, in a manner similar to what I just described. Now, this measurement dates back to 1999 and is somewhat biased towards university-based clients as well as public traceroute servers, which may not be representative of end hosts on the Internet. I wish there were newer and more extensive data available, but we have to make do. For 30% to 35% of the host pairs in this measurement, routes with latency lower than the default paths exist, and in a significant fraction of these cases, improvements of more than 20 milliseconds are possible. The paper also includes a similar analysis for loss rates. While these measurements are old, this technique is used today in Akamai's network.

Let's review how one would make this work. One could send active probes, that is, measurement packets, between these locations and collect measurements, or just passively monitor the data traffic that is already moving between these nodes. This will help us keep track of performance metrics across these locations. Let's say this map shows the collected pairwise measurements. So, the latency between San Francisco in the US and this location in Europe is 160 milliseconds, as measured over the path through the Internet. For scalability, if the network is large, one may not collect all pairwise measurements, but rather organize things into some hierarchy. One may also use passively measured data to avoid overloading the system with too many measurement probes. With these pairwise measurements in hand, one can run a link-state protocol on top of this overlay if one wishes to let individual sites make decisions, or collect this information centrally and push out the overlay routes to the CDN nodes. One point of note here is that there might be considerations other than latency or loss, such as the cost of bandwidth along different paths.
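With pairwise latencies like these in hand, the centralized computation described above can be as simple as the following sketch; the node names and latency numbers are made up for illustration. For each pair of sites, it compares the direct measured latency against the best one-hop detour through another overlay node, which also flags exactly the triangle inequality violations discussed earlier.

```python
# Hypothetical measured latencies (milliseconds) between overlay nodes,
# symmetric and invented purely for illustration.
LATENCY = {
    ("san_francisco", "europe"):   160,
    ("san_francisco", "new_york"):  35,
    ("new_york", "europe"):         45,
    ("san_francisco", "asia"):     120,
    ("asia", "europe"):            200,
    ("new_york", "asia"):          110,
}

NODES = {node for pair in LATENCY for node in pair}

def latency(a, b):
    return LATENCY.get((a, b), LATENCY.get((b, a)))

def best_route(a, b):
    """Compare the direct path with every one-hop detour; return (latency, path)."""
    best = (latency(a, b), [a, b])
    for x in NODES - {a, b}:
        via = latency(a, x) + latency(x, b)
        if via < best[0]:
            best = (via, [a, x, b])   # a triangle inequality violation: the detour is faster
    return best

print(best_route("san_francisco", "europe"))
# -> (80, ['san_francisco', 'new_york', 'europe']) with these made-up numbers
```

The resulting routes could be pushed out to the CDN nodes, or each site could compute them itself from link-state-style announcements; the same table could also be extended with loss rates or bandwidth cost as additional criteria.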