[MUSIC] Next, let's look at an improvement in the transport layer. In this example, we have an end user who needs to get some data from the CDN, but that data access requires connecting to the origin servers. Going all the way to the origin for the entire transaction can be quite expensive if the latency between the user and the origin is large. For example, this path might have 400 milliseconds of round-trip latency. So you do the TCP handshake, sending the SYN and getting the SYN-ACK back, and that takes 400 milliseconds. After the connection is established, you enter TCP slow start: you send your request, the server sends you some packets, you send acknowledgments, the server sends you more packets, and so on. As you can see, all of these round trips are quite long, 400 milliseconds each. This is quite inefficient and expensive.

Another way of doing this is for the CDN nodes to maintain persistent TCP connections to each other. In this scenario, the client still does a handshake, but only with the nearby CDN node, from where the CDN servers take over and use the persistent connection to simply send the request and get the response. So the client does a TCP handshake with the nearby CDN node and sends its request to that node. That node uses its persistent TCP connection to make the request to the origin server or to other CDN nodes, and it gets the response quickly because it does not need TCP slow start: the TCP connection is persistent and its window is already large. It then forwards that data to the client, perhaps with slow start on that last hop, but keep in mind that those round trips are all much smaller. Except for the one transaction between the CDN node and the origin server, everything else takes place between the CDN node and the end client, and the latency on that connection is much smaller than the 400 milliseconds we saw. So this way of doing it incurs much lower latency. We'll put some rough numbers on this comparison in a moment. The advantage is even bigger for connections encrypted with SSL, because that requires additional round trips to set up the encryption.

Lastly, let's discuss how clients learn where to get the content from. Here is a map of latencies to Wikimedia's three CDN locations in Amsterdam, Ashburn, and San Francisco, so two of them are in the US and one is in Europe. Each bubble represents a client's latency measurement. A bubble is colored yellow if the Amsterdam location has the lowest latency for that client, green if the Ashburn location has the lowest latency, and blue if San Francisco does. Bubbles for clients whose lowest latency to any of the three locations exceeds 150 milliseconds have a red outline; for those clients, all three locations are more than 150 milliseconds away. Clearly, with all three CDN locations well north of the equator, clients in South America, Australia, Southeast Asia, and South Africa suffer in terms of latency. In terms of decisions about where to send clients, what this map says is that, with just these three locations, if you're optimizing only for latency, you want all the yellow clients to go to Amsterdam, all the green clients to go to Ashburn, and all the blue clients to go to San Francisco. But keep in mind that there might be other concerns, like load balancing and the cost of bandwidth. Regardless of the parameters used to make the decision, we are only discussing the policy aspect here. What we haven't gotten to is how these policy decisions would be enforced.
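Before getting into the mechanism, here are the rough numbers promised for the transport-layer comparison above. This is a minimal sketch with made-up but plausible parameters: a 400-millisecond round trip to the origin, a 30-millisecond round trip to a nearby CDN node, a 10-segment initial congestion window, and a 500 KB response. None of these figures come from the lecture, and the model ignores server processing time and packet loss.

```python
# A back-of-the-envelope sketch of the two paths above (illustrative numbers).

SEGMENT_BYTES = 1460    # a typical TCP maximum segment size
INIT_CWND = 10          # initial congestion window, in segments

def slow_start_rounds(size_bytes, init_cwnd=INIT_CWND, segment=SEGMENT_BYTES):
    """Round trips needed to deliver size_bytes if the window doubles each RTT."""
    rounds, cwnd, sent = 0, init_cwnd, 0
    while sent < size_bytes:
        sent += cwnd * segment
        cwnd *= 2
        rounds += 1
    return rounds

def direct_to_origin(size_bytes, origin_rtt):
    # One RTT for the TCP handshake, then slow start from a cold window,
    # all paid at the long client<->origin round-trip time.
    return origin_rtt * (1 + slow_start_rounds(size_bytes))

def via_cdn_node(size_bytes, client_rtt, origin_rtt):
    # Handshake and slow start happen only on the short client<->CDN leg.
    # The CDN<->origin leg reuses a warm, persistent connection whose window
    # is already large, so that fetch is modeled as a single round trip.
    return client_rtt * (1 + slow_start_rounds(size_bytes)) + origin_rtt

if __name__ == "__main__":
    size = 500 * 1024   # a 500 KB object
    print(f"direct to origin: {direct_to_origin(size, origin_rtt=0.400):.2f} s")
    print(f"via nearby CDN:   {via_cdn_node(size, 0.030, 0.400):.2f} s")
```

With these assumptions, the direct path costs about 2.8 seconds while the CDN path costs about 0.6 seconds, of which a single 400-millisecond round trip to the origin is the largest part. The exact numbers matter much less than the fact that only one long round trip remains.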
What is the mechanism that CDNs use to direct different clients to different locations? Let's talk about that. This starts with the HTML page itself. The client requests the landing-page HTML from the origin web server. For a simple web server, the resources will usually be served from the same domain; in this example, bigpicture.jpg comes from abc.com. But for web services using CDNs, instead of abc.com, the picture may be served from cdn.abc.com. So the URLs need to be rewritten in the HTML itself by the web server. And this does not have to be a dynamic process: this is the HTML page that the web server serves to every client. The rest is taken care of by manipulating DNS.

Next, the client's browser will attempt to resolve cdn.abc.com. The browser's DNS request goes to the ISP's DNS resolver, which further queries the top-level domain resolver. The top-level resolver's response is another domain name. This is called a CNAME, that is, a canonical name or an alias. The local DNS resolver then reaches the CDN's resolver to resolve this name. Now, the CDN's resolver can respond with the IP address of a CDN server close to the local DNS resolver that made the query. This decision may be based on IP geolocation for that resolver. It's also possible that the CDN's resolver itself issues a CNAME response, which corresponds to a regional resolver for the CDN closer to the client. But the broad idea here is that the DNS resolution is specific to the client's location. Of course, the resolver may consider other factors, like load balancing across locations, in its name resolution. If the CDN site located closest to the client is suffering from high load, the CDN might direct the client to the next closest site instead. Finally, at the CDN's cluster, a load balancer can send a client's data request to the appropriate server. The name resolution from the CDN's resolver to an IP address has a very short time to live in the DNS cache. This ensures that different clients get different responses, and not the same cached response over a long period of time. Now a question arises: what if the client isn't using their ISP's local resolver? What if they are using a service like OpenDNS or another public DNS server? That would mean that the client could be contacting a DNS server that's not close to them, which will throw off the CDN's estimate of where the request is coming from. This can have an adverse impact on performance. There's a short sketch of this resolution chain below.

We now discuss a completely different mechanism, based on manipulating Internet routing instead of domain name resolution. Just like the Stack Overflow query, you may also have wondered how some IP addresses, like Google's 8.8.8.8, can be reached at low latency from places that are far from each other. For instance, the ping latency to this address may be ten milliseconds from both Sydney, Australia and New York. Feel free to try this and discuss in the course forums. If this were one physical server, this would not be possible: Sydney and New York are separated by at least 160 milliseconds even if a fiber line ran directly between them along the shortest possible path. The way this works is by using anycast routing. The key idea of anycast routing is to have different clients go to different physical servers which use the same address. How is this accomplished? By announcing the same IP prefix in BGP from different locations.
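Going back to the DNS-based redirection for a moment, here is the promised sketch of how you could observe the CNAME chain and short TTLs just described. It uses the third-party dnspython package, which is my choice and not something named in the lecture, and cdn.abc.com is just the hypothetical hostname from the example, so substitute a real CDN-backed name to try it.

```python
# A rough sketch of observing the CNAME chain and TTLs described above.
# Requires the third-party dnspython package (pip install dnspython).
import dns.resolver

def trace(name: str) -> None:
    """Follow CNAME aliases until a name resolves directly to addresses."""
    while True:
        try:
            answer = dns.resolver.resolve(name, "CNAME")
            target = str(answer[0].target)
            print(f"{name} -> CNAME {target} (TTL {answer.rrset.ttl}s)")
            name = target
        except dns.resolver.NoAnswer:
            # No further alias: the final name maps to A records, typically
            # with short TTLs so that clients re-resolve frequently.
            answer = dns.resolver.resolve(name, "A")
            for record in answer:
                print(f"{name} -> A {record.address} (TTL {answer.rrset.ttl}s)")
            return

# Hypothetical hostname from the example above; use a real CDN-backed name.
trace("cdn.abc.com")
```

Against a real CDN-backed name you would typically see one or more CNAME hops ending in a CDN-owned domain, with noticeably short TTLs on the final address records, which is the behavior described above.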
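And as a quick sanity check on the claim that Sydney and New York are at least 160 milliseconds apart even over a direct fiber path, here is a small calculation. The city coordinates and the assumption that light in fiber travels at roughly two-thirds the speed of light in vacuum are my own rough numbers, not figures from the lecture.

```python
# Great-circle distance (haversine) and the resulting round-trip-time floor.
import math

C_FIBER_KM_PER_S = 200_000.0   # roughly 2/3 of the speed of light in vacuum

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

sydney = (-33.87, 151.21)
new_york = (40.71, -74.01)

distance_km = great_circle_km(*sydney, *new_york)
rtt_floor_ms = 2 * distance_km / C_FIBER_KM_PER_S * 1000
print(f"~{distance_km:.0f} km apart, round-trip floor of about {rtt_floor_ms:.0f} ms")
```

That works out to roughly 16,000 kilometres and about 160 milliseconds of round-trip time as a physical floor, so a 10-millisecond ping from both cities can only mean that two different servers are answering on the same address, which is exactly what anycast arranges.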
If the same BGP prefix is announced from different locations across the Internet, different ISPs will end up with different choices of next hop toward that prefix. What this ensures is that clients behind those ISPs will use that particular announcement to reach a server, and these servers are globally scattered. This avoids the problem that the DNS approach runs into, that is, guessing the client's location, because the guessing is no longer necessary: the fact that BGP prioritizes shorter paths means that different clients will choose CDN locations closer to themselves. This approach does have its own drawbacks, though. For one, it's difficult to dynamically pick different locations based on the load observed at different sites, or on performance differences when something changes, because BGP announcements take a while to propagate, and fine-grained control using BGP announcements is rather more difficult than with DNS. Another disadvantage is that route flaps, that is, BGP dynamics, can cause different packets from the same connection to go to different places. That wraps up our overview of how CDN infrastructure works. It's an important piece of infrastructure, and it will soon come to carry most of the Internet's traffic. [MUSIC]