Hi everyone, welcome to the fourth chapter of our Tencent Cloud Solutions Architect professional course, building high availability architecture. At the end of this chapter, you'll understand the principles of designing a high availability architecture in Tencent Cloud, and understand how to build a high availability architecture for the outer layer network, the access layer, the application layer, the middleware layer, and the data layer in Tencent Cloud. In this chapter we'll cover six sections: overview of in-cloud high availability, building high availability architecture for the outer layer network, building high availability architecture for the access layer, building high availability architecture for the application layer, building high availability architecture for the middleware layer, and building high availability architecture for the data layer. This video will cover the first section, overview of in-cloud high availability, and subsequent videos will cover the remaining five sections. Okay, let's get started with section one, overview of in-cloud high availability. In this video, we'll cover the challenges faced by high availability and designing high availability architecture. The objective of high availability is to reduce the amount of downtime per year and so keep the system available. Availability is measured as a percentage of uptime, with 100% indicating a service that experiences zero downtime and never fails. Most services fall between 99% and 100% uptime. A 99% system availability is considered basically available, and a 99.9% system availability is considered available. A highly available architecture has a system availability of 99.99%, which translates to about 52 minutes of downtime per year. A 99.999% system availability is considered extremely highly available, which is equal to less than six minutes of downtime per year. The challenges faced by high availability include downtime and service failure.
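As a quick check on the availability figures just quoted, here is a minimal sketch that converts an availability percentage into downtime per year. The function name and the choice of a 365-day year are assumptions made for this illustration, not part of the course material.

```python
# Minutes in a year; assumes a 365-day (non-leap) year for simplicity.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # The four availability levels mentioned in the lecture.
    for level in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(level)
        print(f"{level}% availability -> {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.99% works out to roughly 52.6 minutes per year and 99.999% to roughly 5.3 minutes, which matches the figures of 52 minutes and less than six minutes given above.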
On May 27th, 2015, Alipay experienced a downtime of four hours, which was reported to be caused by damage to a local optical cable in Hangzhou. On May 28th, 2015, Ctrip.com was down for 12 hours, which was reported to be caused by an engineer's accidental deletion of the runtime environment. On February 28th, 2017, the AWS S3 service was down for approximately two hours, affecting tens of thousands of online services such as Netflix, Airbnb, Slack, and Spotify. This was reported to be caused by a high error rate in the service, due to a failure in the data center in Virginia. On June 27th, 2018, the Alibaba Cloud control system and some services, such as MQ, NAS, and OSS, were down for 30 minutes. This was reported to be caused by a failure in the linkage between the services, due to faulty operations during routine maintenance. A high availability disaster recovery architecture is essential for rapid, stable, and sustainable business growth. A breakdown of incidents and problems by category is shown in the following pie chart. Internal problems include code problems, configuration problems, and performance problems. Provider issues include jitter within tunnels or leased lines, ISP failures, and overall data center failures. External threats, such as attacks or hijacking, may also cause issues for the system. The overall planning for high availability requires the consideration of the following questions. What level of high availability should the system achieve? What factors are related to the development of a high availability policy? How do we implement solutions for the disaster recovery of data, applications, and data centers, as well as active-active solutions? How do we plan an emergency failover drill for high availability disaster recovery? Let's look at the recommended disaster recovery solutions for customers at different stages.
During the startup stage, the recommended disaster recovery solutions include data disaster recovery, environment disaster recovery, and security protection. During the growth stage, the recommended disaster recovery solutions include data and environment disaster recovery, security protection, multiple clusters in the same data center, and multiple data centers in the same region. During the stability stage, the recommended disaster recovery solutions include data and environment disaster recovery, security protection, multiple clusters in the same data center, multiple data centers in the same region, and multiple data centers in different regions. The overall planning consists of a layered design, which decouples the access layer from the service layer, and decouples the service layer from the data layer. It implements high cohesion within each layer and low coupling between layers, and supports flexible scalability at different layers. Additionally, the layered design provides support for redundant disaster recovery deployment at different layers, and for set-based layer deployment. The abstract layered structure of the Internet is shown in the following diagram. It comprises the client, perimeter network, business architecture layer, and basic environment layer. An analysis of the different layers, including the DNS and CDN layer, security layer, access layer, service layer, middleware layer, and data layer, is provided in the following chart. It covers some of the possible problems and risks, the requirements for high availability disaster recovery, and the solutions.
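To recap the stage-by-stage recommendations from this section, here is a small sketch that records them as a lookup table and shows what each stage adds over the previous one. The stage keys, constant names, and the helper function are hypothetical labels chosen for this illustration; only the measures themselves come from the lecture.

```python
# Disaster recovery measures named in the lecture (labels are illustrative).
DATA_DR = "data disaster recovery"
ENV_DR = "environment disaster recovery"
SECURITY = "security protection"
MULTI_CLUSTER = "multiple clusters in the same data center"
MULTI_DC_SAME_REGION = "multiple data centers in the same region"
MULTI_DC_CROSS_REGION = "multiple data centers in different regions"

# Recommended measures per customer stage, as listed in the lecture.
DR_BY_STAGE = {
    "startup": [DATA_DR, ENV_DR, SECURITY],
    "growth": [DATA_DR, ENV_DR, SECURITY, MULTI_CLUSTER, MULTI_DC_SAME_REGION],
    "stability": [
        DATA_DR,
        ENV_DR,
        SECURITY,
        MULTI_CLUSTER,
        MULTI_DC_SAME_REGION,
        MULTI_DC_CROSS_REGION,
    ],
}


def added_at(stage: str, previous: str) -> list[str]:
    """Return the measures recommended at `stage` but not at `previous`."""
    earlier = set(DR_BY_STAGE[previous])
    return [m for m in DR_BY_STAGE[stage] if m not in earlier]


if __name__ == "__main__":
    print("Growth adds:", added_at("growth", "startup"))
    print("Stability adds:", added_at("stability", "growth"))
```

Laid out this way, the recommendations are cumulative: each stage keeps everything from the one before and adds a wider scope of redundancy.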