Hello everyone, welcome to the Alibaba Cloud ACA Developer certification exam preparation online course. This is Chapter 5: application monitoring, debugging, and optimization. In this chapter we will focus on the following products: CloudMonitor, Tracing Analysis, Log Service, and Auto Scaling. At the very beginning we will introduce the relationship between metrics, tracing, and logging. Then we will introduce these products one by one, starting from CloudMonitor, then Log Service, then Tracing Analysis. After that, we will talk about how to handle shared data between applications when you are using Auto Scaling.

When you run an application to serve your business, you usually want to collect its operational data thoroughly, and there are three major directions to collect from. The first is metrics: for example, when you are monitoring an ECS instance you want to know the current CPU usage, or the disk usage; those values are called metrics. The second is tracing: if you have a complex application, or an application built on a microservice framework where one service calls another, you really want to look into the calling chain to find out which service is causing a problem, or to investigate a performance issue. Tracing tools or services can help you solve that. The third angle is logging: you want to look into the history of the application's behavior, including the errors it produced in the past, so that for root-cause analysis you can tell who, where, and what really caused a failure. So metrics, tracing, and logging are the three major aspects to cover if you really want to collect application telemetry thoroughly. On Alibaba Cloud, CloudMonitor is mainly used to collect metrics, and Tracing Analysis is used for application runtime trace-back.
And Log Service, which we also call Simple Log Service, is used to collect the historical logs.

Let's look into the first product, CloudMonitor. As mentioned before, CloudMonitor is a very classic service on Alibaba Cloud. It can be used to monitor a lot of things, not only hosts but also the other cloud services. It also provides a dedicated menu for applications: you can monitor application liveness and other metrics by putting resources into different application groups. CloudMonitor also provides event monitoring, custom monitoring, and similar capabilities. For whatever you think is unusual or critical, you can generate an alarm and send the notification to different channels, so that the operators receive the message as soon as possible.

Now let's look at one example question for CloudMonitor: which of the following options correctly describe CloudMonitor custom monitoring and custom events? The keywords here are "custom monitoring" and "custom events". We can tell from the options that two of them are mainly talking about custom events and the other two about custom monitoring, so we probably have to pick one correct answer from each pair. Option A says custom events are used for the collection of non-continuous, event-type data, plus query and alarms. Option B says custom events are for periodic and continuous collection of time-series monitoring data, query, and alarms. A custom event is actually something you define yourself and collect based on the event type, and it is not continuous, so compared with option B, option A is the correct one. For C and D: option C says custom monitoring is used for periodic and continuous collection of time-series monitoring data, query, and alarms, while option D says it is used for the collection of non-continuous events. The keyword here is "continuous": since we are talking about monitoring, we want to monitor in a continuous way, so I would pick option C.
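The distinction this question is testing can be sketched in plain Python. This is only an illustration of the two data shapes; the field names below are hypothetical and are not the real CloudMonitor API schema.

```python
import time

# Custom monitoring: periodic, continuous time-series samples.
# (Field names are illustrative, not the CloudMonitor API schema.)
def make_metric_point(metric_name, value, dimensions):
    return {
        "metricName": metric_name,
        "dimensions": dimensions,
        "timestamp": int(time.time() * 1000),
        "value": value,
    }

# Custom event: a non-continuous occurrence, recorded when it happens.
def make_custom_event(event_name, content):
    return {
        "eventName": event_name,
        "content": content,
        "time": int(time.time() * 1000),
    }

# A simple alarm rule on the time series: fire when the latest
# `times` consecutive samples all exceed the threshold.
def should_alarm(points, threshold, times=3):
    recent = [p["value"] for p in points[-times:]]
    return len(recent) == times and all(v > threshold for v in recent)

points = [
    make_metric_point("cpu_usage", v, {"instanceId": "i-example"})
    for v in (75, 85, 90, 95)
]
print(should_alarm(points, threshold=80))  # prints True
```

The point of the sketch: metrics form a stream you evaluate continuously, while an event is a one-off record, which is exactly why options A and C pair up.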
So I think the answer is A and C.

Now let's look into application logging using Log Service. Here are some of the big scenarios Log Service can serve. It is actually a one-stop service for log data: Log Service brings the four requirements of log management together, namely log collection, log streaming, log search, and log shipping. What you are looking at here is the structure of Log Service. It is mainly composed of three components: LogHub, LogSearch, and LogShipper. We first collect log data through LogHub from ECS, containers, mobile terminals, open-source software, and many other sources. You can also use LogHub to consume the log data through the real-time interfaces it provides. Then we can analyze the collected log data using LogSearch, which provides petabyte-scale log analysis capability: we can index, query, and analyze log data in real time. In addition, after the analysis you can use LogShipper, a stable and reliable log-shipping function, to ship the log data to a more cost-effective storage service like OSS, or even transfer the data to MaxCompute or AnalyticDB for further analysis.

Here is a sample question we can ask for Log Service. A developer accesses logs in Log Service via the API, and the service returns error code 404; what could be the possible reason? Option A: the log project does not exist. Option B: the digital signature of the request does not match. Option C: server internal error. Option D: the server is busy. Even if you have not hit this problem before, the error code 404 really means something does not exist on the server side, so the correct answer is that the log project does not exist.

Now let's look into application runtime tracking, which is mainly done by a product called Tracing Analysis. Tracing Analysis provides a set of tools for you to develop distributed applications.
These tools include trace mapping, request statistics, trace topology, and application dependency analysis. You can use these tools to analyze and diagnose performance bottlenecks in a distributed application architecture, and make your microservice development and diagnostics more efficient. As the picture shows, I just list some major scenarios that Tracing Analysis (TA) can serve. The first one is query and diagnostics of distributed traces: TA can collect all user requests across the microservices in a distributed architecture and assemble these requests into distributed traces for query and diagnostics. Another scenario is real-time collection of application performance data: TA can collect all user requests to an application and analyze them in real time. It can also do dynamic discovery of distributed topologies, collecting distributed call information from all your distributed microservices and the relevant platforms. TA also supports different languages, including the ones I just listed here; what you need to do is simply install the client, and it will begin to collect information in real time. The last scenario I want to list is the various downstream integration scenarios. Quite like what LogShipper can do, TA provides traces that are immediately useful to downstream Alibaba Cloud services: Tracing Analysis can send the traces to downstream analysis platforms, including Log Service and MaxCompute. In this chart I just show the main idea of how to use Tracing Analysis: you install the different clients, and then they begin to do the major jobs we just covered, such as trace collection, metrics collection, and topology discovery. I also have some demo console screenshots to show what it looks like. The sample question we want to show for Tracing Analysis involves a key concept defined in the TA product called spans (S-P-A-N-S).
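Before we get to the question, the span concept can be sketched in plain Python. This is an illustration only, not the Tracing Analysis client: each unit of work in a distributed call chain is one span, linked to its parent span.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Minimal model of a trace: a tree of spans, one span per unit of work.
@dataclass
class Span:
    name: str
    parent: Optional["Span"] = None
    children: List["Span"] = field(default_factory=list)

    def child(self, name: str) -> "Span":
        """Create a downstream span caused by this one."""
        s = Span(name, parent=self)
        self.children.append(s)
        return s

def count_spans(root: Span) -> int:
    """Total number of spans in the trace rooted at `root`."""
    return 1 + sum(count_spans(c) for c in root.children)

# client request -> load balancer -> auth -> billing -> resource
client = Span("client-request")
lb = client.child("load-balancer")
auth = lb.child("auth-service")
billing = auth.child("billing-service")
resource = billing.child("resource-service")
print(count_spans(client))  # prints 5
```

With this model in mind, counting spans in a call chain is just counting the nodes in the trace tree.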
So suppose a developer has written an application using a microservice architecture. In such an architecture, the client first initiates a request; the request first reaches a load balancer, then goes through an authentication service and a billing service, and then requests the resource; finally a result is returned. The question asks how many spans such a call chain consists of. To answer this, you simply count the client request itself plus all the services it went through: one, two, three, four. So the answer should be five.

Okay, now let's look into the last section of this chapter: handling shared data between applications. Talking about shared data, we have to mention the product called Auto Scaling. Auto Scaling is a management service that allows users to automatically adjust elastic computing resources according to their business needs and policies. Together with CloudMonitor, based on the different metrics we define, we can dynamically add more ECS instances to, or remove ECS instances from, the existing scaling groups. But here is the problem: for the applications already running on the different ECS instances, how can they share data smoothly, especially when an instance joins or is removed from the scaling group? How can the newly joined nodes be fully aware of what is going on right now, like the session information and the other shared information? Here I just list some rules of thumb to consider if you want to handle shared data between applications; these are the key items to think about. First, definitely try not to store too much shared data locally. "Locally" means individually on each ECS instance; you should consider putting the data in a central place instead, such as a unified RDS. Then there are SLB and RDS, the entry point and the database: make sure your Auto Scaling configuration is aware of them, so that when a new ECS instance is added, Auto Scaling will help you attach it to the SLB and RDS.
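The "keep shared data central" rule can be sketched like this. `SharedStore` here is just an in-memory stand-in for whatever centralized store you choose in practice (RDS, Redis, and so on); the class and method names are hypothetical.

```python
# Stand-in for a centralized store such as RDS or Redis.
class SharedStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

# One ECS instance in the scaling group. It keeps no shared state of
# its own; everything shared goes through the central store.
class AppServer:
    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared, not local

    def handle_login(self, session_id, user):
        self.store.put(session_id, user)

    def handle_request(self, session_id):
        return self.store.get(session_id)

store = SharedStore()
server_a = AppServer("ecs-a", store)
server_b = AppServer("ecs-b", store)  # newly added by Auto Scaling

server_a.handle_login("sess-1", "alice")
# The new instance can serve the same session immediately,
# because the session lives in the shared store, not on ecs-a.
print(server_b.handle_request("sess-1"))  # prints alice
```

If the session had been stored locally on `ecs-a`, the newly scaled-out `ecs-b` would know nothing about it; that is exactly the failure mode the rule of thumb avoids.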
Another thing you need to pay attention to is session information. Especially for web services, session information is stored on the server side, so again we had better not store it locally on the ECS instance; we should use some centralized, high-speed store, like a Redis database, to share the session information. The last rule of thumb: when an ECS instance is moved out of the scaling group, you may still have some important information you want to collect from it, for example some log information that tells you what was going on on that ECS instance during its lifetime. In that case you may want to enable something called a lifecycle hook in the Auto Scaling service. The hook will notify another service to tell it, "okay, this ECS instance is going to be moved out or torn down, would you like to do something?" The notified service can then go back to the ECS instance and do the necessary work within the timeout period. Considering all these items when you plan your application data sharing gives you a very good guideline for how to design your whole architecture together with Auto Scaling.

For the product usage itself, let's see a simple question for Auto Scaling. In order to deal with sudden spikes in traffic, company A uses Alibaba Cloud Auto Scaling to set up an alarm-triggered task; the task scales out the scaling group when the average memory utilization exceeds 80%. During a test, it was found that the alarm task was not executed successfully. What could be the possible reason? Option A: the ECS instances in the scaling group have not yet installed the CloudMonitor agent. Option B: before the alarm task was triggered, the number of instances in the group had already reached the maximum number of instances. Option C: the instance types chosen for the scaling group are out of stock in that region. Option D: the number of instances in the current group exceeds the expected number of instances for the scaling group.
If you have been using this product, you know option D is not a real option, because there is actually no "expected number of instances" defined for a scaling group. The other options are all possible causes. First, if you do not have the agent installed, you cannot detect the situation of your ECS instances. Also, if the existing instances have already reached the defined maximum number, the group cannot scale out any more. And it could be possible that the specific instance type you want to create is out of stock in that region, which would also block the execution of the scale-out task. So the correct answer is A, B, C.

Okay, in this chapter on application monitoring, debugging, and optimization, we introduced CloudMonitor for metrics collection, and Tracing Analysis for real-time data tracing and root-cause analysis. Log Service is the one-stop log collection and management service: through LogHub, LogSearch, and LogShipper you can collect logs from different sources into a central place. And lastly we talked about Auto Scaling: when your application scales ECS instances in and out, you should consider the shared data, including user information and session information, and find it a proper place to be stored instead of storing it locally. Okay, that is the end of this chapter, and I am looking forward to seeing you in our final chapter. Thank you.