Hey there. We're talking about the hierarchy of data. So what does that mean? There are a lot of moving pieces that need to be working for good data science to happen. To analyze the data, you need to have data, you need to trust the data, etc. These assumptions are often not explicitly stated because, in data systems with a lot of specificity, there's a layer of abstraction between each of these steps. By the end of this lesson, you'll be able to consider why we've dedicated so much time to checking data quality and creating tables, and hopefully you'll get excited about the final project.

Machine learning and AI are at the top. It is certainly the most often romanticized part of data science. But having a well-established base requires a non-trivial amount of other data work. So let's start from the bottom.

We've got the collection of data. In this case, for this class, I've synthesized the data using a Python script and some random number generators, and I've intentionally hidden some things in there for you to find. In a more realistic case, this data would be coming in from real live users on a daily or continuous basis.

Storing the data: well, when I set up this course, I uploaded a bunch of CSV files into Mode and I watched a little upload bar on the screen. Realistically, the process of putting the data into the dimension tables in the production database and copying them over to an analytics database is a non-trivial amount of work, as is setting up the events pipeline.

When we layer on top of this some of the processing and pivoting, like we did for the view items table, we're transforming the data. Another example of a helpful transformation would be creating a clean version of the events table that excludes test usage from employees. Although we don't really have a notion like that in our dataset for this class, you can imagine that it would be useful.

Next, we can aggregate and label data.
We did this a little bit when we decided to count the users added each day. In the next section, we'll go through an exercise creating meaningful categorical variables at a user level. Finally, we can get to the stage where we're ready to find insight in our data. In the third module of the course, we'll practice project scoping, and in the fourth module and the final project, we'll be covering A/B testing. These are examples of the stage where you can learn and optimize based on your data.
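
To make the lower layers of the hierarchy concrete, here is a minimal Python sketch of three of them: collecting (here, synthesizing) data, transforming it, and aggregating it. It is not the script used to build the course dataset. The function names, the field names (`user_id`, `created_at`, `is_employee`), the 5% employee rate, and the dates are all assumptions made up for illustration. In particular, the course data has no employee flag, so `is_employee` is purely hypothetical.

```python
import random
from collections import Counter
from datetime import date, timedelta


def synthesize_users(n, start, days, seed=0):
    """Collect: generate n fake user rows (hypothetical schema)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [
        {
            "user_id": i,
            # Random signup day within the window [start, start + days)
            "created_at": start + timedelta(days=rng.randrange(days)),
            # Hypothetical flag: roughly 5% of rows are employee test accounts
            "is_employee": rng.random() < 0.05,
        }
        for i in range(1, n + 1)
    ]


def exclude_employees(users):
    """Transform: build a clean copy without employee test usage."""
    return [u for u in users if not u["is_employee"]]


def users_added_per_day(users):
    """Aggregate: count the users created on each day."""
    return Counter(u["created_at"] for u in users)


users = synthesize_users(n=1000, start=date(2020, 1, 1), days=30)
clean_users = exclude_employees(users)
daily_counts = users_added_per_day(clean_users)

# Show the first few days of the aggregated result
for day in sorted(daily_counts)[:3]:
    print(day, daily_counts[day])
```

Each function depends only on the output of the one below it in the hierarchy, which is the point: the daily counts are only as trustworthy as the cleaning step, and the cleaning step is only as trustworthy as the collected rows.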