Data analytics methodologies. In this lesson, we'll describe data analytics methodologies. At several points throughout this course, I've emphasized the importance and relevance of standards, processes, and repeatability in analytics. Certainly, for analytics to achieve their true potential and their true return on investment, being able to standardize these things and repeat them over and over again is critical. That's why data analytics methodologies are so widely adopted and so useful in their applications.
A methodology is simply a system of methods for working through certain things. Business understanding: your approach, the analytics that you're going to be conducting, the data that's required to conduct those analytics, and how you're going to collect that information. Data understanding: Who understands the information? Is it comprehensive? Is it accurate? Is it clean? What are the dimensions? What are the calculations? Data preparation: the wrangling, the cleaning, the concatenating, the building of processes to move data from one source to another. Modeling: building P&L models, budgets, and forecasts, building linear regressions and clustering models. Evaluation: Is all of this meeting the business imperative? Are best practices being employed? Are we efficient? Then deployment: pushing those results out.
It should also be noted that feedback should be solicited at all phases of the analytics development life cycle, and certainly after applications and services are deployed. That feedback should be used to continue to develop and direct the course of your analytics activities.
Now, there are any number of methodologies out there, and certain ones are industry-specific or application-specific. But for our purposes, I'd like to present the Cross-Industry Standard Process for Data Mining, or CRISP-DM, methodology.
This methodology is really good at helping us understand and visualize, if you will, the phases of a proper data analytics methodology. The CRISP-DM methodology happens to be used in predictive modeling, and it is certainly relevant and widely adopted in those circles, but I think it has a place in other analytics at large.
Just to note, what I've found in analytics is that a concept or idea we find in one type of analytics, whether you're doing financial reporting, operational analytics, or predictive modeling, tends to show up in the others. There is a common language across all of those areas, and that should be a great comfort to those starting their careers and building knowledge in analytics. Oftentimes what we learn in one context, maybe using specific language, is relevant, if not exactly the same, in another context, just expressed in different language or with different reference points. Keep that in mind here, because it certainly holds true when we look at the CRISP-DM model and walk through its phases.
Phase I: Business understanding. Some of this was already mentioned, but when building analytics solutions and applications, it is certainly imperative to understand the technical component. But first and foremost comes the business. Most of us will probably be working in business analytics, although there is a whole world of educational analytics and analytics for governmental and non-profit organizations and all kinds of things. Certainly for business analytics, Phase I of any good methodology is understanding the business and the business need.
Next, Phase II: Understanding the data. Beyond the business imperative, the data is the next priority, because without the data, there is no data analytics.
Preparing the data is Phase III. As mentioned before, a data analytics professional spends 60 to 80 percent of their time preparing data: sourcing data, cleaning data, calculating data, aggregating data, moving data.
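To make the data preparation phase a little more concrete, here is a minimal sketch in Python of the cleaning and aggregating just described. The records, the field names (`region`, `amount`), and the helper functions `clean_row` and `prepare` are all invented for illustration; they are assumptions, not part of the CRISP-DM methodology itself.

```python
from collections import defaultdict

# Hypothetical raw records; field names and values are invented for illustration.
raw_rows = [
    {"region": " East ", "amount": "100.50"},
    {"region": "west", "amount": "200"},
    {"region": "East", "amount": ""},       # missing amount
    {"region": "WEST", "amount": "49.50"},
    {"region": "east", "amount": "n/a"},    # unparseable amount
]


def clean_row(row):
    """Return a cleaned copy of the row, or None if it cannot be used."""
    region = row["region"].strip().lower()  # normalize whitespace and case
    try:
        amount = float(row["amount"])       # convert text to a number
    except ValueError:
        return None                         # empty or non-numeric amount
    return {"region": region, "amount": amount}


def prepare(rows):
    """Clean each row, count the unusable ones, and total amount by region."""
    totals = defaultdict(float)
    dropped = 0
    for row in rows:
        cleaned = clean_row(row)
        if cleaned is None:
            dropped += 1
            continue
        totals[cleaned["region"]] += cleaned["amount"]
    return dict(totals), dropped


totals, dropped = prepare(raw_rows)
print(totals, dropped)
```

Counting the dropped rows, rather than silently discarding them, is one way to feed the data understanding questions from earlier: is the data comprehensive, is it accurate, is it clean?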
It's also worth noting that one of the characteristics of a predictive analytics project, and of many analytics projects in general, is that they take a lot of time, and a lot of that time is spent on data preparation. If you are in the fortunate position of having more time than you think you need on an analytics project, it's often well advised to put that spare time back into the data prep stage, because there are often things we can do in data preparation that give the best return on the time we spend.
Then Phase IV is modeling. Building the linear regression models, the decision tree models, the clustering models, employing many of the advanced mathematical and statistical concepts and approaches that we presented in the previous lesson.
Phase V is evaluation. Evaluating the model, evaluating the probabilities it's generating and the forecasts it's offering.
Phase VI is deployment. Deploy the model and start using it as a mission-critical decision-making tool for your business, for the organization.
Think about and engage in the group discussion. Does the need for methodologies resonate with you? I know personally that when it comes to analytics, when it comes to professional pursuits, when it comes to data analytics pursuits, the use of methodologies, standards, and workflows, as we'll learn about next, really resonates with me. Does it help your approach to analytics? Share with the group.
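As a companion to the discussion, the six phases described above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the business question, the data points, and the function names (`fit_line`, `mean_absolute_error`, `predict`) are all invented, and a simple least-squares line stands in for whatever model a real project would call for.

```python
from statistics import mean

# Phase I, business understanding: a hypothetical question,
# "how do sales respond to advertising spend?"
# Phase II, data understanding: invented (ad_spend, sales) pairs.
observations = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0),
                (4.0, 9.0), (5.0, 11.0), (6.0, 13.0)]

# Phase III, data preparation: split into training and hold-out sets.
train, holdout = observations[:4], observations[4:]


def fit_line(points):
    """Phase IV, modeling: least-squares fit of y = slope * x + intercept."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_absolute_error(model, points):
    """Phase V, evaluation: average absolute miss on data the model has not seen."""
    slope, intercept = model
    return mean(abs(y - (slope * x + intercept)) for x, y in points)


def predict(model, x):
    """Phase VI, deployment: the fitted model exposed as a simple callable."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(train)
error = mean_absolute_error(model, holdout)
print(model, error, predict(model, 7.0))
```

The invented data here is perfectly linear, so the hold-out error comes out as zero; real data would not be this tidy, which is exactly why the evaluation phase, and the feedback it generates, matters.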