Hello. So far you've learned about a lot of different tools and functions that analysts use to clean up data for analysis. Now, we'll take a step back and talk about some of the really big- picture aspects of clean data. Knowing how to fix specific problems either manually with spreadsheet tools or with functions is extremely valuable. But it's also important to think about how your data has moved between systems, and how it's evolved along its journey to your data analysis project. To do this, data analysts use something called data mapping. Data mapping is the process of matching fields from one database to another. This is very important to the success of data migration, data integration, and lots of other data management activities. As you learned earlier, different systems store data in different ways. For example, the state field in one spreadsheet might show Maryland spelled out, but another spreadsheet store it as M-D. Data mapping helps us note these kinds of differences, so we know when data is moved and combined, it will be compatible. As a quick reminder, compatibility describes how well two or more datasets are able to work together. So the first step to data mapping is identifying what data needs to be moved. This includes the tables and the fields within them. We also need to define the desired format for the data once it reaches its destination. To figure out how this works, let's go back to the merger between our two logistics associations. Starting with the first data field, we'll identify what we need to move both sets of member IDs. And to define the desired format, we'll choose whether to use numbers, like this spreadsheet, or email addresses, like the other spreadsheet. Next comes mapping the data. Depending on the schema and number of primary and foreign keys in a data source, data mapping can be simple or very complex. As a reminder, a schema is a way of describing how something is organized. A primary key references a column in which each value is unique. And a foreign key is a field within a table that is a primary key in another table. For more challenging projects, there's all kinds of data mapping software programs you can use. These data mapping tools will analyze field-by-field how to move data from one place to another. Then they automatically clean, match, inspect, and validate the data. They also create consistent naming conventions, ensuring compatibility when the data is transferred from one source to another. When selecting a software program to map your data, you want to be sure that it supports the file types you're working with, such as Excel, SQL, Tableau, and others. Later on, you'll learn more about selecting the right tool for a particular task. For now, let's practice mapping data manually. First, we need to determine the content of each section to make sure the data ends up in the right place. For example, the data on when memberships expire would be consolidated into a single column. This step makes sure that each piece of information ends up in the most appropriate place in the merged data source. Now, you might remember that some of the data was inconsistent between the two organizations. Like the fact that one uses separate columns for suite, apartment, or unit number, but the other doesn't. This brings us to the next step: transforming the data into a consistent format. This is a great time to use CONCATENATE. As you learned before, CONCATENATE is a function that joins together two or more text strings. Which is what we did earlier with our cosmetics company example. So, we'll insert a new column. And then type =concatenate. Then the two text strings we want to make one. Drag that through the entire column. And now we have the consistency and the new merged association lists of member addresses. Okay, now that everything's compatible, it's time to transfer the data to its destination. There's a lot of different ways to move data from one place to another, including querying, import wizards, and even simple drag-and-drop. Here, you'll learn more about choosing the right method for different situations. All right, here's our merged spreadsheet. It looks good, but we still want to make sure everything was transferred properly. So we'll go into the testing phase of data mapping. For this you inspect a sample piece of data to confirm that it's clean and properly formatted. It's also a smart practice to do spot checks on things, such as the number of nulls. For the test, you can use a lot of the data cleaning tools we discussed previously, such as data validation, conditional formatting, COUNTIF, sorting, and filtering. Finally, once you've determined that the data is clean and compatible, you can start using it for analysis. Data mapping is so important because even one mistake when merging data can ripple throughout an organization, causing the same error to appear again and again. This leads to poor results. On the other hand, data mapping can save the day by giving you a clear roadmap you can follow to make sure your data arrive safely at its destination. And that's why you've just learned how to do it!