You've learned more SAS data step techniques for preparing data, and now it's time to apply those techniques in a real-world data preparation problem. In this overview, I'm going to explain the three key pieces of information that you need to begin. First, I'll describe the business problem you are trying to solve, then I'll quickly describe the data you'll be using, and finally, I'll talk about how you can apply your knowledge to prepare global tourism data.
In this case study, it's your responsibility to organize and prepare data for your company. After you prepare the data, the analysts and your team use that data to create reports, visualizations, and statistical models designed to grow your company's market share. In your newest project, you have a list of requirements and two data tables. You need to prepare the data for your team so they can analyze inbound and outbound 2014 tourism for countries and continents.
Based on your requirements list, you can divide your work into three overall tasks that relate to the tables you need to deliver: the cleaned_tourism table, the final_tourism table, and the nocountryfound table. Let's review each of these tasks and tables at a high level.
Your first task is to restructure the tourism table to meet the specific data requirements and create the cleaned_tourism table. You can see the original tourism table is not in a structure that can be analyzed. For example, there are many different types of information in the country column. To be useful for analysis, you need to restructure the table to look like this. The second task is to merge the newly restructured tourism table with the country_info table to create the final_tourism table, which contains only matching rows. The third task is to create a table named nocountryfound that contains a distinct list of countries that did not have matching rows in the country_info table.
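As a rough illustration of the second and third tasks, here is a minimal sketch of a match-merge that uses IN= flags to route rows to two output tables. Everything specific in it is an assumption made for illustration: the tiny input tables and their values, the Country_Name and Y2014 column names, and the tourism_sorted and country_sorted work tables. Your requirements document defines the real columns, and your own solution might take different steps.

```sas
/* Hypothetical miniature version of the restructured tourism table. */
data cleaned_tourism;
    input Country_Name :$30. Y2014;
    datalines;
Canada 100
Brazil 200
Atlantis 300
Atlantis 400
;
run;

/* Hypothetical miniature version of country_info (Continent ID, Country). */
data country_info;
    input Continent Country :$30.;
    datalines;
1 Canada
2 Brazil
2 Peru
;
run;

/* A match-merge needs both inputs sorted by the same BY column. */
proc sort data=cleaned_tourism out=tourism_sorted;
    by Country_Name;
run;

/* Rename Country so both tables share the BY column name. */
proc sort data=country_info(rename=(Country=Country_Name)) out=country_sorted;
    by Country_Name;
run;

data final_tourism nocountryfound(keep=Country_Name);
    merge tourism_sorted(in=inTourism) country_sorted(in=inCountry);
    by Country_Name;
    /* Rows found in both tables go to final_tourism. */
    if inTourism and inCountry then output final_tourism;
    /* For tourism rows with no country match, keep one row per country. */
    else if inTourism and first.Country_Name then output nocountryfound;
run;
```

With these made-up inputs, final_tourism holds the Brazil and Canada rows, nocountryfound holds a single Atlantis row, and Peru is dropped because it appears only in country_info.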
The steps you use to solve the case study might vary, but you should end up with these three tables. Now that you have an idea of what you need to accomplish, it's time to explore the data. The main table you'll be working with is the tourism table. This table contains information about the arrivals of non-resident visitors, departures, and tourism expenditure in the country and in other countries. The raw data was downloaded from the UN data website. The second table you'll be working with is the country_info table. This table contains country names and continent IDs. Let's dive a bit deeper into these tables.
The tourism table contains information about international tourism. The two main categories of information in this table are inbound and outbound tourism for each country. Let's discuss these categories, starting with an example of outbound tourism. We have individuals from India traveling all around the world, visiting countries like Italy to see the Colosseum, Russia to see Saint Basil's Cathedral, or Australia to see the Sydney Opera House. All of these trips are considered outbound tourism from India, and the table contains information about departures and tourism expenditure in US dollars.
Now consider inbound tourism. The Taj Mahal is a beautiful attraction in India and is considered one of the New Seven Wonders of the World. This is a very popular destination that travelers from all over the world come to visit. All of these trips are considered inbound tourism to India, and the data contains information about arrivals and expenditure in US dollars.
In its original form, the tourism table consists of 23 columns and over 2,400 rows. Let's take a look at a partial image of the table. Here we can see information for the United Kingdom. The A column contains a numeric ID when a country name appears in the country column.
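Because the A column is populated only on rows where the country column holds a country name, one way to carry that name down to the rows beneath it is a RETAIN statement. The sketch below is only an illustration under stated assumptions: the tourism_sample table, its five rows, the ID value, the text of the non-country rows, and the new Country_Name column are all hypothetical stand-ins for the real data.

```sas
/* Hypothetical five-row slice shaped like the raw tourism table. */
data tourism_sample;
    infile datalines dsd truncover;
    length Country $ 60;
    input A Country $;
    datalines;
826,United Kingdom
,Inbound tourism
,Arrivals - Thousands
,Outbound tourism
,Departures - Thousands
;
run;

data tourism_filled;
    set tourism_sample;
    length Country_Name $ 60;
    /* RETAIN keeps Country_Name from one iteration to the next    */
    /* instead of resetting it to missing at the top of each pass. */
    retain Country_Name;
    /* A nonmissing A marks a row whose Country value is a country name. */
    if A ne . then Country_Name = Country;
run;
```

In this sketch every row of tourism_filled ends up with Country_Name set to United Kingdom, including the rows where A is missing.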
The country column contains a variety of information such as country names, tourism type (inbound or outbound), and tourism categories such as the number of arrivals or departures from a country or expenditure in the country or other countries in US dollars. The series column contains the data collection method used by the country. For example, IMF stands for International Monetary Fund. We won't focus much on this column in the case study. Columns _1995 through _2014 contain scaled numeric data stored as text. The country column contains the information you need to properly convert this data to a numeric value. Values are US dollar amounts in millions for rows containing expenditure data, and passenger counts in thousands for rows containing arrivals and departures. For example, a scaled value of 21,719 for arrivals in thousands is converted by multiplying the number by 1,000, for a value of 21,719,000.
The country_info table contains two columns and 250 rows. The continent column contains IDs for each continent. For example, 1 is North America, 2 is South America, and so on. Your document will list each value with the corresponding continent. The country column contains the name of each country.
You're familiar with the SAS programming process. In this case study, you'll use the new data step skills you've learned to prepare data. Typically, the data preparation stage takes much more time than reporting or analysis. You will hear analysts say that as much as 80 percent of their time can be spent in this stage. To work through this case study, you will write a SAS program that does the following: retains values throughout each iteration of the data step, uses SAS functions to perform specific tasks, conditionally creates new columns, creates and applies a custom format, removes and formats columns, and lastly merges tables. Now that you have an overview of what you'll be doing, the next step is to read the PDF document that gives you all the details.
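To make the scaling rule and the continent IDs a little more concrete, here is a small sketch of both ideas. It rests on several assumptions that are not in the overview: the scaled_sample and country_sample tables, the exact category text used to detect thousands and millions, the expenditure value, the Multiplier and Y2014 column names, and the continents format name are all hypothetical. Only the two continent IDs named above are mapped; the remaining IDs are left to the requirements document.

```sas
/* Hypothetical rows: category text in Country, scaled values stored as text. */
data scaled_sample;
    infile datalines dsd truncover;
    length Country $ 60 _2014 $ 12;
    input Country $ _2014 $;
    datalines;
Arrivals - Thousands,21719
Tourism expenditure - Mn,10
;
run;

data converted_sample;
    set scaled_sample;
    /* Choose the multiplier from the text that describes the row. */
    if find(Country, 'Thousands') > 0 then Multiplier = 1000;
    else if find(Country, 'Mn') > 0 then Multiplier = 1000000;
    /* INPUT converts text to numeric; ?? suppresses log notes */
    /* when the text is not a valid number.                     */
    Y2014 = input(_2014, ?? 16.) * Multiplier;
    format Y2014 comma20.;
run;

/* Custom format for the two continent IDs named in the overview. */
proc format;
    value continents
        1 = 'North America'
        2 = 'South America';
run;

/* Hypothetical rows showing the format applied to the Continent column. */
data country_sample;
    input Continent Country :$30.;
    format Continent continents.;
    datalines;
1 Canada
2 Brazil
;
run;
```

In this sketch the arrivals row becomes 21,719,000, matching the worked example, and the expenditure row becomes 10,000,000. Continent values without a label in the format simply display as their numeric ID.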
Read all the requirements and then work on this project one step at a time. It's time to get to work.