I'll begin with a new comment for the Preparing Data Section in the program. The DATA statement names the Profit table that I want to create. For now, I'll just write it to the default Work library. The SET statement names cr.orders as the input table. I'll filter the input rows to read only orders, where the delivery date is on or after the order date. This excludes those rows where the Delivery_Date value is 0 or equivalent to January 1, 1960. Next, I'll fix the case issues with Customer_Country. I'll use the UPCASE function to overwrite the values of Customer_Country with the uppercase equivalent. Remember, we also had some negative values for Quantity. I'll take advantage of IF-THEN logic to check whether Quantity is less than 0, and if it is, then overwrite Quantity with the missing value. That way missing values are ignored if we do any analysis or summarization on the Quantity column. Now with the data issues fixed, let's create new columns to help with our analysis. I would like to examine how much profit we are making per order. I don't have that raw number in the Orders table, but it can be calculated from other columns. I'll write an assignment statement to create a new column, named Profit. The formula will be the difference of Retail_Price, what the customer paid, and Cost_Price, what the company paid, times Quantity, or the number of items ordered. To ensure that Profit is displayed as a currency value, I'll add a FORMAT statement to display Profit with the DOLLAR format. I expect these Profit values to be, at most, in the thousands of dollars. So an overall width of 12 and two decimal places should be sufficient to display all digits and symbols, including the decimal point, comma, and dollar sign. I also want to analyze how long it takes for customers to receive their order. We have columns for Order_Date and Delivery_Date. So the difference between the two will tell us how many days passed between the two events. My assignment statement will create ShipDays days by calculating the difference between Delivery_Date and Order_Date. Let's run the DATA step we have up to this point to see how our new Profit table looks. I'll quickly look at the log and the output data. And so far, so good. I can see missing values for Quantity, where it was negative. All country codes are upper case, and Profit and ShipDays days are calculated. You might be curious why several values for ShipDate are 0. These rows all have Order_Type equal to 1, which means the order was purchased directly in the store, and the customer just walked out with the item. What other modifications do we need to make to prepare our Profit table? Take a look at Customer_Age_Group. I want to create a similar column named Age_Range that includes only the first five characters from each string, which would just exclude the word years. I would also like to create a column named Order_Source, that recodes the values of Order_Type 1, 2, 3, or 99, into more descriptive labels. So let's go back to the program. I'll add assignment statement to create the Age_Range column and take advantage of the substring function. The first argument is the column to read, Customer_Age_Group. The second argument is the position of the first character to read, in this case, it's 1. And the third argument is the number of characters to read. So I'll use 5. I just want to remind you that here in this demonstration, we are reviewing only a couple of functions in this example. But there are many more functions that were covered in the courses and could be on the certification exam. I encourage you to review the Exam Content Guide to see the functions that are listed. Now I'll create Order_Source, by conditionally assigning descriptive labels to each value of Order_Type. After the first IF-THEN statement, notice the following statements all begin with ELSE. This is the most efficient way to approach this problem, because when SAS finds a true condition, it executes the statement following THEN and skips the remaining ELSE-IF statements. Notice we have the first three statements assigning values for Order_Type, 1, 2, or 3. If Order_Type is any other value, then Order_Source is unknown. We will continue to do more work with conditional processing, as we review, not only this course, but also the second SAS programming course. The length of Order_Source will be 6, based on the number of characters in the string Retail. If I run the program, we see that Internet and Unknown are both truncated. To resolve this issue, I need to explicitly define the attributes for the new Order_Source column, rather than allowing SAS just to do it implicitly. I'll add the LENGTH statement after the SET statement, to name the new column, define it as character with a dollar sign, and assign a length of 8, which is enough to store the longest string Internet. Finally, I'll use the DROP statement to eliminate those columns I do not need in the final Profit table. We'll exclude Retail_Price, Cost_Price, Customer_Age_Group, and Order_Type. Let's run the step and take a look at our new Profit table. Looks great. We are ready to analyze our data.