Next, let's review merging tables with the data step. If we look at the profit table again, notice we have product ID and information about the product line and category. But we don't have a name for the actual item ordered. In another table, products, we have a list of all the products we offer including the name of each product. Merging these two tables together can answer two interesting questions. What is the name of the product ordered for each row and the profit table? Are there any products we offer that haven't been ordered? First, I'd like to review a few key points to remember when merging tables. When merging two or more tables, each table must have at least one common column whose values determine which rows match. In this example, both tables include product ID. When product ID matches in the profit and products tables, SAS combines those rows. Another key point is that each input table must be sorted by the matching column. The data step merge compares rows from the table sequentially to determine matches. So, sorting prior to the data step is required. Let's start by writing some code. First, I'll sort both input tables by product ID. I'll create temporary sorted copies of these tables, profit_sort, and then products_sort. Then, I'll move on to the data step. I want to create two tables. The first is profit_detail, which includes matching rows, or in other words all of the orders from the profit table, and the corresponding detailed product name. The second is product_nosales, which lists all items in the products table that aren't found in the profit table. Or in other words, products that we sell, but haven't been purchased. Next, I'll, add the merge statement to combine the temporary tables created from the proc sort steps, profit_sort and products_sort. After each table, I'm going to add a useful little dataset option that helps determine the matches and non matches. Inside parentheses, I'll add in equal, and create columns named inprof and inprod. Inprof and inprod will be included in the PDB during execution, and are assigned a value of either one or zero for each row. When both columns are equal to one for a particular row, that means the value of product ID was read from both tables, it's a match. If inprof is one and inprod is zero, or vice versa, then product ID was only found in one table, and it's a non-match. If you want to review how these values are assigned during execution, I encourage you to review combining tables in the second programming course. I can use if then logic to write the matches and non-matches to separate tables. If both inprof and inprod are equal to one, then we have a match. Those rows, I want to write to the profit detailed table. If inprof equals zero and inprod equal one, then we have an ID in the products table, that is not in the profit table. These rows are written to the product nosales table. Let's run the data step and see what we get. In the profit detail table, we have all 10,781 orders. Notice that the product name column is included. The rows are still sorted by product name. In the product nosales table, we have 441 rows. Notice all columns from both tables are included. However, only product ID and product name have values. These product ID values didn't exist in the profit table. To clean up this table a bit, I'll go back to the program, and use the keep equal dataset option to keep only product_ID, and product_name. I'll run the code again, and the product nosales table includes only the columns with values.