Welcome to lesson one of module five on extended query formulation with SQL. I'm gonna start with an important question about the query formulation process. How do the critical questions support the query formulation process? Module four emphasized examples depicting elements of the select statement and details of the joined operator. Lesson one of module five provides guidelines to extend your problem solving skills so that you can gain confidence on more realistic queries. The objectives in this lesson relate to two problem solving skills. First, you should be able to use the critical questions to analyze problem statements before writing select statements. Second, I want you to be able to check existing select statements for extra tables. Query formulation involves a conversion from an unstructured problem statement into a statement of a database language, such as SQL, as shown in this diagram. In between the problem statement and the database language statement, you convert the problem statement into a database representation. Typically the difficult part is to convert the unstructured problem statement into a database representation with designation of tables, columns, join conditions and other aspects. Problem statements are often ambiguous and incomplete with extraneous details. This conversion involves a detailed knowledge of the tables and relationships, and careful attention to possible ambiguities in the problem statement. After answering these questions, you're ready to convert the database representation into a database language statement. You should consult example statements possibly using the same database to help you. You should have statements for problems that involve joint operations, joints in grouping, and joints with both row and grouping conditions. As you increase your usage of the select statement, this conversion will become easy for most problems. The critical questions provide a structured process to convert an unstructured problem statement into a database representation. I encourage you to use these questions, at first explicitly writing down the answers and then implicitly answering the questions, as your write select statement later on. These questions help you to convert the words of a problem statement into the vocabulary in symbols in the database diagram. For the first question you should match data requirements to columns and tables. You should identify columns that are needed for output and conditions as well as intermediate tables needed to connect other tables. All the associated tables must appear in the from clause. For the second question, most tables are combined by a join operation. You need to identify the matching columns for each join. In most joins, the primary key of a parent table is matched with a foreign key of a related child table. More difficult problems may involve other types of join conditions, as well as combining operations that are not covered in this course. For the third question you should look for computations involving aggregate functions in the problem statement. For example, the problem, list the name and average grade of students contained to the aggregate computation for the average. You should also look for conditions that involve aggregate functions. Such as a requirement to list core software and details containing more than 30 students. Aggregate functions needed in result and/or conditions involving aggregate functions indicate that the problem involves groups of rows rather than just individual rows. A database diagram is an essential aid to help convert a problem statement into a database representation. Especially for the first two questions. For example, if a problem indicates that columns and conditions from the student and offering tables are needed, the enrollment table should also be included because it provides a connection to those two tables. The database diagram clearly shows the lack of a direct connection between the student and offering tables. You can see the indirect connection through the enrollment table. Now let's apply the query formulation process to a number of examples of increasing complexity. Let's answer the questions before reviewing the associated select statement. The first critical question, what tables? The offering table is needed because of a condition on the offering year column. The enrollment table is needed because the problem involves counting enrollments. The student column is not needed because the problem statement does not involve columns in the result or conditions on students. How are the tables combined? The offering enrollment tables must be joined on offer number using the condition with the primary key foreign key of the tables. The third critical question concerning individual rows versus groups of rows. Groups of rows are needed as a result includes the count of enrollment rows phrased as the number of students. The select statement shows the count function in the select clause. The cross product style to combine the enrollment and offering tables in a group by clause on offer number. Note that offer number must be qualified because it is a column in both the offering and enrollment tables. Also note, that the renaming of the computed column using the as keyword. See the Module five examples for an equivalent statement, using the join operator style. Let's execute example one and check the results. The result contains five rows with two columns in each row. Now let's apply the query formulation process to a somewhat more complex problem. Let's answer the questions before reviewing the associated select statement. The first critical question dealing with what tables. The offering table is needed because the result requires the course number column. The student table is needed because of a calculation involving the student GPA column. The enrollment table is needed to connect the student and offering tables. The second critical question, dealing with how tables are combined. The offering and enrollment tables must be joined at offer number, using the condition with a primary key foreign key of the tables. The student enrollment tables must be joined on student number using the condition with the primary key foreign key of these tables. Now the third question involving individual rows versus groups of rows. Groups of rows are needed because of the average GPA result. In the condition on AvgGPA the SELECT statement shows the AVG, your average function, in the SELECT clause. The cross product join style to combine the Enrollment, Offering, and Student tables, a GROUP BY clause on OfferNo and CourseNo. And the having clause with the condition on AVG StdGPA. Note, the offer number must be qualified, because it is a column in both the offering and enrollment tables. You should see the module five examples for an equivalent statement using the join operator style. Let's execute example two and check the results. The result contains two rows with three columns in each row. Both rows have average GPA greater than 3.0. I will finish this lesson with some brief comments about efficiency considerations. The good news is that there is little concern for efficiency with today's intelligent SQL compilers. The major concerns are extra elements in your select statements including extra tables and unnecessary grouping. You should be especially careful not to have extra tables, as query execution performance is most sensitive to the number of tables in the statement. When using the cross product join style to combine tables, you must ensure that no join conditions are missing. A missing join condition will make the results incorrect and require substantial levels of computing resource usage. The problem statement for example three, is identical to the problem statement for problem two. The select statement contains an extra table in the from clause, however. The select statement will execute more slowly because an extra table course must be retrieved. The course table is not needed for two reasons. The course number column can be taken from the offering table. The course table is required for each offering row. I encourage you to execute example three, to demonstrate that it produces the same result as example two. The extra table slows the query execution but it still produces the same result. Let's wrap up lesson one about query formation guidelines. The query formulation process involves the transformation of a loosely structured problem statement into a database representation using the three critical questions. Then you convert the database representation into a select statement using similar examples to help you. An answer to the opening question, you learn that the critical questions are the most important part of the query formulation process. The critical questions provide a structure to dissect a problem narrative into a database representation containing needed columns, tables, join conditions and aggregate functions. You also learned that SQL compilers have high levels of intelligence, so efficiency concerns are limited to extra tables, unneeded grouping, and missing join conditions. You should carefully check for these problems after writing the select statement.