Many times, we must try to fit many pieces of information together before we can make a good decision, and that is much like solving a puzzle. Statistics can help us solve that puzzle. To do this, we first start with describing the data, aiming for a clear, compact, and concrete summary of what we would like to know. Then we move on to producing data, formulating more specific questions and setting out to gather information precisely tailored to answer those questions. Finally, we draw conclusions from the data and information we have gathered and determine how certain those conclusions can be. Learning statistics is like learning a foreign language: you must first learn the vocabulary. For this reason, we will start off with a number of terms for you to learn. Data are facts and figures from which conclusions can be drawn. For example, your income, your age, and your education level are all examples of data. To collect data, we first have to define the particular characteristics that we are interested in. These will be the variables. For example, we may want to know who is interested in this course. For this study, the variables for which we need to collect data could be gender, age, occupation, education level, and geographical location. Then, for each element of the study, we collect data items for the variables of interest, and the result is the data set. The data set provides information about the subjects of the study. The subject of a study can be anything that we want to understand better. It can be people, as in the example I just used, or it could be events or objects. We need to assign a value to each variable. Variables that are quantitative will always have a numerical value, which represents a quantity of some sort. Quantitative variables, also referred to as numerical variables, represent quantities measured with a fixed or standard unit of measure.
For instance, age, income, and temperature are all numbers that represent a quantity and have a unit. Money, for example, could be in dollars, and temperature could be in degrees Fahrenheit. Qualitative data represent categories, so they are also known as categorical data. For example, gender is a category, and the data collected about the gender of the study subjects will be qualitative data. Now, this type of data can be non-numeric. For example, a person's gender can be simply recorded as male or female. Or you can assign a number to a category, which means the data can be numeric as well. For example, if you ask someone to rate a service on a scale of 1 to 5, 1 being extremely dissatisfied and 5 being extremely satisfied, then the data collected from the subjects will be qualitative data, even though the values are numbers. In this case, the numerical value represents the level of satisfaction as a category, not a quantity. One distinction here is that the numbers don't have a unit of measurement; they are just labels for the categories. Now, this type of data can be nominal or ordinal. A nominal variable, also called a nominative variable, is a qualitative variable for which there is no meaningful ordering or ranking of the categories. For example, a person's gender and a person's state of residence are both nominative variables. On the other hand, an ordinal variable is a qualitative variable for which there is a meaningful ordering or ranking of the categories, whether the data are recorded numerically or non-numerically. Here's an example: a customer can rate their level of satisfaction as excellent, good, average, poor, or unsatisfactory. Here each category ranks higher than the next one. Therefore, customer level of satisfaction is an ordinal variable having non-numeric measurements. On the other hand, if you substitute the numbers 4, 3, 2, 1, 0 for the ratings excellent through unsatisfactory, then customer satisfaction is an ordinal variable having numerical measurements.
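The difference between nominative and ordinal data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the survey responses and state codes are made up, and the names `RATING_CODES`, `responses`, and `states` are hypothetical. The only piece taken from the lecture is the 4-through-0 coding of excellent through unsatisfactory.

```python
from collections import Counter
from statistics import median_low

# Hypothetical survey responses (made-up data, for illustration only).
responses = ["good", "excellent", "poor", "average", "good", "unsatisfactory"]

# Ordinal coding from the lecture: excellent = 4 down to unsatisfactory = 0.
RATING_CODES = {"excellent": 4, "good": 3, "average": 2, "poor": 1, "unsatisfactory": 0}
codes = [RATING_CODES[r] for r in responses]

# Ordinal data: the ordering is meaningful, so ranking questions make sense.
best_rating = max(responses, key=RATING_CODES.get)   # highest-ranked response
middle_code = median_low(codes)                      # a "middle" rating code

# Nominative data: hypothetical states of residence. There is no ordering,
# so the sensible summaries are counts per category, not ranks.
states = ["IL", "CA", "IL", "NY", "IL"]
state_counts = Counter(states)

print(codes, best_rating, middle_code, state_counts.most_common(1))
```

Notice that the codes are only labels with an order. The gap between 4 and 3 is not claimed to be the same size as the gap between 1 and 0, which is why the sketch asks for a highest or middle rating rather than treating the codes as quantities with a unit.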
Now let's take a moment and think about these variables: hair color, number of siblings, highest degree earned, annual salary, and postal zip code. Decide whether each variable is quantitative or qualitative. And if it is qualitative, is it nominative or is it ordinal? So what did you think? Here are the answers. Hair color is categorical data and no natural order exists. Thus it's nominative. Number of siblings is a quantitative variable. Highest degree earned is categorical data and there is a meaningful ordering: high school degree, bachelor's, master's, etc. Thus it's ordinal. Annual salary is a quantitative variable. Postal zip code is numeric. For example, UIUC has the zip code 61820, but the number doesn't represent a quantity. Therefore, it's categorical data and there is no natural ordering, so it is also nominative. If we collect data on every single element of interest, then we have data on the population. For example, if we wanted to know the employment status of every person in Illinois, then we would need to speak to every single person residing in Illinois. As a matter of fact, every ten years the US government conducts a census, which collects data on many variables for every single person living in the United States at the time that the census is being conducted. However, doing a study on a population is most often too expensive and too time consuming. So instead, we select a subset of the population to collect data from. This subset is known as a sample. For example, political polls that predict the outcome of an election are performed by collecting data on a sample of likely voters. When a summary measure that describes some aspect of a data set is computed using population data from a census, it is called a population parameter. For example, the population mean is the average of all values in the population. When the same measurement is calculated using sample data, it is called a sample statistic. We use sample statistics to estimate the corresponding values for the population.
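To make the distinction between a population parameter and a sample statistic concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the population is synthetic (10,000 randomly generated ages, not real census data), and the sample size of 200 and the variable names are arbitrary choices.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical population: synthetic ages for 10,000 people.
population = [random.randint(18, 90) for _ in range(10_000)]

# Population parameter: the mean computed from every element,
# which is what a census would let us compute.
population_mean = mean(population)

# Sample: a randomly selected subset of the population.
sample = random.sample(population, k=200)

# Sample statistic: the same summary measure, computed from the sample only.
sample_mean = mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The two numbers will usually be close but not identical. The sample mean is an estimate of the population mean, obtained by looking at 200 people instead of all 10,000.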
So we just learned some very basic but important terms. These terms will be used throughout this course and, more importantly, in any field where statistics is used. So it is important that you know exactly what they mean. We will learn other terms as the course progresses and your knowledge of the subject matter continues to increase.