Hi from Ottawa, Ontario, Canada. I'm Garfield Fisher. I've downloaded the 2005, 2007, 2009, 2011 and 2013 data sets from the Housing Affordability Data System. Much like you, I've done some data cleaning, I've done some merging, and I've gotten rid of some variables that are not that important to my analysis. It occurred to me while doing this that one of the first questions I would get, and therefore the question that I'm posing to you, is: is there a difference in the current market value for units that are occupied versus units that are not occupied? I would do that analysis for all five data sets. I would run the appropriate statistic, draw a conclusion, interpret that conclusion and present it in a way that is understandable to anyone. So that's your question. Good luck. I look forward to seeing what your results are and to being part of the process as you learn how to test whether there's a difference in current market value between property units that are occupied and not occupied. Thank you.

The question that you'll be addressing here is whether there are differences in the market values of occupied versus not occupied housing units, and whether these differences show a pattern over the period 2005 through 2013. So the primary variables of interest in this analysis are the current market value of the housing unit, which is the value variable in your data files, and the status of the unit in terms of occupied or vacant, which is the status variable in the data files.

Your analysis is expected to fall under four categories. Firstly, you need to summarize the data for the value and status variables, both in terms of basic descriptive statistics and as a visual summary in terms of graphs. I leave it to you to figure out what statistics and graphs would best represent the data. Secondly, you need to establish whether there is a difference in current market values across occupied and vacant housing units.
That is, you need to test for differences in the variable value between occupied and vacant housing units. Whether a housing unit is occupied or not is indicated by the variable status. You need to use appropriate statistical tests. Once again, you will need to decide which statistical tests to use. Thirdly, you need to do this analysis separately for each of the five years: 2005, 2007, 2009, 2011 and 2013. Finally, you need to put together your analysis and conclusions in a brief summary report, which includes categories one, two and three that I mentioned.

Before you proceed with the analysis for this particular question, an important data issue that you need to address is that of missing or so-called suspect data. You will notice that many housing units have a negative value for the current market value, or some ridiculously low dollar value. These are perhaps instances where the data was not captured correctly or was not reported accurately. For our analysis, we will delete all housing units that have a market value of less than $1,000. You could do this deletion in multiple ways. However, using a pivot table simplifies this kind of subsetting of data.
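
For anyone who would rather script these steps than use a pivot table, here is a minimal Python sketch of one way the cleaning, summary and test could look. It is only an illustration under several assumptions: the field names `value` and `status`, the labels "occupied" and "vacant", and the toy numbers are all made up for the example and are not taken from the actual data files. Welch's two-sample t-test is just one reasonable choice of test, not the one prescribed here, and the p-value uses a large-sample normal approximation, which is only sensible when each group has many units.

```python
import math
from statistics import NormalDist, mean, stdev

MIN_VALUE = 1000  # units with a market value below $1,000 are dropped


def clean(rows, min_value=MIN_VALUE):
    """Drop suspect records: negative or implausibly low market values."""
    return [r for r in rows if r["value"] >= min_value]


def split_by_status(rows):
    """Return (occupied values, vacant values); labels are hypothetical."""
    occupied = [r["value"] for r in rows if r["status"] == "occupied"]
    vacant = [r["value"] for r in rows if r["status"] == "vacant"]
    return occupied, vacant


def describe(values):
    """Basic descriptive statistics for one group."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


def welch_t(a, b):
    """Welch's t statistic, its degrees of freedom, and an approximate
    two-sided p-value based on the normal distribution (large samples)."""
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Made-up toy data for a single year; the real analysis would repeat this
# for each of 2005, 2007, 2009, 2011 and 2013.
data_by_year = {
    2005: [
        {"value": 200000, "status": "occupied"},
        {"value": 250000, "status": "occupied"},
        {"value": 300000, "status": "occupied"},
        {"value": -6, "status": "occupied"},   # suspect, removed by clean()
        {"value": 100000, "status": "vacant"},
        {"value": 150000, "status": "vacant"},
        {"value": 200000, "status": "vacant"},
        {"value": 500, "status": "vacant"},    # suspect, removed by clean()
    ],
}

for year, rows in data_by_year.items():
    occupied, vacant = split_by_status(clean(rows))
    t, df, p_approx = welch_t(occupied, vacant)
    print(year, describe(occupied), describe(vacant))
    print(f"t = {t:.3f}, df = {df:.1f}, approximate p = {p_approx:.4f}")
```

With only three units per group, the toy example is far too small for the normal approximation to be trusted; it is there to show the mechanics. On the real data sets, a statistics package that computes the p-value from the t distribution would be the safer choice.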