So we've just created a lot of simulated data, and what we want to do now is put it inside a DataFrame using the DataFrames package, which we already imported at the beginning of this notebook. I'm going to call my DataFrame data, so data is the computer variable name for my DataFrame object, and we're just going to use the DataFrame function. What that allows us to do is write a column name and set it equal to the list of values that we want in that column. Because I gave my simulated data very descriptive computer variable names, I'm going to use the same names inside my DataFrame, only capitalized. So Age, with a capital A, just to make it easy for us to see the difference. Age, with the uppercase, is going to be the variable inside the DataFrame. If you think about a spreadsheet, that is the name of the column. In the first row of a spreadsheet we usually put all the variable names, which we also call the column headers. So I'm going to use that uppercase Age and assign to it the age variable that we created with the 100 values, and those 100 values are going to go in the rows below that Age column header. Then WCC, all in capitals, is going to hold the 100 WCC elements in my list object. CRP is exactly the same, the 100 CRP values. Treatment and Result are exactly the same, for our treatment and result list objects. So let's run this code, and now let's have a look at the dimensions of our data set. We want to know how many rows there are and how many columns. If we count, we have one, two, three, four, five variables, so I suppose it has to be five columns. Then we had 100 values, so really we should see 100, 5, and that's exactly what we see. The values come back as a tuple: the rows, comma, the columns. So 100 rows and five columns, and that's exactly what we created.
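As a rough sketch of the step just described, the code below builds a DataFrame from five vectors and checks its dimensions. The vectors are hypothetical stand-ins generated with `rand`, not the distributions used earlier in the notebook, and the seed, the value ranges, and the three result labels are assumptions made for illustration only.

```julia
using DataFrames
using Random

Random.seed!(1234)  # fixed seed so the stand-in data is reproducible

n = 100

# Hypothetical stand-ins for the simulated vectors created earlier in the notebook
age = rand(18:80, n)                               # integers
wcc = round.(rand(n) .* 10 .+ 3, digits = 1)       # floats
crp = rand(5:150, n)                               # integers
treatment = rand(["A", "B"], n)                    # strings
result = rand(["Improved", "Static", "Worse"], n)  # strings

# Column name on the left (capitalized), vector of values on the right
data = DataFrame(Age = age, WCC = wcc, CRP = crp,
                 Treatment = treatment, Result = result)

size(data)  # (100, 5): rows first, then columns
```

All five vectors have the same length `n`, which is what the DataFrame constructor needs in order to line the values up row by row.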
If you simulate data like this, please make sure that all your variables, your list objects, have the same number of elements in them. We see 100, 5, the five being the columns. With DataFrames, we can use the head function, and that is going to give us the first six rows so we can see if everything worked. So there we go. It's very nicely printed here inside JuliaBox in our browser, and let's see what we notice. We see the uppercase column headers that we gave: Age, WCC for the white cell count, CRP, Treatment, Result. Now you can see what I meant: it takes the 100 age values and puts them in the Age column, the 100 white cell count values go in the WCC column, and so on. Very nicely, it also tells us what the data type of each column is. We see 64-bit integers, 64-bit floats, 64-bit integers, strings, and strings. That helps you decide what statistical test you can do. You cannot do a t-test on strings. You should not do a t-test on ordinal categorical variables either. So if we had encoded static, worse, and improved as one, two, and three, remember those are not numerical values. They are categorical variables and you should interpret them as such. So the data type really helps us decide what statistical test we can use. In this section, just very quickly, I want to show you how we can split a DataFrame according to a rule that we create. So have a look at this. I'm going to create two new DataFrames, which I'm going to call data A and data B, and what we're going to do is the following. I'm going to say data, and then you see the set of square brackets, the outer square brackets highlighted in green at the moment on the screen, and you'll see the little comma there. Square brackets mean I'm going to refer to rows, comma, columns. I'm using an address. So what are the rows I'm interested in? Well, I'm saying here, please take data again, and I'm using the symbol version of my column header.
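Before carrying on with the split, here is a minimal sketch of the inspection step described above: looking at the first six rows and at the data type of each column. The seven-row DataFrame is a hypothetical stand-in with made-up values. The video uses `head`, which older DataFrames versions provided; recent versions spell the same thing `first(data, 6)`, and that is what the sketch assumes.

```julia
using DataFrames

# Hypothetical small stand-in for the `data` DataFrame built earlier
data = DataFrame(
    Age = [34, 61, 47, 52, 29, 70, 45],
    WCC = [5.6, 11.2, 7.8, 9.1, 6.3, 12.4, 8.0],
    CRP = [12, 88, 35, 40, 9, 120, 22],
    Treatment = ["A", "A", "B", "A", "B", "B", "A"],
    Result = ["Improved", "Static", "Worse", "Improved", "Static", "Worse", "Improved"],
)

# First six rows (older DataFrames versions exposed this as head(data))
first_rows = first(data, 6)

# Element type of each column, in column order
column_types = eltype.(eachcol(data))
```

Checking `column_types` is a quick way to confirm which columns are numerical and which are strings before choosing a statistical test.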
So it's colon Treatment, that's the symbol, and then I'm saying dot, equals, equals, A. So go down the Treatment column, go through every row, and ask a Boolean question that returns true or false: is this value an A? Only the rows that return true are included. So the first row will be included, the second row will be included, and the third row won't be. That one returns false and is excluded. Then there's a comma and a colon, and remember the colon is just shorthand for please provide all the columns. So the rows come first: it goes down the Treatment column row by row and keeps every row where the value is A. Then the comma, and at the end, please include all the columns for me in this new data A. We'll do the same for data B, only in this instance we're looking for B. Now, I'm going to hit Escape, A, and then Enter or Return, and that gives me a new cell above where I was. Let's do that again. I'm in this cell and I hit Escape, A, and then Enter or Return just to get back into being able to write code. Those are just the keyboard shortcuts for adding cells. Escape, B, and then Enter will give me a new cell below. So let's look at the head of data A. Let's execute that. We see that down this Treatment column we only get A's, nothing else. Let's look at the tail, and that gives us the last six rows. Let's do data B, and now we can see that we obviously only get B's. So I have created two new DataFrames based on the original one, and that can be very helpful later if I want to start comparing things, because I have separated my data according to treatment. And you can use any condition you want. Instead of this column, you could use a numerical column. So imagine we used Age there and we wanted people older than 50. Well, that would be very easy. We'd just say dot, greater than, 50 as opposed to what we have here.
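The split described above can be sketched roughly as follows. The small DataFrame is a hypothetical stand-in with made-up values, and the names `data_A` and `data_B` are assumed spellings of the spoken "data A" and "data B". The column is selected here with `data[!, :Treatment]`, which is how recent DataFrames versions write the symbol form; older versions accepted `data[:Treatment]`. Likewise `first` and `last` stand in for the `head` and `tail` calls in the video.

```julia
using DataFrames

# Hypothetical small stand-in for the `data` DataFrame built earlier
data = DataFrame(
    Age = [34, 61, 47, 52, 29, 70, 45],
    WCC = [5.6, 11.2, 7.8, 9.1, 6.3, 12.4, 8.0],
    CRP = [12, 88, 35, 40, 9, 120, 22],
    Treatment = ["A", "A", "B", "A", "B", "B", "A"],
    Result = ["Improved", "Static", "Worse", "Improved", "Static", "Worse", "Improved"],
)

# Rows: a true/false answer per row from the element-wise comparison (.==)
# Columns: the colon means "all columns"
data_A = data[data[!, :Treatment] .== "A", :]
data_B = data[data[!, :Treatment] .== "B", :]

first(data_A, 6)  # first rows of the A group
last(data_B, 6)   # last rows of the B group
```

The comparison produces one true or false per row, and only the rows marked true are copied into the new DataFrame.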
But let's leave it at that. You can very easily manipulate the data set to your heart's content, separating things out and splitting them up so that it makes sense for your analysis.
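For the numerical case mentioned above, people older than 50, a minimal sketch might look like this. The DataFrame is again a hypothetical stand-in with made-up values, and the names `older` and `older_A` are assumptions. The combined condition at the end is not from the video; it is included only to illustrate that rules can be joined element-wise with `.&`.

```julia
using DataFrames

# Hypothetical small stand-in for the `data` DataFrame built earlier
data = DataFrame(
    Age = [34, 61, 47, 52, 29, 70, 45],
    WCC = [5.6, 11.2, 7.8, 9.1, 6.3, 12.4, 8.0],
    CRP = [12, 88, 35, 40, 9, 120, 22],
    Treatment = ["A", "A", "B", "A", "B", "B", "A"],
    Result = ["Improved", "Static", "Worse", "Improved", "Static", "Worse", "Improved"],
)

# Element-wise "greater than" on a numerical column, all columns kept
older = data[data.Age .> 50, :]

# Illustration only: two rules combined element-wise with .&
older_A = data[(data.Age .> 50) .& (data.Treatment .== "A"), :]
```

The parentheses around each comparison matter, because `.&` would otherwise bind more tightly than the comparisons.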