In the lecture, we got a brief look at how we can use Python code to produce these different visualizations, as well as the summary statistics. Now we will dive into an actual notebook and run this code ourselves. We're going to be using the iris dataset that we discussed earlier.
Our first step is to import the necessary libraries. The os module is just going to allow us to access our operating system. NumPy is the numerical library that we'll be using very often throughout. Then, as we've seen, we are importing Pandas as pd, and that's the library we use for the majority of our data manipulation before actually going into visualizations. So let's import those.
Our first question here is to load the data from the file using the techniques that we've already learned, and then determine the following. First, the number of data points, so how many rows? We have a hint here to use the .shape attribute, which is an attribute of a Pandas DataFrame. Second, the column names, which again leverages another attribute of our Pandas DataFrame, .columns. Then finally, the different data types, and we're going to use the .dtypes attribute here.
So we bring in the file. We set the file path as a variable, and the path to the actual file is the data folder with the iris data CSV file. We use pd.read_csv to bring in that file, and we use data.head() to look at the first five rows. There we have it, the first five rows with our different features: sepal length, sepal width, petal length, petal width, as well as our different species.
Now I want to create a cell above here, just to get an idea of what each one of these attributes does individually before printing them all out. If I run data.shape, we get the shape of the entire DataFrame, as a tuple of the number of rows and the number of columns.
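As a rough sketch of the loading step just described, the snippet below reads a CSV into a DataFrame and looks at its head and shape. So that it runs without the course's data folder, it reads from an in-memory string instead of the real file. The six rows and the column names (sepal_length and so on) are assumptions standing in for the actual iris CSV, which has 150 rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for the iris CSV file: a handful of rows with
# assumed column names, so the example runs without the course data folder.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO
data = pd.read_csv(io.StringIO(csv_text))

print(data.head())  # first five rows by default
print(data.shape)   # tuple of (number of rows, number of columns)
```

In the notebook itself, the argument to pd.read_csv would be the file path variable rather than a StringIO object.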
What we care about is the number of rows, so we're just going to select the first value of the shape, which in Python is the zeroth value, data.shape[0], and that gives us 150. Just so you know, we could also have taken the length of data with len(data), and that will also be 150.
The next attribute that we pull out is the column names. We can do data.columns and that will give us the values. We bring that to a list with .tolist(), because data.columns is actually an Index type rather than a list type, and a list may just be a little bit easier to look at and to manipulate. For most intents and purposes, if you're just trying to see the column names, you should be fine to just do data.columns. But just to look at it, we can see how we bring this to a list and how that looks a little bit different.
Then finally, we do data.dtypes. For each one of the features that we had earlier, as well as our target variable, we see that the first four are all floats and the final one is an object, which makes sense because it's not a numerical value. Now we print those all out at once.
Question 2 is to examine the species names. Note that, as we just saw when we looked at the head, each species name has "Iris-" in front of it, so Iris-setosa, Iris-versicolor, and so on. We want to remove this portion of the name so the species names are shorter, since the "Iris-" prefix is not adding any extra information. We have a hint that there are multiple ways to do this: you could use either the string processing methods, which is what the solution uses here, or the apply method. I'm going to show you both.
We're going to first look at the species column on its own. We see that, as was mentioned, we have "Iris-" and then a name. What we can do is just eliminate the first five characters, since they're always the same. So I'm going to use .apply first, which is not in the solution here, and we're going to pass it a lambda of x.
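To make the Question 1 attributes concrete, here is a minimal sketch that tries each one on its own. The three-row DataFrame and its column names are assumptions used only so the snippet is self-contained. The real iris data has 150 rows.

```python
import pandas as pd

# Small stand-in for the iris DataFrame (assumed column names, three rows only)
data = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "sepal_width": [3.5, 3.2, 3.3],
    "petal_length": [1.4, 4.7, 6.0],
    "petal_width": [0.2, 1.4, 2.5],
    "species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
})

n_rows = data.shape[0]   # zeroth element of the (rows, columns) tuple
same_n_rows = len(data)  # len() of a DataFrame is also its row count

column_index = data.columns          # a pandas Index object
column_list = data.columns.tolist()  # a plain Python list of the names

print(n_rows, same_n_rows)
print(column_index)
print(column_list)
print(data.dtypes)  # one dtype per column
```

The four measurement columns come back as float64. How the text column is reported can depend on the pandas version, so the lecture's "object" is what to expect in the notebook rather than a guarantee everywhere.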
Our input is going to be x, and we want the output to be everything from position five onwards, counting from zero, so x[5:]. If we run that, we see that we have eliminated the "Iris-" from the beginning of each one of those words. Then we could set, as we do here, the species column of data equal to this.
Now, what the solution does here, rather than using the apply function, is use the string processing methods. Whenever you're working with a column of strings, you generally have the option to use .str, and then there's a multitude of different methods available. Here we're going to use replace, and we're going to replace "Iris-" with an empty string, and that'll do the same thing as before. Again, we haven't replaced it yet. This is what it looks like. Then we can run .str.replace in order to replace the "Iris-" with an empty string, and we set the species column equal to the result, so replacing that column. If we look at the head, we see that the first five rows no longer have "Iris-" as a prefix on each one of the species.
We're going to stop here, and in the next section we'll do Question 3.
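The two approaches to Question 2 can be sketched side by side as follows. The three-row species column is an assumed stand-in for the full dataset, and the names via_apply and via_str are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Assumed stand-in for the species column of the iris DataFrame
data = pd.DataFrame(
    {"species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]}
)

# Option 1: .apply with a lambda that drops the first five characters ("Iris-")
via_apply = data["species"].apply(lambda x: x[5:])

# Option 2: the .str accessor, replacing the literal prefix with an empty string
via_str = data["species"].str.replace("Iris-", "", regex=False)

# Neither call changes the DataFrame until the result is assigned back
data["species"] = via_str

print(via_apply.tolist())
print(data.head())
```

The slicing version relies on every value starting with exactly that five-character prefix, while str.replace removes the substring wherever it appears, so the two agree here only because the prefix is always present and always at the start.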