Welcome to module three. The goal of this lesson is to become familiar with additional column operation nodes, such as string manipulation, missing values, et cetera. We will start with the very well-known Iris data set from the UCI Machine Learning Repository. So we're going to use the File Reader node: we're going to click on Configure and put in the path. This data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher. In 1936, he published a paper titled "The use of multiple measurements in taxonomic problems," which used the data as an example of linear discriminant analysis, and the data set comes from that paper. The Iris data set describes a number of Iris plants by means of four attributes: sepal length, sepal width, petal length, and petal width. The plants described in the data set belong to three different Iris classes: Iris setosa, Iris versicolor, and Iris virginica. This data set has been used for many years as a standard for classification. The three classes are not all linearly separable. Only two of the three classes can be separated by using a linear threshold on two of the four numeric attributes. For the third class, we need to use something more sophisticated than a linear separation. The separation of the first two classes can be clearly seen by using graphic plots, and we will illustrate that in our visualization lesson. This is the reason why this data set is used so commonly to illustrate different functionalities in data exploration within the machine learning framework. As you can see here, the first four columns have numeric values, and the last column has those three different types of Irises. However, as you can see, there are no column names that were imported from the UCI Machine Learning Repository as we read this data set in. So we will also use this lesson to explore string manipulations, and show how to create new rule-based values from the existing column values.
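To make the "no column names" point concrete outside of the workflow editor, here is a minimal Python sketch of reading a headerless file. It is not what the File Reader node does internally; the three sample rows, the function name `read_without_header`, and the placeholder names `Col0` to `Col4` are assumptions chosen only for illustration.

```python
import csv
import io

# Three sample rows in the layout of the raw Iris file: no header line,
# four numeric values followed by the class label.
RAW = """5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""


def read_without_header(text):
    """Parse CSV text that has no header and invent placeholder names."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    # Hypothetical placeholder names, one per column in the first row.
    names = [f"Col{i}" for i in range(len(rows[0]))]
    return names, rows


names, rows = read_without_header(RAW)
print(names)    # placeholder column names
print(rows[0])  # every value is still a string at this point
```

Note that a plain CSV reader hands back every value as text, so the numeric columns would still need converting before any arithmetic.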
So first, we're going to start with replacing values in columns. This Iris data set file does not contain any names for the data columns. The File Reader node then assigns a default name to each column, like column zero, column one, column two, three, and four. Besides column four, where we can see that this is the Iris class, we need to read the file called "iris.names" to really understand which column represents which one of the numerical attributes. So we can go to the UCI data repository website and read the description of this data set. After reading the description, we discover that the five columns are organized as follows: the first one is sepal length in centimeters, the second one is sepal width in centimeters, the third one is petal length in centimeters, the fourth one is petal width in centimeters, and the fifth is the class value. Additionally, the description tells us that there are no missing values in this data set. So the first step is to rename the data set columns in order to be able to talk clearly about what we're doing with the data. KNIME has a node called Column Rename that is designed exactly for this purpose. If you're not sure exactly where to find this particular node, you can browse around the node repository or you can just search for it. Let's search for the Column Rename node. You can see that the search has narrowed down to the Manipulation and Database folders. We're not talking about connecting to a database yet, so we're going to ignore the Database folder and take a look at the Manipulation folder. We have two different nodes: Column Rename, or Column Rename with a regular expression. We're going to use the Column Rename node.
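The column order just described can be written down as a simple lookup table. The following Python sketch is only an illustration of that mapping and of the "no missing values" check; the placeholder names `Col0` to `Col4`, the sample rows, and the helper `has_missing` are assumptions, and an empty cell is the only thing treated as missing here.

```python
import csv
import io

# Three sample rows in the headerless layout of the Iris file.
RAW = """5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# Column order as given in the data set description (lengths in cm).
# The "Col" keys are hypothetical placeholder names.
COLUMN_NAMES = {
    "Col0": "sepal length",
    "Col1": "sepal width",
    "Col2": "petal length",
    "Col3": "petal width",
    "Col4": "class",
}


def has_missing(rows):
    """Return True if any cell is empty (the only 'missing' marker assumed)."""
    return any(cell.strip() == "" for row in rows for cell in row)


rows = [row for row in csv.reader(io.StringIO(RAW)) if row]
default_names = [f"Col{i}" for i in range(len(rows[0]))]
renamed = [COLUMN_NAMES[name] for name in default_names]

print(renamed)
print(has_missing(rows))
```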
So now that we have found the Column Rename node in the Manipulation folder, under Column, we can connect the File Reader node to this node, and then we can right-click on that node to configure it. We can see that this node will allow us to rename any one of the columns. So let's start with the first one. Column zero, as we said, corresponds to the sepal length, and we're going to keep it as a double value. We can double-click on column one and replace its name with sepal width, and we will leave that as a double value as well. The same goes for column two and column three, which become petal length and petal width. For the last column, we know it is the class column. That column we will rename to class and keep as a string. Now that we have configured this node, we can execute it and take a look at the renamed and retyped table. We can see that we have the sepal length, sepal width, petal length, petal width, and the class column. So far, we have created a Column Rename node and connected it to the File Reader node with the Iris data set. We then assigned the new names to the data table columns according to the iris.names specification file, and ran the execute command on that node.
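For readers who think in code, the rename-and-retype step can be sketched in plain Python as follows. This is an analogy to what we configured in the node dialog, not the node's implementation; the sample rows and the name `rename_and_retype` are assumptions, with Python's `float` standing in for the double type and `str` for the string type.

```python
import csv
import io

# Three sample rows in the headerless layout of the Iris file.
RAW = """5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# New name and target type for each column, in file order.
SCHEMA = [
    ("sepal length", float),
    ("sepal width", float),
    ("petal length", float),
    ("petal width", float),
    ("class", str),
]


def rename_and_retype(rows):
    """Turn positional string rows into dicts with named, typed values."""
    table = []
    for row in rows:
        table.append({name: cast(value) for (name, cast), value in zip(SCHEMA, row)})
    return table


rows = [row for row in csv.reader(io.StringIO(RAW)) if row]
table = rename_and_retype(rows)

print(list(table[0].keys()))
print(table[0])
```

Keeping the name and the type side by side in one list mirrors the node dialog, where each column gets both a new name and a type in the same row.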