In this video, we will take a look at an easy-to-use, graphical way to build machine learning models and pipelines. SPSS Modeler Flows, a part of Watson Studio, was inspired by another product, IBM SPSS Modeler. We'll discuss that product in a later unit. Let’s have a look again at the overview of different tool categories. Modeler flows include some data management capabilities, as well as tools for data preparation, visualization, and model building.
All flows are created using a drag-and-drop editor and consist of “nodes” of various types, with data “flowing” from one node to the next according to their connections. The sample Modeler flow shown here includes two data source nodes, shown in purple on the left; type, aggregate, filter, merge, filler, and partition nodes in the middle; and two model-building nodes, shown as pentagons. Once a flow is executed and the models are built, the upside-down pentagon “model nuggets” are created. They can be used to see information about the models and to get predictions for new data. The three green square nodes on the right provide model evaluation information in the form of tables and charts.
You can build your SPSS Modeler flows by dragging different types of nodes from the “palette,” the part of the screen on the left, to the “canvas,” the main part of the screen. Each flow starts with one or more data sources located in the “Import” group, and can include some or all other types of nodes.
Watson Studio provides some sample flows to help new users. In the Drug Study example shown here, we are using a small artificial data set. The target variable is a categorical field, “Drug,” that has five categories, and there are several predictor variables. This flow creates a new “derived” field by dividing the values of one of the predictors by the values of another one, and at the end builds a small neural network model and a decision tree model.
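For viewers who want to see what the “derived” field step amounts to outside the graphical editor, here is a minimal sketch in plain Python. It is not how Modeler flows implement the step; the field names `predictor_a`, `predictor_b`, and `a_to_b_ratio`, the category labels, and the sample values are all hypothetical and chosen only for illustration.

```python
# Hypothetical records standing in for the small artificial data set.
# Field names, labels, and values are illustrative assumptions, not the real sample data.
records = [
    {"predictor_a": 0.75, "predictor_b": 0.25, "drug": "category1"},
    {"predictor_a": 0.50, "predictor_b": 0.50, "drug": "category2"},
    {"predictor_a": 0.25, "predictor_b": 0.00, "drug": "category3"},
]


def add_ratio_field(rows, numerator, denominator, new_name):
    """Return copies of the rows with new_name = numerator / denominator.

    Rows whose denominator is zero get None instead of raising an error.
    """
    derived = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        denom = row[denominator]
        new_row[new_name] = row[numerator] / denom if denom else None
        derived.append(new_row)
    return derived


derived_records = add_ratio_field(records, "predictor_a", "predictor_b", "a_to_b_ratio")
for row in derived_records:
    print(row)
```

In a flow, the same idea is expressed by adding a node between the data source and the model-building nodes, so the new field is available to both models as an input.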
When a user clicks the “Run” button on the top panel, denoted by a triangle, the flow is executed and the models are built. This is reflected in the new pentagon nodes, called “model nuggets,” that appear under each model node. If you click the three dots in the upper right corner of one of those nodes and select “View Model,” you will see various types of model information. By connecting new data sources to the model nugget, you can get predictions on new data.
The first window in the model viewer shows model accuracy and related measures, such as precision and recall. This toy data example enabled us to get perfect accuracy, which is normally not the case with real-life data. The Confusion Matrix view shows how model predictions on the training data matched the observed target values. Once again, in this toy example all cases were classified correctly. We can also look at Model Information, which displays a table with more details about the model. Feature Importance displays a chart that indicates the relative predictive strength of the various model inputs.
Finally, the Network Diagram gives a visual representation of the neural network model we built. On the left is the input layer, with units corresponding to each continuous predictor and each category of the categorical predictors, plus a bias unit, which is usually present in each non-output layer of a neural network. In the middle, we see a “hidden layer” with 7 units, or neurons, and a bias unit. On the right is the output layer with 5 units corresponding to the five target categories. Controls on the right and bottom of the diagram enable some interactive exploration of the model. The colors of the connections between units indicate the values of the weights on those connections.
We can also look at the decision tree model built using the C5 algorithm. A Model Information table and Feature Importance chart appear as before. Additionally, a Top Decision Rules table is displayed.
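The accuracy, precision, recall, and Confusion Matrix views described above for the model viewer can be illustrated with a short plain-Python sketch. This is only an assumption-laden illustration of the standard definitions, not the code Watson Studio runs; the labels `drugA`, `drugB`, and `drugC` and the observed and predicted lists are made up.

```python
from collections import Counter


def confusion_matrix(observed, predicted):
    """Count how often each (observed, predicted) label pair occurs."""
    return Counter(zip(observed, predicted))


def accuracy(observed, predicted):
    """Share of cases where the prediction equals the observed value."""
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


def precision_recall(observed, predicted, label):
    """Precision and recall for one target category."""
    true_pos = sum(o == label and p == label for o, p in zip(observed, predicted))
    predicted_pos = sum(p == label for p in predicted)
    actual_pos = sum(o == label for o in observed)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Hypothetical observed and predicted target values (not the sample flow's data).
observed = ["drugA", "drugB", "drugA", "drugC", "drugB", "drugA"]
predicted = ["drugA", "drugB", "drugB", "drugC", "drugB", "drugA"]

matrix = confusion_matrix(observed, predicted)
print("accuracy:", accuracy(observed, predicted))
print("drugB precision/recall:", precision_recall(observed, predicted, "drugB"))
print("drugA observed but drugB predicted:", matrix[("drugA", "drugB")])
```

Unlike the toy example in the video, this made-up data contains one misclassified case, so the measures are not all perfect.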
Decision tree models are popular because their structure makes it easy to explain predictions or extract decision rules. The tree diagram is also displayed.
On the left side of the canvas, we see the part of the palette containing model nodes that can be used in flows. At the top are the “Auto Classifier” and “Auto Numeric” nodes, which can be used for categorical and continuous targets, respectively. Those nodes build several kinds of models and pick the best one based on a chosen criterion. Later, we will talk about the AutoAI feature of Watson Studio; AutoAI takes this capability to the next level by automatically finding not only the best model, but an entire data pipeline, which includes various data transformations.
In this video, you've learned how Modeler flows in Watson Studio can help analysts create powerful machine learning pipelines using a graphical interface, without the need to write any code. This feature was based on IBM SPSS Modeler. Next, after completing a lab to give you hands-on experience with this powerful technology, we will take a look at two other IBM products that can be used for data science: IBM SPSS Modeler and IBM SPSS Statistics.
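As a recap of the Auto Classifier idea mentioned above (build several kinds of models, then keep the best one by some criterion), here is a minimal plain-Python sketch. It makes no claim about how the Auto Classifier node works internally: the two toy "models," the field names `ratio` and `drug`, the data, and the choice of held-out accuracy as the criterion are all assumptions for illustration.

```python
from collections import Counter


def fit_majority(rows, target):
    """Baseline model: always predict the most common target value."""
    most_common = Counter(r[target] for r in rows).most_common(1)[0][0]
    return lambda row: most_common


def fit_threshold(rows, target, field):
    """One-split model: predict the majority label on each side of the field's mean."""
    mean = sum(r[field] for r in rows) / len(rows)
    low = [r for r in rows if r[field] <= mean]
    high = [r for r in rows if r[field] > mean]
    low_model = fit_majority(low or rows, target)
    high_model = fit_majority(high or rows, target)
    return lambda row: low_model(row) if row[field] <= mean else high_model(row)


def accuracy(model, rows, target):
    """Share of rows where the model's prediction matches the target."""
    return sum(model(r) == r[target] for r in rows) / len(rows)


def auto_select(candidates, train, test, target):
    """Fit every candidate on train, score on test, return the best name and all scores."""
    scores = {name: accuracy(fit(train), test, target) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical training and held-out rows.
train = [
    {"ratio": 1.0, "drug": "drugX"}, {"ratio": 2.0, "drug": "drugX"},
    {"ratio": 8.0, "drug": "drugY"}, {"ratio": 9.0, "drug": "drugY"},
    {"ratio": 10.0, "drug": "drugY"},
]
test = [
    {"ratio": 1.5, "drug": "drugX"}, {"ratio": 9.5, "drug": "drugY"},
    {"ratio": 2.5, "drug": "drugX"},
]

candidates = {
    "majority": lambda rows: fit_majority(rows, "drug"),
    "threshold": lambda rows: fit_threshold(rows, "drug", "ratio"),
}
best, scores = auto_select(candidates, train, test, "drug")
print(best, scores)
```

The selection criterion is the only thing `auto_select` needs to know about the candidates, which is why swapping in a different measure, such as recall for one category, would not change its structure.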