Hello, and welcome back. In this video, we will be looking at the workflow of the Intel Distribution of OpenVINO toolkit through a simple inference example. The goal of this video is to familiarize you with the workflow, so I will only give a cursory introduction to each component of the toolkit. Don't worry, we will revisit each component in depth later. Focus on understanding where each component fits in the workflow.

Let's start by looking at the main workflow for the toolkit. In the last video, I discussed how the toolkit acts as a go-between for various frameworks and their models on one side and various hardware on the other. This process is done in two steps: converting models and running inference. The model conversion step takes the original machine learning model from a framework and converts it to a format called intermediate representation, or IR. As models generally do not change, having this separated from inference means that you only need to convert once per new model. In the inference step, the IR files are read and the model is deployed to the hardware of your choice. Regardless of framework and regardless of hardware, this is the general flow of the deployment.

We will now begin using some of the tools in the toolkit, so let's briefly discuss setting it up. The Intel Distribution of OpenVINO toolkit is a free download, and you can find the download link in the slides. The installation procedure is generally simple, but depends on the OS you have. There is a link in the slides to documentation on how to install. Once you have the toolkit installed, you need to set up your environment. This step also depends on the OS you have, and I'll be covering the procedure for Linux in this video. For other OSs, check the toolkit documentation. On a Linux system, there is a utility script called setupvars in the bin directory of your toolkit installation that will do all the necessary configuration. You either have to run this script every time you wish to use the toolkit, or you can set up your bash environment to run it automatically.

With the toolkit set up, we move on to the conversion step. This conversion is done by a tool called the Model Optimizer. The Model Optimizer is a Python executable; here I have used the default install path of /opt/intel, so it is located at /opt/intel/openvino/deployment_tools/model_optimizer. The executable we need is mo.py. There are other executables that begin with mo followed by a framework name, but those are deprecated. I have prepared a model that I would like to use for inference and placed it in a directory called models that I created. To convert this model to IR, the only required argument is --input_model. The model conversion generally takes less than a minute. Once complete, the output is written to the current directory; you can also specify where it goes by using the --output_dir argument. That is it. Now we have an IR to use for our inference. The Model Optimizer generally creates two to three files with the information necessary for the inference part. One thing to note here is that I did not even have to specify the original framework in this case. The model I brought happened to be a Caffe model, but it could have been a TensorFlow model and the procedure would have been nearly identical.
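To make the conversion step concrete, here is a minimal sketch that drives mo.py from a small Python script, simply to keep all of the examples in this course in Python; in practice you would normally run the same command directly in a terminal. The install path is the default /opt/intel location mentioned above, while the model file models/my_model.caffemodel and the output directory models/ir are placeholder names for this illustration, not files from the video.

```python
# Sketch: converting a model to IR by invoking the Model Optimizer script.
# Assumptions: default install path /opt/intel/openvino, environment already
# set up with setupvars, and a Caffe model at models/my_model.caffemodel
# (placeholder name) -- adjust both paths for your own setup.
import subprocess
import sys

MO = "/opt/intel/openvino/deployment_tools/model_optimizer/mo.py"

result = subprocess.run(
    [
        sys.executable, MO,
        "--input_model", "models/my_model.caffemodel",  # placeholder model file
        "--output_dir", "models/ir",                    # where the IR files go
    ],
    check=False,
)
print("Model Optimizer exit code:", result.returncode)
```

Either way you run it, the result is the .xml and .bin IR files that the inference step consumes next.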
The next step in the workflow, inference, is done through the Inference Engine library. The Inference Engine library provides a set of classes for deploying the machine learning model on a compute node of our choice. We will be using the Python version of the library, but there is a C++ version as well. In the context of the workflow, this is the step where we load the model from the IR files generated by the Model Optimizer and deploy it to the hardware of our choice. We will be creating a Python object representing the model and an object representing the interface to the hardware.

The first step is to create an IECore object. This object is the main tool you will use to prepare an inference workload for your devices. Next, we need to read in the IR model. This is done with the read_network method of the IECore object, which takes the paths to the .xml and .bin IR files as arguments. Note that the syntax and the workflow are identical regardless of whether the original model was a TensorFlow model, a Caffe model, or some other model. As long as you have IR versions of the model, you simply need to specify the paths to them.

Now that we have the interface to the hardware in the form of the IECore and the model in the form of an IENetwork, we can combine these to create an executable network object. This is the object that handles the actual execution of inference. The executable network is created with the load_network method of the IECore object. The method needs two inputs: the IENetwork object we created earlier and a string specifying which device or devices to use for the inference. The important point here is that the choice of hardware boils down to just a string input for this method: passing the string "CPU" will deploy the model to the CPU, and passing "GPU" will deploy the model to the GPU. By making this an input argument of your script, you can keep a single source code that can deploy to multiple hardware types.

Next, let's run the inference. The executable network has several different methods for running inference, but the simplest way to use the executable network class is to call the infer method. The input here is the image or images that we want to run inference on. We'll discuss the inputs of this method in depth in a later video. The important thing for this lesson is that, once again, the step is identical regardless of the choice of hardware or framework. Finally, the output of this infer method contains the multi-dimensional arrays with the numerical output of the network. Remember, though, that this output has to be interpreted separately, as that is not part of the toolkit workflow.

The workflow I described in this video, and the toolkit components used for it, covered the simple case of taking a pre-trained model and running inference. However, the toolkit can help facilitate other tasks in developing computer vision applications with a wide range of components such as the DL Workbench, the Post-Training Optimization Tool, and so on. These other components are outside the scope of this course, but you can find more information on them in the toolkit documentation. This brings us to the end of the basic workflow introduction. Now, we will begin taking a closer look at each component. In the next video, we will be focusing on the Model Optimizer.
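Pulling the inference steps above together, here is a minimal sketch of the Inference Engine Python API as described in this video. The IR file names models/ir/my_model.xml and .bin are placeholders, the dummy input stands in for a real preprocessed image, and the input_info attribute assumes a 2020-era release of the toolkit (older releases expose net.inputs instead).

```python
# Sketch: loading an IR model and running inference with the Inference Engine
# Python API. File names below are placeholders; adjust them for your own IR.
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()  # interface to the available inference hardware

# Read the IR produced by the Model Optimizer (.xml topology + .bin weights).
net = ie.read_network(model="models/ir/my_model.xml",
                      weights="models/ir/my_model.bin")

# The target device is just a string: "CPU", "GPU", "MYRIAD", ...
exec_net = ie.load_network(network=net, device_name="CPU")

# Build a dummy input with the shape the network expects (N, C, H, W);
# in a real application this would be a preprocessed image.
input_blob = next(iter(net.input_info))
n, c, h, w = net.input_info[input_blob].input_data.shape
dummy_image = np.zeros((n, c, h, w), dtype=np.float32)

# infer() returns a dict mapping output blob names to numpy arrays;
# interpreting those numbers is up to the application.
results = exec_net.infer(inputs={input_blob: dummy_image})
for name, array in results.items():
    print(name, array.shape)
```

Note how switching hardware only means changing the device_name string, which is exactly the point made above: the same script can target the CPU, a GPU, or another supported accelerator.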