Hello and welcome back. In this video, we will begin our discussion of the inference engine. We briefly went over some of the necessary components in an earlier video, but here we will be going into depth on those components. Before we begin, let's quickly come back to the workflow diagram to remind ourselves of where the inference engine fits in. The main function of the inference engine is to deploy the machine learning model on a compute node of our choice, and the source code is designed to remain the same regardless of the hardware you choose. First and foremost, I want to quickly share the Python API reference page in the documentation. We'll be going over various inference engine classes today, so if you ever forget something or want to dig in deeper, I recommend taking a look there first. Here, we have the example code that we ended with in the introductory video. We first load the IR into an IENetwork, create an IECore object, select a device and generate an executable network, and finally run inference. In this video, we will discuss the first three steps in more depth, as well as some implementation details that we skipped over during the workflow overview. The executable network will be covered in detail in the next video. First, we create an IENetwork object. This is where we load the IR files that we created using the model optimizer. That step is fairly straightforward and only requires the paths to the files. It is also possible to load models from variables by setting init_from_buffer to True and passing in the contents instead of the file paths. Once created, we have access to the inputs attribute of the IENetwork object. This holds a dictionary of all input layers and information about them. The key of this dictionary is the name of the input layer in the original model, and the value is an instance of an IE layer object.
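The loading step just described can be sketched as follows. This is a minimal sketch assuming the 2019-era OpenVINO Python API; the file names "model.xml" and "model.bin" and the input name "data" are hypothetical. The IENetwork construction is wrapped in a function (with a deferred import) so the sketch stands on its own without the toolkit or real IR files present, and the dictionary access is simulated with plain data.

```python
def load_ir(xml_path="model.xml", bin_path="model.bin"):
    # Deferred import: only needed when this function is actually called
    from openvino.inference_engine import IENetwork
    # Load the IR produced by the model optimizer (topology + weights)
    net = IENetwork(model=xml_path, weights=bin_path)
    # net.inputs maps input-layer names to layer objects carrying a .shape
    for name, layer in net.inputs.items():
        print(name, layer.shape)
    return net

# The inputs attribute behaves like an ordinary dictionary. With a
# single-input model, the one entry can be picked out like this
# (simulated here with hypothetical stand-in data):
inputs = {"data": (1, 3, 224, 224)}   # stand-in for net.inputs
input_name = next(iter(inputs))       # name of the first (and only) input
n, c, h, w = inputs[input_name]       # unpack the NCHW shape tuple
```

For a multi-input model, `next(iter(inputs))` would only give an arbitrary first entry, which is why you need to know the layer names to pick out a particular input.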
For our discussion, all we need to know about the IE layer object is that it is the toolkit's representation of a layer, and it has a shape attribute, which gives you a tuple containing the input shape. This dictionary contains all input layers. If your model only has one input, then it is the first and only item in the dictionary. But if your model has multiple inputs, there will be multiple entries corresponding to them, so you will have to know the names of the input layers to pick out a particular one. Finally, it is possible to reshape the input of the layer here as well. A common use case for this is to change the batch size. Batch size controls how many images to process at once and can significantly affect performance. We will revisit this in the second part of the course. Next, let's discuss the IECore object. The constructor for this class is simple; there are no required arguments. The IECore class is where you load any custom layer extensions for the inference engine that you need. These extensions are loaded with the add_extension method of the IECore class. The method takes the path of the extension that you would like to use as well as the device to load the extension for. As a side note, even if you did not write your own, you may need to load a CPU custom extension that is provided by the toolkit for some combinations of model, CPU version, and toolkit version. Then we can use the load_network method to load the IENetwork we created earlier. The inputs for this method are the IENetwork object and a string signifying the device to use. As I'm making this video, there are six supported types of devices: CPU, GPU, FPGA, Myriad, HDDL, and GNA. We will discuss the various devices in the second part of this course. As newer versions of the toolkit come out, this list of supported devices may change. The full list for the most recent version of the toolkit can be found in the online documentation. The core also has a usage mode that supports using multiple devices for inference.
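The IECore steps above can be sketched like this. It is a sketch assuming the 2019-era API, wrapped in a function so it only runs when a real network is available; the device name and extension path arguments are hypothetical placeholders.

```python
def build_executable_network(net, device="CPU", cpu_ext=None):
    # Deferred import so the sketch can be defined without the toolkit installed
    from openvino.inference_engine import IECore
    ie = IECore()  # the constructor takes no required arguments
    if cpu_ext is not None:
        # Load a custom-layer extension library for the given device
        ie.add_extension(cpu_ext, device)
    # Deploy the network on the chosen device; returns an executable network
    return ie.load_network(network=net, device_name=device)

# The six device strings supported at the time of the video (2019 R3);
# newer toolkit versions may change this list.
SUPPORTED_DEVICES = ["CPU", "GPU", "FPGA", "MYRIAD", "HDDL", "GNA"]
```

Because the source code stays the same across hardware, switching targets is just a matter of passing a different device string to this function.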
We will also cover this option in depth in the second part of the tutorial. Finally, one additional note on IECore. IECore was introduced in the Intel Distribution of OpenVINO Toolkit 2019 R2 update. The functionality of IECore was largely handled by a class called IEPlugin before this update. So if you are following an older example or viewing an older forum question, you may see implementations with IEPlugin instead of IECore. As of this writing, the most recent release, Intel Distribution of OpenVINO Toolkit 2019 R3, still provides support for IEPlugin. However, we will only use IECore in this tutorial. Now that we have a network and core ready, let's go over a couple of implementation details needed to run inference that were not discussed in the simple example shown in the introductory video. First is device-supported layers. This is separate from the custom layers that we discussed for the model optimizer. Some devices, like FPGAs, have a couple of layers they can't process due to hardware limitations. So it is good practice to check that all the layers in the model can be run on the specified hardware. This is done using the query_network method of the core object. This method takes in an IENetwork object and a device string, and returns all the layers that are supported. So we must check to make sure that all layers in the network are in the supported list. This slide shows a one-liner recipe you can follow to run this check. Next is image pre-processing. The inference engine expects a NumPy array with the shape required by the input layers. When you load an image, whether from a file or from a video stream, chances are that your image is not in the correct shape for the network. Additionally, the ordering of the dimensions may also be wrong depending on the model. The toolkit generally prefers NCHW, or image number, color channel, height, and width order for the dimensions. But some models may use the NHWC format, where the color channel comes last.
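The supported-layer check can be sketched as below. The layer names are hypothetical stand-ins: in the real API, `supported = ie.query_network(network=net, device_name=device)` would provide the supported layers, and the network object lists every layer in the model.

```python
# Stand-in for the result of ie.query_network(network=net, device_name="CPU")
supported = {"conv1", "relu1", "pool1"}
# Stand-in for the full set of layers in the model ("exotic_op" is hypothetical)
all_layers = {"conv1", "relu1", "pool1", "exotic_op"}

# The one-liner check: collect every layer not in the supported collection
unsupported = [layer for layer in all_layers if layer not in supported]
if unsupported:
    print("Layers not supported on this device:", unsupported)
```

If `unsupported` is non-empty, the model cannot run as-is on that device, and you would need a custom extension or a different target.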
So often, you have to do some pre-processing on the input image to convert it to the expected format. We discussed how to get the input layer shape from the IENetwork earlier. Using this information, you can use tools like OpenCV to process the input into the right shape and format. Let's take a look at an example where we have loaded the input image using OpenCV. First, we find what the relevant C, H, and W dimensions are. Then we resize the input image to the correct height and width. Additionally, OpenCV loads the image with the color channel last, so we need to reorder the dimensions using the transpose function. For more information on NumPy and OpenCV functions, refer to their respective documentation. With the image now pre-processed and the core and network we created earlier, we are ready to run inference. We will cover this in the next video.
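The pre-processing steps just described can be sketched as follows. The shapes are hypothetical, and a NumPy array stands in for an image loaded by OpenCV in HWC (height, width, channel) order; with OpenCV installed, `cv2.resize(image, (w, h))` would do the resizing, which is replaced by a simple crop here so the sketch runs on its own.

```python
import numpy as np

# Target shape, as read from the IENetwork inputs dictionary (hypothetical)
n, c, h, w = 1, 3, 224, 224

# Stand-in for cv2.imread(...): a 480x640 image in HWC order
image = np.zeros((480, 640, 3), dtype=np.uint8)

# resized = cv2.resize(image, (w, h))   # the real step, with OpenCV
resized = image[:h, :w, :]              # crop as a runnable stand-in

chw = resized.transpose((2, 0, 1))      # HWC -> CHW: move channels first
batched = chw.reshape((n, c, h, w))     # add the batch dimension: NCHW
```

After these steps, `batched` has exactly the NCHW shape the input layer expects and can be fed to the executable network.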