Hello, and welcome back. In this video, we'll be taking a closer look at the model optimizer. We will go over more details on some of the options you may need when converting models. But before we get into the optimizer, remember that we need a model to start with. So let's briefly discuss the model downloader tool from the Intel Distribution of OpenVINO toolkit. This is a quality-of-life utility that lets you download publicly available pre-trained models.

The downloader is a Python executable. Let's briefly go over a few of the most important flags. First, the --print_all flag shows you all the available models that you can download. Once you find the model you want to use, use the --name flag to download that specific model. Some models have different versions; if you want to download more than one, you can download them separately or use pattern matching to download them all. The files are downloaded to the user's home directory by default, but the -o flag can be used to download to a different location. And finally, if you need more information on any flag, you can use the -h flag to see the help menu.

Now let's convert the downloaded model with the model optimizer. We covered basic usage in the last video, but let's do a quick recap. The --input_model flag specifies the model that you want to convert, and the resulting IR is written to the current directory unless otherwise specified using the -o flag. Be careful not to accidentally use the --output or --input flags; those are different flags. The optimizer will produce the model with float, or FP32, precision, but this can be changed to half precision using the --data_type flag. Some hardware has specific requirements for the precision of the model. You can find the reference table for this in the documentation; the link is on the slides. If you are producing both float and half-precision IR files, be careful not to have your second set of IR files overwrite the first. Put the files in different directories using the --output_dir flag, or use the --model_name flag to give them different names.

During the conversion, the model optimizer, true to its name, will apply computational optimizations to the deep learning model when applicable. Now, optimization has two meanings in the context of machine learning and computing: it can mean improving the accuracy of the model, or it can mean speeding up the application. When I say optimization in this tutorial, I'll be referring to the latter kind unless otherwise specified. These optimizations are model-level optimizations and are generally hardware agnostic. A discussion of the exact optimizations would require knowledge of deep learning, so it is beyond the scope of this video. But for those interested, there is a detailed explanation of the optimizations in the Intel Distribution of OpenVINO toolkit documentation. Note that these optimizations are applied by default, but they can be disabled using flags if need be.

Now, in general, simply specifying the input model is enough for the model optimizer. It will grab whatever information it needs from the model files themselves. However, there are some cases where you may need to specify additional flags. The full catalog of model optimizer options can be found by running the optimizer with the -h flag. We'll now go over some of these situationally required flags and when they are needed.
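Before we do, here is a minimal sketch of the commands we have covered so far. The flags are the ones discussed above, but the model name, the paths, and the script locations are placeholders, so substitute the details for your own model and installation.

# List every model the downloader can fetch
python3 downloader.py --print_all

# Download one specific model to a chosen directory (placeholder name and path)
python3 downloader.py --name squeezenet1.1 -o ~/models

# Basic conversion: the FP32 IR goes to its own directory
python3 mo.py --input_model squeezenet1.1.caffemodel --output_dir ir/FP32

# The half-precision IR goes elsewhere so it does not overwrite the FP32 files
python3 mo.py --input_model squeezenet1.1.caffemodel --data_type FP16 --output_dir ir/FP16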
First is the original framework that produced the model. The model optimizer uses the file extensions of the input files to determine the framework. But if it is unable to determine the original framework, maybe due to a non-standard file name, you may have to specify it manually using the --framework flag. The list of frameworks supported by the installed version of the toolkit can be found with the -h flag. On the version I have, 2019 R3, the supported options are TensorFlow, Caffe, MXNet, Kaldi, and ONNX.

Next is the input shape. With the --input_shape flag, as well as the more general --input flag, you can specify the shape of the model's input array. This option is for when the input size of the model is variable. For example, some frameworks allow a value of -1 or None for some dimensions of the input shape, but the model optimizer requires the values to be specified. The shape should be the one used in the original model. If you are not sure what shape was used, the value can generally be found on the website or repository the model came from. Alternatively, for some frameworks, it can be found by inspecting the model; please refer to your framework's documentation on how to do this.

Next are the scale and mean values. In most neural networks, the input images are scaled to within a range, typically negative one to one or zero to one. Additionally, most neural networks require that the mean values, the average RGB values of the input, be subtracted from the images. For some networks, these modifications are built in; there are layers that take care of them. For others, this processing is not included. In that case, you can either pre-process the inputs manually in your application, or add the processing to the IR using the --scale_values and --mean_values flags.

The final general flag we'll cover is the --batch flag, which sets the batch size to be used during inference. The number you use here will likely affect performance, but it may also be limited by the circumstances of the application. We'll discuss this flag in more detail in the second course, when we discuss optimization.

Finally, in addition to the general flags, the model optimizer also has framework-specific flags. These, of course, are situational depending on the framework, so we will not go into depth here. The help menu has a full catalog of these as well, so check there for more information. There is also a consolidated example below that pulls several of these flags together.

This wraps up our discussion of general model optimizer usage. In the next video, we will begin the discussion of what to do if your model cannot be converted with the model optimizer.
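Here is the consolidated sketch mentioned above: a single conversion command that combines several of the situational flags from this video. The frozen TensorFlow graph name, the input shape, and the mean and scale values are placeholders for illustration, not values to copy.

# Placeholder TensorFlow model; the framework is stated explicitly, the variable
# input shape is pinned to fixed values, and mean/scale preprocessing is baked into the IR
python3 mo.py \
    --framework tf \
    --input_model frozen_model.pb \
    --input_shape "[1,224,224,3]" \
    --mean_values "[127.5,127.5,127.5]" \
    --scale_values "[127.5,127.5,127.5]" \
    --data_type FP16 \
    --output_dir ir/FP16
# The --batch flag (for example, --batch 1) can set the batch size instead of
# fixing the batch dimension inside --input_shape; the two should not be combined.

Baking the mean and scale values into the IR this way is a trade-off: it saves preprocessing code in your application, but those values are then fixed for every input the model sees.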