Hello, my name is Nikhil Venkatesh. Welcome to Module 5, OpenCL on Intel FPGAs. In the previous modules we covered writing and compiling kernels. By the end of this module you will be able to use the Intel FPGA SDK for OpenCL to compile OpenCL kernels to target an FPGA. You will be able to describe the tools in the SDK that are used to analyze the results of OpenCL compilation, and you will also be able to debug an FPGA OpenCL kernel for functionality and performance. Here's the class agenda, showing what we have covered so far and where we are in the class, which is our final Module 5. In Lesson 1 of Module 5, the topics we will cover are the OpenCL SDK, host and kernel compilation, the AOCL utility, and runtime tools. Intel provides a complete solution for implementing and running OpenCL kernels on an FPGA. The developer writes two pieces of code that work together: the host code, which runs on a standard CPU such as an x86 or ARM, and the kernel code, which runs on the FPGA. The host code is compiled using any standard C compiler, linking to the Intel FPGA OpenCL libraries. The host code tends to be the vast majority of the lines of code, but it is usually not as performance critical as the kernel code. The Intel FPGA SDK for OpenCL offline compiler builds the kernel structure, data paths, and memory structures for the FPGA. The SDK has several components to help you build your FPGA images. On the kernel side, Intel provides the offline kernel compiler, whose functionality we've already covered. On the host side, Intel provides the OpenCL host platform and runtime API to be used by your OpenCL host application. It consists of statically and dynamically linked libraries. When compiling your host code using one of the standard C compilers, you'll need to link to these libraries. The SDK also has several other components that allow you to develop and run your application.
The AOCL utility is an executable that can perform many common tasks related to the board, the drivers, and the compile process. We'll examine this tool in more detail later. Intel Code Builder is a plug-in for the Visual Studio or Eclipse IDEs that makes your coding and development easier. As for the requirements on the software side, to use the tools you will need either a 64-bit Windows or Enterprise Linux host with the Intel FPGA SDK for OpenCL installed. You will also need to download and install the appropriate device files for your board. You will also need a C compiler such as Microsoft Visual Studio or GCC to compile your host code. These compilers will compile or cross-compile your host program, targeting the appropriate platform and linking to the libraries provided in the Intel FPGA SDK as well as the board vendor's libraries. For all platforms except a system-on-chip (SoC) host, you will need to generate 64-bit binaries. For the SoC-based host, 32-bit compilers are necessary. If you would like to use an IDE with the Code Builder plug-in to develop your code, you will need either Visual Studio or Eclipse installed. The contents of the Intel FPGA SDK for OpenCL are listed in this table. By default, all of the files are installed in the HLD subdirectory of the Intel Quartus Prime installation, where HLD stands for high-level design. Within the HLD directory, the bin folder is the binary folder containing the main offline compiler and the utility tools and executables. The bin directory also contains the runtime dynamically linked library, so it should be included in your search path. The board directory is where all the design files related to a specific board are located. If you install a BSP, a board support package from a board vendor, the board platform files will be located in this directory. The IP directory contains the IP cores required for the kernel compilation.
These will be stitched together during the compile process to create your custom dataflow circuit. The host directory and its various subfolders include the operating-system-specific files used during host program compilation. Now, let's review compiling kernels to make our discussion of the software development kit complete. On the kernel side, when compiling, make sure all kernels targeting one board are combined into a single top-level source. Then run aoc along with the board option to generate an AOCX file. Once the AOCX file is generated, make sure the host program uses it with the clCreateProgramWithBinary function. In normal operational mode, when clCreateProgramWithBinary is called at the host, the AOCX file is used to program the FPGA, either across PCIe or using the FPGA manager on SoC devices. If you use the emulator mode to create the AOCX file, the AOCX file, instead of representing an FPGA bitstream, is a library that can be dynamically linked to the host program and run on the host. Before using the Intel FPGA SDK for OpenCL, it is important to understand the hardware requirements. The easiest way to get started is to use an Intel FPGA preferred board for OpenCL. These are boards designed by our partners, who also provide a board support package compatible with the OpenCL SDK. After you have an FPGA board, download the board support package from that vendor. The BSP includes hardware information required by the OpenCL compiler. This includes precompiled peripherals, including interfaces to and from your custom kernels. It also includes a software layer that provides libraries which allow your OpenCL host calls to be relayed to the board. Finally, it also includes the PCIe drivers necessary for host-to-device communication. Besides the accelerator board, you'll also need a host and a method of host communication. These FPGA boards typically use PCIe for host-to-device communication.
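As a rough illustration of the host-side step described earlier in this lesson, here is a minimal C sketch of how a host program might read an AOCX file into memory and hand it to clCreateProgramWithBinary. The helper names read_binary_file and create_program_from_aocx and the USE_OPENCL build macro are assumptions made up for this example; they are not part of the SDK. Only the part guarded by USE_OPENCL needs the OpenCL headers and libraries.

```c
#include <stdio.h>
#include <stdlib.h>
#ifdef USE_OPENCL            /* hypothetical macro: define it when building against OpenCL */
#include <CL/opencl.h>
#endif

/* Hypothetical helper: read a whole file (for example an .aocx) into a
 * malloc'd buffer. Returns NULL on failure or if the file is empty.
 * On success the caller owns the buffer and must free() it. */
unsigned char *read_binary_file(const char *path, size_t *size_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Find the file length by seeking to the end. */
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long len = ftell(fp);
    if (len <= 0) { fclose(fp); return NULL; }
    rewind(fp);

    unsigned char *buf = malloc((size_t)len);
    if (buf == NULL) { fclose(fp); return NULL; }

    if (fread(buf, 1, (size_t)len, fp) != (size_t)len) {
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    *size_out = (size_t)len;
    return buf;
}

#ifdef USE_OPENCL
/* Hypothetical helper: create and build a program for one device from an
 * AOCX file. Returns NULL on any failure. */
cl_program create_program_from_aocx(cl_context context, cl_device_id device,
                                    const char *aocx_path)
{
    size_t size = 0;
    unsigned char *binary = read_binary_file(aocx_path, &size);
    if (binary == NULL)
        return NULL;

    const unsigned char *binaries[] = { binary };
    cl_int binary_status = CL_SUCCESS;
    cl_int err = CL_SUCCESS;

    /* Hand the binary image to the runtime instead of compiling source. */
    cl_program program = clCreateProgramWithBinary(
        context, 1, &device, &size, binaries, &binary_status, &err);

    if (program != NULL && err == CL_SUCCESS && binary_status == CL_SUCCESS) {
        /* The build step is still required, even for a precompiled binary. */
        err = clBuildProgram(program, 1, &device, "", NULL, NULL);
    }
    free(binary);

    if (program != NULL && (err != CL_SUCCESS || binary_status != CL_SUCCESS)) {
        clReleaseProgram(program);
        program = NULL;
    }
    return program;
}
#endif
```

A host program would call create_program_from_aocx with an existing context and device, passing the path to the AOCX file produced by the offline compiler, and would then create kernels from the returned program with clCreateKernel.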
Since the board talks to the host over PCIe, this usually means that a PCIe port on the host machine is required. Next, guaranteed timing closure. Software developers writing OpenCL code do not want to be concerned with FPGA and hardware development details such as meeting timing requirements. Therefore, platforms must always meet timing requirements by adjusting the phase-locked loops, the PLLs, that drive the FPGA clocks. With OpenCL platforms you always make timing. The board support package automatically adjusts the PLL that drives the kernel clock, so software engineers never have to worry about timing. To compile the host code, you'll need to use a standard C compiler such as Microsoft Visual Studio or GCC. When you compile, make sure the INTELFPGAOCLSDKROOT/host/include directory is visible to the compiler. To simplify this, you can call aocl compile-config in your makefile or in your compile command. In your host C source code, include the CL/opencl.h header file. Lastly, when linking to create the executable, make sure the OpenCL libraries and the host directory that contains them are provided to the linker. The AOCL utility is again helpful with this. You can use it with the link-config option to specify the directory in your makefile. Let's now talk about some capabilities of the AOCL helper utility. The AOCL utility has many features related to driver installation, flash programming, and compilation. aocl compile-config and aocl link-config are commands that return a set of flags for compiling and linking the host program, respectively. aocl makefile will output a makefile fragment for compiling and linking an example host program using the GCC tools. The aocl install command will automatically install the drivers needed for your specific board. The aocl diagnose command will execute a vendor-specific test program to ensure the board is connected and running properly. The aocl flash command will program the on-board flash with an appropriate FPGA image, so that when the board powers up it can communicate with the host.
The aocl report command is used to view the kernel compilation reports, which we will get into in more detail coming up. The host can program the FPGA with the kernel code in a few ways. A valid OpenCL-compatible image must be configured onto the FPGA prior to host application execution in order to establish host-to-device communication. The host may then overwrite the core of the FPGA with a new kernel circuit. aocl program is used to program the FPGA directly, and this is usually across the PCIe interface or a JTAG cable. aocl flash is a command that programs the user region of the flash on the board. The FPGA loads from this flash upon power-up. Now that we've seen all the tools, here's how it all fits together. Here's the entire flow necessary to compile and run OpenCL using the Intel FPGA SDK for OpenCL. Everything in the top row is a setup task. You first need to install the Intel FPGA SDK for OpenCL and the device files to support the board you will use. Then install a standard C compiler. Lastly, you need to install the BSP you have chosen for your platform. If you're using an Intel FPGA development kit, the BSP might have been installed for you along with the OpenCL SDK. Then run aocl install to install the drivers for your accelerator card. The second and third rows are the development tasks. When everything is set up, you will use the standard C compiler to compile your host program for your host platform. Then you develop your kernel code, first compiling it using the emulator feature. This mode gives you the very quick turnaround time necessary to perform functional debug. After functionality has been achieved, you can use the offline compiler to compile the kernel with the profiling feature turned on. This feature allows you to run the host and the kernel on the board to verify performance.
Once you run the application and you see the performance and the metrics you are getting on your code, you can go back and make changes, recompile, re-emulate, and rerun the compilation and profiler to see how to improve your code and get better performance. Thank you. This has been Lesson 1 of Module 5.