Welcome back. In this lecture we'll talk about programmable data planes and the role of an intermediate representation in compiling high-level languages like P4 down to various hardware targets.
First, let's talk about why we need a programmable data plane. Emerging data plane devices enable increasingly flexible control over how they process packets: defining custom packet header formats, altering the number of stages in a packet processing pipeline, adding state such as tables to various stages in the pipeline, and specifying new data plane functions for packet processing. The fixed nature of OpenFlow devices makes it hard to add new protocols or to remove protocols that are no longer needed. Recently, many new protocols are being developed to serve different applications in data center and enterprise networks, including GRE, VXLAN, and BFD. There's a need to rapidly deploy these protocols in the network without waiting for new OpenFlow specifications or chips to spin up, which may take months, if not years, before they're actually deployed in practice. Therefore, network devices need some type of programmability to allow for rapid deployment of these protocols.
The data plane device should be able to operate on arbitrary packet locations and provide a means for specifying packet operations using high-level network policy. For example, the policy shown here tells the data plane to read the VXLAN and inner Ethernet headers of the packet and compare the ID in the VXLAN header with some ID X. If there's a match, it removes the VXLAN header from the packet and forwards the packet through port 0 if the inner Ethernet header matches. Given such programmable devices, the challenge then is how to compile these high-level policies to such variable targets.
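To make the example policy concrete, here is a minimal sketch in Python. It is not the policy language from the slide: the `Packet` class, its field names, and `apply_policy` are all hypothetical, and a real data plane would operate on header bytes rather than on an object like this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    # Hypothetical, heavily simplified packet: only the fields the policy reads.
    vxlan_id: Optional[int]          # ID from the VXLAN header, None if absent
    inner_eth_dst: str               # destination in the inner Ethernet header
    out_port: Optional[int] = None   # None means "no forwarding decision yet"


def apply_policy(pkt: Packet, match_id: int, match_eth_dst: str) -> Packet:
    """Decapsulate on a VXLAN ID match, then forward on an inner Ethernet match."""
    if pkt.vxlan_id is None or pkt.vxlan_id != match_id:
        return pkt                   # no match: leave the packet untouched
    pkt.vxlan_id = None              # "remove" the VXLAN header
    if pkt.inner_eth_dst == match_eth_dst:
        pkt.out_port = 0             # forward through port 0
    return pkt
```

The point of the sketch is only that the policy names headers and fields, not fixed protocol support baked into the device; how those reads and writes map to hardware is the compilation problem discussed next.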
The previous slide showed just one example, but we expect that programmers will organize their programs into libraries and composable modules that are likely to be reusable and that may be written in entirely different high-level languages. Programmers may use these libraries to write more complex packet processing programs by composing modules into a single packet processing pipeline. Programmers thus need mechanisms for compiling these modules to a single hardware target.
To understand the motivation for an intermediate representation, it helps to think about the history of programming languages and how compilers for them developed. Having multiple high-level languages can make it difficult to achieve direct compilation to a hardware target. For example, with languages like C, Java, and Python, compiler designers faced the issue of how to compile each language to different targets. Instead of compiling each language directly to each target, designers developed an intermediate representation that acted as a sweet spot, dividing the compiler's work into two phases: a front end and a back end.
An intermediate representation needs to be both language-independent and target-independent. It should be expressive enough to be produced by a language-specific front end, and functional enough to produce layouts for a diverse set of hardware targets. The IR should permit the compiler to optimize packet-processing pipelines for area, power, or latency using both target-specific and target-agnostic optimizations, and also to optimize the layout of the resulting packet-processing program.
We expect that programmers will organize their policies into libraries and modules, possibly written in different languages such as P4 or Protocol Oblivious Forwarding. These libraries might be accessible through public repositories like GitHub. A network programmer can then take these modules, compose them to write more complex policies, and install them on a data plane device.
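The front-end/back-end split described above can be sketched with a toy example. Everything here is made up for illustration: the two input "languages", the tuple-based IR, and the two output formats are hypothetical and far simpler than P4, POF, or any real target. What the sketch shows is that each language needs only one front end and each target only one back end, because both sides meet at the same IR.

```python
# Toy IR (an assumption for this sketch): a list of (OPCODE, operands) tuples.

def frontend_keyword(src):
    """Front end for a hypothetical keyword language, e.g. 'set ttl 64; fwd 1'."""
    ir = []
    for stmt in src.split(";"):
        if not stmt.strip():
            continue                       # tolerate empty statements
        op, *args = stmt.split()
        ir.append((op.upper(), tuple(int(a) if a.isdigit() else a for a in args)))
    return ir


def frontend_dict(prog):
    """Front end for a hypothetical dict-based language."""
    return [(stmt["op"].upper(), tuple(stmt["args"])) for stmt in prog]


def backend_text(ir):
    """Back end for a hypothetical target that takes one text command per line."""
    return "\n".join(" ".join([op, *map(str, args)]) for op, args in ir)


def backend_stages(ir):
    """Back end for a hypothetical target configured as numbered pipeline stages."""
    return [{"stage": i, "op": op, "args": list(args)} for i, (op, args) in enumerate(ir)]
```

With two front ends and two back ends there are four language-to-target paths but only four pieces of code to write; direct compilation would need one compiler per pair, which grows much faster as languages and targets are added.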
Unfortunately, installing such policies is not straightforward, and there are many challenges. For example, an access control list might operate only on the IP source and destination, in which case the rest of the fields are unused and shouldn't be part of the final policy. A naive compilation, however, will add these fields to the data plane. Thus we need mechanisms to ensure that such redundancy is removed from the final policy and that the policy is efficiently compiled to the underlying target.
NetASM is an intermediate representation that acts as a narrow waist between the languages that are beginning to emerge, such as P4, Click, and Concurrent NetCore, and a growing diversity of targets for back ends. It enables a common platform for writing optimizations for programmable devices. It provides an abstract cost model and persistent state for storing information across packets. To map policies efficiently to different targets, NetASM also provides several modes of execution that can be applied together to implement complex execution paths through various devices. The language has 23 primitive instructions for implementing these network policies. NetASM thus acts as a common intermediate layer between the higher-level languages in which modules may be specified and the lower-level data plane targets.
Having this intermediate representation improves the quality of code through optimizations such as code motion and dead-code elimination. Using NetASM, a compiler can perform conventional data-flow and control-flow analyses, and it can use target-agnostic or target-specific cost models to apply such optimizations. The optimizations may be targeted to improve metrics including area, latency, and throughput. Future work on the compiler may include target-specific optimizations based on information or cost models that are specific to a particular target. Let's take a quick look at NetASM in action.
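Before the demonstration, the access control list example can be sketched in a few lines. This is a simplified stand-in for dead-code elimination, not NetASM's actual optimizer: the instruction format (tuples whose string operands name fields) and the helper names are assumptions made for this sketch.

```python
def used_fields(instructions):
    """Collect every field name referenced by any instruction.

    Assumption for this sketch: string operands are field names,
    anything else (ints here) is a constant.
    """
    used = set()
    for _opcode, *operands in instructions:
        used.update(o for o in operands if isinstance(o, str))
    return used


def prune_fields(fields, instructions):
    """Keep only the declared fields that the instructions actually touch."""
    used = used_fields(instructions)
    return {name: bits for name, bits in fields.items() if name in used}


# A naive compilation might declare every header field (name -> width in bits)...
declared = {"eth_dst": 48, "eth_src": 48, "ip_src": 32, "ip_dst": 32, "tcp_dport": 16}

# ...even though this hypothetical ACL only looks at the IP source and destination.
acl = [
    ("MATCH", "ip_src", 0x0A000001),
    ("MATCH", "ip_dst", 0x0A000002),
    ("DROP",),
]

pruned = prune_fields(declared, acl)
```

Here `pruned` retains only `ip_src` and `ip_dst`, so the unused Ethernet and TCP fields never reach the data plane.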
In this short demonstration I'll instantiate a NetASM data path on a POX switch with three virtual Ethernet interfaces. We're first going to use the setup script to set up three virtual Ethernet pairs. Let's now start POX in standalone switch mode, but specify a particular policy, and give the switch three ports according to the virtual interfaces that we just created. Effectively, we've started a POX switch running a NetASM data path that acts as a hub. Let's run tcpdump on one of the switch ports, and then let's try to run a ping command through the switch. At this point, because we haven't set up a network topology, the pings of course get no replies, but we can see the switch issuing ARP requests.
Here's the NetASM policy that we just ran. Almost every program in NetASM starts by importing the NetASM core module. The code shown here is a tuple consisting of fields and instructions. In this case, the list of instructions has a simple operation: XOR the input port bitmap with a bitmap of all 1s that is as wide as the port count, and store the result in the output port bitmap. Effectively, this operation tells the data plane to perform a flood, since the XOR clears the bit for the input port and sets the bit for every other port.
In summary, NetASM is a common intermediate representation for programmable data planes that enables a compiler to optimize a high-level packet processing program for a diversity of targets. It uses a target-independent machine model and cost semantics to optimize the program, which leads to better architectural realizations. Some future work includes completing the language specification for NetASM and exploring opportunities for optimizations that can be applied across different classes of network devices.
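As a footnote to the hub demonstration, the XOR flood operation can be sketched in plain Python. This is not NetASM syntax and does not use the NetASM core module; the function names are hypothetical, and the sketch assumes that bit i of a bitmap stands for port i and that the input bitmap has exactly one bit set.

```python
def flood_bitmap(in_port, port_count):
    """Return the output port bitmap for flooding a packet received on in_port."""
    in_bitmap = 1 << in_port             # only the arrival port's bit is set
    all_ports = (1 << port_count) - 1    # port_count bits, all 1s
    return in_bitmap ^ all_ports         # clears in_port, sets every other port


def ports_in(bitmap):
    """List the port numbers whose bits are set in a bitmap."""
    return [p for p in range(bitmap.bit_length()) if (bitmap >> p) & 1]
```

For the three-port switch in the demonstration, a packet arriving on port 0 gives `0b001 ^ 0b111 = 0b110`, so `ports_in(flood_bitmap(0, 3))` is `[1, 2]`: every port except the one the packet came in on.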