[MUSIC] Okay, so we've been talking a lot about CPUs and microcontrollers. There are other sorts of computational platforms that are used in the context of IoT, and today we're going to talk about some of these and give you some intuition about when to use these different platforms and in what contexts.

We're going to start out by talking about FPGAs. FPGAs are field-programmable gate arrays. They're a lot like ICs, like integrated circuits: they're little circuits that do things in hardware, but they're programmable circuits. So when I buy an FPGA, it's blank, and what I can do is download some hardware onto it. FPGAs consist of a bunch of programmable logic blocks, things like logic gates. So I can write little AND gates, XOR gates, NOT gates, and hook them together to do logic. I can do all sorts of complex combinational functions. And there are little interconnects in there, which I can turn on and off to hook the blocks together.

Many FPGAs can be reprogrammed after they're deployed in the field, and this enables really flexible, reconfigurable computing. So when NASA built the Mars rovers, they put them up on Mars and needed some hardware in there to do various sorts of things. They could have put regular hardware in there, but what if there was a bug? They can't go up to the Mars rovers and fix them. So what they did is they used FPGAs. FPGAs are great for this: you can deploy them in the field and reprogram them remotely, even though they're hardware. So you can fix vulnerabilities and bugs post-deployment.

FPGAs are very flexible, you can write whatever circuits you want, and this allows for parallel execution. Unlike CPUs, which traditionally have one thread or a small number of threads, the benefit of hardware is that it can be very parallel. You have a lot of pins to work with, you have a lot of data going through. So this can end up being more power-efficient than single-threaded microcontrollers for some applications, like machine learning, AI, and data processing, where you're working with really big data sets.

However, one challenge with FPGAs is that we don't really know how to build them very cheaply yet. The current technology targets the high-end market, so they're kind of expensive. If you want a good-sized FPGA, we're talking $500 or more. There are certain key players in this market, like Xilinx, Altera, and Intel, and these companies make FPGAs. Now, that said, some devices have really small FPGAs in them. You'll find microcontrollers with tiny FPGAs in them, and those are cheaper and made by more vendors. So those are FPGAs, a very powerful platform if you want reconfigurable circuits.
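Just to make that idea of hooking gates together a bit more concrete, here's a tiny sketch, written in C only for illustration, of a one-bit full adder built purely out of AND, XOR, and OR operations. This is the kind of combinational function you'd map onto an FPGA's logic blocks; on a real FPGA you'd describe it in a hardware description language rather than C, and the tools would turn it into lookup tables and interconnect.

```c
#include <stdio.h>
#include <stdint.h>

/* A 1-bit full adder built only from gate-level operations (AND, XOR, OR).
 * On an FPGA, each of these expressions would map onto logic blocks, and
 * the programmable interconnect would wire them together. */
static void full_adder(uint8_t a, uint8_t b, uint8_t carry_in,
                       uint8_t *sum, uint8_t *carry_out)
{
    uint8_t p  = a ^ b;                      /* XOR gate: partial sum   */
    *sum       = p ^ carry_in;               /* XOR gate: final sum bit */
    *carry_out = (a & b) | (p & carry_in);   /* AND + OR gates: carry   */
}

int main(void)
{
    uint8_t sum, carry;
    full_adder(1, 1, 0, &sum, &carry);   /* 1 + 1 + 0 = binary 10 */
    printf("sum=%d carry=%d\n", sum, carry);
    return 0;
}
```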
Another kind of technology I want to tell you about is ASICs. ASICs are application-specific integrated circuits. They're basically just ICs that you build when you can afford to make your own integrated circuit, when you have a specific application scenario you're targeting. With an ASIC, you take your design and burn it into the silicon. So you can have hundreds of millions of logic gates, and it can include not just logic gates but things like ROM and RAM, electrically erasable ROM, flash memory, and other sorts of building blocks besides just basic logic.

That said, the core logic is typically not field-programmable, and it's hard to update once you've done your design. To make these, you're doing the whole photolithography process, you're taping out your design. Sometimes it's possible to put a little bit of reconfigurable logic in them as well. ASICs, like FPGAs, are also used to implement things like CPUs and microcontrollers. They're also used in contexts where you want to do something very specific to a certain domain and you're a big company that can afford it. Cisco used to build ASICs to do things like IP forwarding. You have your router with a bunch of IP packets coming in; you could process them in software, but with an ASIC you can process them in hardware and get your packets through really fast. Routers are still built using ASICs today. The typical design is that you'll have an ASIC with an IP core for the CPU, maybe a digital signal processor, communication interfaces, memory, and so on. But because they're taped out, because you go through that whole masking process, you have a very high non-recurring engineering cost of millions of dollars. The first one you build costs millions; after that you can stamp them out cheaply. So ASICs are best for very large production volumes and for shared functionality, where you're building one chip that's useful for a lot of devices. ASICs are fabricated by a number of companies; some of the biggest players are TSMC, GlobalFoundries, Samsung, and Texas Instruments. These are big companies, because ASICs require a very expensive process to build.

In addition to ASICs, there are also GPUs. GPUs are graphics processing units, and these are devices that have been constructed for the purpose of processing graphics in hardware. They're designed to very rapidly manipulate and process images and video. The idea behind GPUs is the observation that if you have video or an image, it's a lot of pixels, and typically you're doing the same operation on all of those pixels. Maybe you're rotating your image, or applying some sort of special effect. Usually that means you're computing some function that's too complex to do with fixed circuits alone, so you want something computation-based, but the work is also very parallel: you're doing the same thing over a lot of little blocks in your image. So most GPUs are built using a lot of threads, like a very parallel CPU. You might have 500 or 1,000 threads or stream processors, all operating in parallel over your image.

GPUs are very commonly used in computers to drive displays. They've really taken off in mobile phones, because mobile phones have displays, and you rotate them and do all sorts of things on them all the time. They're used in game consoles, and they're used in self-driving cars, like Tesla's Autopilot. The car is driving around, and it has to quickly look at images and figure out whether that's a stop sign or a person. Drones flying around, automated surveillance, and so on. GPUs contain hardware implementations of important functions for graphics: things like texture mapping, rendering polygons, rotating or translating vertices, shading, and various sorts of special effects. GPUs are good at that, they have it built in.
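To see why images fit this model so well, here's a minimal sketch, again in C just to illustrate the pattern, of a per-pixel brightness adjustment. On a CPU this runs as one long loop; on a GPU, each iteration is independent, so the hardware can hand every pixel to a different thread or stream processor and run them all at once. The grayscale image layout here is a simplifying assumption, not any particular graphics API.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Brighten an 8-bit grayscale image by a fixed amount, clamping at 255.
 * Every pixel gets the same independent operation, which is exactly the
 * pattern a GPU spreads across hundreds or thousands of threads instead
 * of looping one pixel at a time like a CPU. */
static void brighten(uint8_t *pixels, size_t n, uint8_t amount)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t v = (uint16_t)pixels[i] + amount;
        pixels[i] = (v > 255) ? 255 : (uint8_t)v;
    }
}

int main(void)
{
    uint8_t image[4] = {10, 100, 200, 250};   /* a tiny stand-in "image" */
    brighten(image, 4, 60);
    for (int i = 0; i < 4; i++)
        printf("%d ", image[i]);
    printf("\n");
    return 0;
}
```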
GPUs have also been found to be good at non-graphics applications. People have discovered, especially over the past ten years, that you can use them for Bitcoin mining, matrix operations, audio processing, and even machine learning, because they're so parallel and so programmable: they have general computation capability in them. Many GPUs are built by NVIDIA, Intel, and AMD; these are the key players in this space.

Another type of technology I want to mention is the system on chip. A system on chip is not really a different technology from the others I've mentioned. At its core, a system on chip is a CPU or a microcontroller, but typically we take that CPU or microcontroller and embed other hardware functions around it, all on one chip, to save money. So it'll be a CPU or microcontroller, but it's got an AI logic unit to accelerate artificial intelligence, or something to accelerate machine vision, or maybe something to do wireless communications, or digital camera hardware, and so on. As you can imagine, these are really popular in mobile phones, because that's exactly what you want there: something that does general computation, plus a bunch of other stuff around it communicating over 4G wireless or whatever. So systems on chip, FPGAs, GPUs: these are a bunch of technologies that are out there right now and are very widely used in the context of IoT.

I also want to mention some emerging platforms that are on the horizon, though they're already being produced and used. The idea behind these is that we're building hardware platforms to accelerate the various sorts of computation that come up specifically in IoT. The first technology I want to mention is vision processing units. VPUs accelerate computer vision. With a GPU you're displaying information and processing images; with a vision processing unit you're doing computer vision, you're doing object recognition. So these are very useful for things like cars or facial recognition, accelerating that work in hardware. This turns out to massively improve your speed at image recognition while lowering your cost. There are a lot of platforms out there doing this, and they're being introduced to the market right now.

Another kind of programmable platform is the neural network processor. In a lot of AI or machine learning applications there are many algorithms out there, but one algorithm that's really winning right now is neural networks. Neural networks are an approach to artificial intelligence loosely modeled on networks of neurons: you've got a bunch of simple nodes connected together, and the learned information is stored in the weights of the connections between them; that's why it's called a neural network. Neural networks are very powerful. They're used for voice recognition, text processing, facial recognition, and many other applications. And so when companies really need to do this at scale, they could use CPUs, but instead what they're doing is building neural network processors, hardware platforms that vastly accelerate the speed with which we can do neural network analysis. These are being built by companies that do a lot of neural network processing, like Google and Amazon and so on.
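To make that concrete, the core operation these neural network processors accelerate is just a weighted sum followed by a simple nonlinearity, repeated for huge numbers of neurons. Here's a minimal sketch of a single neuron in C; the sigmoid activation and the tiny example values are illustrative assumptions, and a real accelerator does millions of these multiply-accumulate operations in parallel, often at reduced precision.

```c
#include <stdio.h>
#include <stddef.h>
#include <math.h>

/* One artificial "neuron": a weighted sum of its inputs plus a bias,
 * passed through an activation function (sigmoid here). A neural network
 * processor is hardware built to run enormous numbers of these
 * multiply-accumulate operations in parallel. */
static double neuron(const double *in, const double *w, size_t n, double bias)
{
    double sum = bias;
    for (size_t i = 0; i < n; i++)
        sum += in[i] * w[i];            /* multiply-accumulate */
    return 1.0 / (1.0 + exp(-sum));     /* sigmoid activation  */
}

int main(void)
{
    double inputs[3]  = {0.5, 0.2, 0.9};
    double weights[3] = {0.8, -0.4, 0.3};   /* the learned information */
    printf("output = %f\n", neuron(inputs, weights, 3, 0.1));
    return 0;
}
```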
Another platform I want to mention is machine learning processors. The recognition here is that neural networks are great for a lot of applications, but there are other algorithms in machine learning, and sometimes we need general platforms to do machine learning processing. So MLPs are platforms that provide general programmable platforms for machine learning, but with hardware support, so they're very fast and very low cost. ARM's Project Trillium and IBM's Power9 platform are examples of these.

Next I want to talk a little bit about how you actually do hardware. You can go out and design stuff on your own, but recently there's been this advent of open source hardware, and it's important to pay attention to it, because some of the stuff you want to do is really complex. I mean, how would you go about building something that could recognize human faces? That's a huge task. Or something that can do analysis of data sets at large scale. The problems you face in IoT are very challenging, and it can take a lot of work to deal with them. So don't do all of that yourself. Go out and use designs that are already out there. This can save you time and effort, so you don't need to build highly advanced, complex hardware yourself; that takes a lot of time and investment to create. Companies and individuals have realized that reuse cuts development time and creates better, more stable products. You can go online and download hardware at all sorts of different levels. For example, you can go to opencores.org and download free IP cores. You can download a whole CPU and load it onto your FPGA, your virtual hardware platform.

There are a few terms I want to mention in this context. The first one is cell libraries. When you go online and look at OpenCores, they're going to talk about cell libraries. Cell libraries are libraries not of entire circuit designs, but of pieces of circuits. So if you want an ALU or an audio processing unit or something like that, you can download cell libraries for these various things and use IDEs to plug them together. There's also the term IP cores. IP cores are cells or components or cores, primitives that are often purchased from a third party, so there's some sort of intellectual property associated with them. There are different kinds of IP cores. If they're provided in a hardware description language, a language like VHDL or Verilog that lets you describe hardware in text, they're called soft macros. If they're more detailed, with the actual fully routed design, all the components and their interconnections laid out, that's called a hard macro. Certain vendors provide only IP cores; ARM is an example of that. These kinds of vendors are called fabless manufacturers: they don't own a fabrication facility, they sell their IP cores directly.

Okay, so there are a lot of technologies here, and when you're designing something you need to decide what to use. Do you use a microcontroller or an FPGA? So I'm going to show you a table to give you some sense of what technology to use when. You use microcontrollers when you want to get really small, and you want to be really power-efficient or really low-cost, but you don't need much compute power.
So examples of platforms that use microcontrollers are things like the Apple Watch: you want to get really small and really power-efficient, but you don't need super complex computation. Whereas if you do, then use a CPU. CPUs are good for diverse applications where you need large memories or have large compute challenges. So these are things that are a little bit bigger, maybe like a Nest thermostat; the Amazon Echo Show uses a CPU as well.

FPGAs are used when you need parallelism, when you need circuits but you also want to be able to reprogram the logic in the field. And that happens all the time. If you build something and deploy it at scale, you deploy hundreds of thousands of them, and something goes wrong, are you going to have all your customers rip their thermostats off the wall and mail them back to you? No. If you have some sort of reprogrammable logic, you can update it in the field, you can do a firmware update over the Internet, and save a lot of money. So if you want to do things like that, you want to use FPGAs. FPGAs are more expensive per unit than ASICs if you ignore the non-recurring engineering costs, but they're reprogrammable. FPGAs are used a lot: in smart automotive applications, in aerospace applications, and also for prototyping ASICs. If you want to build an ASIC but you don't really know what you're doing yet, your pilot or prototype is going to use an FPGA so you can fix it if you need to.

ASICs, on the other hand, are for when you know what you're doing well enough, and you're big enough, that you can afford to pay millions of dollars to fabricate your own chip. So this is for high production volumes. ASICs are used all over the place; they're used to build integrated circuits of various types, and sensors.

GPUs are used when you have very data-intensive applications, or if you're facing a graphics-intensive application you may want to consider a GPU. They're also useful for machine learning, AI, or other situations that can be transformed into a highly parallel problem. These are used, for example, in Tesla Autopilot, which uses the NVIDIA platform. If you're curious, you can look at the example applications listed on the right, which give some specific examples of products that use these various platforms. There's also system on chip, which is used in the context of smartphones; system on chip is useful where you need general and application-specific compute in the same environment.

Here are a few more example products. CPUs are used in the upper set of products and microcontrollers in the lower set. You can see the names, and you can see ARM all over the place. The ARM Cortex-M4 is a CPU used in various products in the upper set, whereas the ARM Cortex-M0 is considered a microcontroller and is used in some of the products lower down.

Okay, so next I want to talk about memory and storage. When you're building something, sometimes you need to remember data that you collect. You've got a sensor, you collect readings for a while, and then you send the data over your network interface (there's a little sketch of that below). It turns out there are different kinds of memory you can use, and there's no universal memory, no memory technology that's best for all situations.
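As a small, simplified sketch of that sensor scenario, here's what buffering readings in RAM and flushing them over the network might look like. The read_sensor and send_over_network functions here are hypothetical stand-ins for whatever sensor driver and network stack your device actually has.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define BUF_SIZE 8   /* how many readings we hold in RAM before sending */

/* Hypothetical stand-ins for a real sensor driver and network stack. */
static uint16_t read_sensor(void) { return 42; }
static void send_over_network(const uint16_t *data, size_t count)
{
    (void)data;
    printf("sending %zu readings\n", count);
}

static uint16_t buffer[BUF_SIZE];   /* readings buffered in RAM */
static size_t   count = 0;

/* Called periodically, e.g. from a timer or the main loop. */
static void sample_once(void)
{
    buffer[count++] = read_sensor();
    if (count == BUF_SIZE) {        /* buffer full: flush over the radio */
        send_over_network(buffer, count);
        count = 0;
    }
}

int main(void)
{
    for (int i = 0; i < 20; i++)    /* simulate 20 sensor samples */
        sample_once();
    return 0;
}
```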
And so typically, what you want to do is think about what you need, choose the memory technology that's best, and sometimes even use a mix of different technologies. Memory technologies differ in a few ways. Some of them are easily writable: you can write to them a lot of times. Some of them are really hard to write more than once. If a technology is easily writable, it's called RAM. If it's hard to write to, it's called ROM. There are also technologies in the middle, where you can write to them some number of times before they wear out, like flash, for example. Memory technologies also differ in their volatility: some lose their data when they're powered off, which is called volatile, and some retain their contents when powered off, which is called non-volatile. They also differ in their performance characteristics: the number of times they can be erased before they wear out, their cost per byte, their speed (how fast you can read from or write to them), their density (how big they are per bit), how much heat they generate, and their energy efficiency. So I'm going to give you a list of different kinds of memory technologies, and you can refer to this table when you're making decisions about your devices and what you're creating.

The first type of technology I want to mention is SRAM, or static random-access memory. SRAM is used in your computers and laptops, right next to the CPU. It's very fast, so they put it on-die with the CPU, and it's used for things like caching. SRAM is volatile, so it loses its contents when it's powered off. It's writable: you can read and write to it as many times as you want, and it doesn't really wear out. It's fast, but it's also very expensive, and it's used in the context of CPU caches. SRAM can also be powered down when it's idle, so it's good for low-power situations as well.

Another very popular type of memory technology is DRAM, or dynamic random-access memory. DRAM is also used in your laptop; it's your big backing-store memory. When you buy your laptop and they say it has 16 gigabytes, they're talking about DRAM. DRAM is a different technology than SRAM. DRAM is built using capacitors, and the capacitors lose charge, so there are circuits in there that read the data out and refresh it over and over to keep the capacitors charged. Because of that, DRAM needs constant power to refresh it, and so it uses more power than SRAM. It is volatile, it's writable, and it's slower than SRAM because of the technology that's used, but it's much cheaper than SRAM as well. So typically the way we build IoT devices, or general computation, is that we'll have SRAM close to the computational unit and then DRAM farther out if we need a larger backing store.

Another type of technology is ROM, read-only memory, and one particular type of ROM is called a masked ROM. A masked ROM is an IC, an integrated circuit: you do the photolithography process and all that, but instead of writing gates, you're writing data, you're writing your program. So if you're building a device that's going to start up and read its boot loader, read some sort of program, you could put that in a masked ROM and it would be burned in there, always there. So it's not volatile, it won't lose its contents when it's powered off. It's also not writable. It can be very fast.
Because it's an IC, you can put it right together with the CPU, and it's very cheap after the huge non-recurring engineering cost that you pay initially. There's also another kind of ROM which is erasable. This is called EEPROM, or electrically erasable programmable ROM. The neat thing about this is that you can build a ROM and erase it if you want to: you send commands to certain pins to tell it to erase, and you can load a new ROM image onto it. Now, you can't keep doing that forever, because you'll wear out the little cells that are used to store the information. But that's great for a technology you're going to deploy in the field, where there might be bugs, so you do a firmware update to change the contents of that ROM. So EEPROM is commonly used for firmware; it's fast, but it's more expensive than a masked ROM per unit.

There's also flash memory. Flash memory is used in those little USB sticks you plug into your computer. It's not volatile, but it is writable, with a limited number of write cycles. You can't keep writing to it over and over a huge number of times; you can probably do it a few thousand times, but then bits start to wear out. That said, we're getting better at making flash memories, so over the years that should improve. Flash memory is fast and has a moderate price per byte. Flash memory is a type of non-volatile RAM, a class of technologies which are not volatile, they don't lose their contents, but which often wear out after a certain number of reads and writes. Another type of NVRAM technology is ferroelectric RAM. Ferroelectric RAM is the main technology alternative to flash memory that's entered the market.

Okay, so I've told you about a bunch of different technologies that are used for computation in the context of IoT; we talked about memory and compute. Next, I'm going to get into programming, and we're going to start talking about how to actually program these various devices.