So can you tell me a little bit about a project that you worked on? >> Yeah, I'd love to. So let me tell you a story from a project in computer architecture research. >> Okay. >> So are you familiar much with computer architecture research? >> So-so. >> Okay, so the high-level design of a processor focuses on performance, or reliability, or power. >> Okay. >> And everything will become more clear as we kind of walk through. So let me give you just a quick example to get started, and we'll keep going. >> Okay. >> So let's say you've got some program. Let's say I've got program A. And I want to run it. Well, I want to run it, likely, on a processor. >> Okay. >> Now, in the old days, and by old days, I mean 2004. [LAUGH] >> [LAUGH] >> You'd only have one choice when you ran the program. It could only run on the chip because the chip could only run one program at a time. >> Okay. >> But actually, that was before we started getting multi-core processors. >> Okay. >> And have you heard about those before? >> Yeah, all the laptops these days have quad core, all these other- >> Exactly! That's exactly what it is. So you have more than one core on a single processor. You've got your four cores here. And now you have choices. >> Right. >> So I could execute program A on core 0, or core 1, or core 2. Any of these cores can run program A. And there were a couple of research proposals at the time to kind of exploit this for various reasons. >> Okay. >> I'll give just a couple of concrete examples. The first was, if I scheduled program A to run on core 0, and I always did that, I always ran my programs on core 0. What happens is that this part of the chip will get really hot. >> Mm-hm. >> And this part of the chip will be cold. And a thermal difference in a chip is really bad. It actually causes the processor to die faster. >> Okay. >> So what you could do, a really simple solution would be just to migrate execution from core 0 to core 1, from core 1 to core 2, from core 2 to core 3.
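The wear-leveling idea described here can be sketched in a few lines. This is a hypothetical toy model, not any real scheduler: the round-robin policy and the "intervals run" proxy for heat are illustrative assumptions.

```python
# Hypothetical sketch: cycle a program across cores round-robin so no
# single core accumulates all the thermal load. The heat model here
# (intervals run on a core ~ heat) is a deliberately crude stand-in.

def schedule_round_robin(num_cores, num_intervals):
    """Return the core chosen for each scheduling interval,
    cycling 0 -> 1 -> ... -> num_cores-1 -> 0."""
    return [interval % num_cores for interval in range(num_intervals)]

def heat_per_core(schedule, num_cores):
    """Count how many intervals each core spent running the program,
    as a crude proxy for accumulated heat."""
    heat = [0] * num_cores
    for core in schedule:
        heat[core] += 1
    return heat

schedule = schedule_round_robin(4, 12)
print(schedule)                     # 0, 1, 2, 3, then back to 0, ...
print(heat_per_core(schedule, 4))   # [3, 3, 3, 3] -- evenly spread wear
```

Pinning everything to core 0 instead would give a heat profile like `[12, 0, 0, 0]`, exactly the hot-corner/cold-corner imbalance the speaker says shortens chip lifetime.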
And in doing so, then you kind of spread out the thermal wear. >> Right. >> So this was an approach to improve the reliability of the chip, but it relied on you jumping from core to core to core. >> Right. >> I'll give you one more example. Actually, part of my PhD thesis was trying to take program A and divide it up into different threads, so different sections of code, and execute those sections of code on different cores. In effect, I would speed up program A. But to do that, I'd have to migrate computation again from core to core to core. >> Okay. >> The details of which aren't too critical. It's just the fact that we have the same behavior of migrations, but now to improve program performance. >> Okay. >> So what sort of happened when we looked at this idea of migrating from core to core was we found that it actually causes these major performance drops. >> Mm-hm. >> So when I tried to go from core 0 to core 1, I would see a drop in performance by about a factor of 8 for the first 10 to 100 instructions. >> Okay. >> And then I'd see a factor of 2 drop, even out to the first 10,000 instructions. >> So we'd see these big drops in performance going from core to core. And the reason for that is that as you've been running program A on core 0, essentially, the longer you run on there, the more you learn about the program. So modern processors actually learn about what your program is doing in order to make it perform better. >> Okay. >> So an example would be it grabs the most pertinent data that your program wants and keeps it really close to the computation engines on the chip. >> Okay. >> And essentially, by doing that you speed up performance. >> Okay. >> So the longer you're there, the faster it runs. Does that make decent sense? >> Yeah. Yeah, and so then when you go somewhere new, then this one wouldn't have learned the same computation and the same processes. And so it's not going to be able to run as fast. >> Exactly! >> Okay.
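The slowdown curve the speaker describes can be written down directly. This is a minimal sketch using the round numbers from the story (roughly 8x slower for the first ~100 instructions after a migration, roughly 2x slower out to ~10,000); the step-function shape and exact cutoffs are simplifying assumptions, since real warm-up is gradual.

```python
# Hypothetical model of post-migration slowdown: the "cold" core has
# not yet learned the program (empty caches, untrained predictors), so
# early instructions cost more than on a "warm" core. Cutoffs are the
# illustrative round numbers quoted in the conversation.

def slowdown_factor(instructions_since_migration):
    """Relative cost of one instruction, versus a fully warm core."""
    if instructions_since_migration < 100:
        return 8.0      # cold phase: ~8x slower
    elif instructions_since_migration < 10_000:
        return 2.0      # warming up: ~2x slower
    return 1.0          # fully warm again

def cost_of_window(n_instructions):
    """Total cost (in warm-core instruction times) of the first
    n_instructions executed right after a migration."""
    return sum(slowdown_factor(i) for i in range(n_instructions))

print(slowdown_factor(50))       # 8.0 -> deep in the cold phase
print(slowdown_factor(5_000))    # 2.0 -> still warming up
print(slowdown_factor(50_000))   # 1.0 -> fully warm
print(cost_of_window(200))       # 1000.0, vs. 200.0 with no migration
```

Under this toy model, frequent migrations keep the program pinned in the expensive left end of the curve, which is why naive core-hopping hurts so much.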
>> That's exactly right. Yeah, so you jump from core to core, then you lose all that learning. >> Right, okay. >> So what I really liked about this project was how when you looked at the problem, the kind of obvious solution was actually completely wrong. So the obvious solution is you take all of your learning and you transfer it to the other core. >> Right. >> And the problem is there's just so much that the core has learned that you can't transfer all that information and still have it perform well. >> Okay. >> You'd end up taking so long to transfer the information that, again, it would cause even more disruptions, in terms of execution. And to be frank, it just was too much information. It wasn't all stuff you needed. >> I see. >> So what our research project became was basically designing this little piece of hardware on the core, which would keep track of the most relevant parts of what you've learned. >> Cool. >> And then at the time of migration, you just essentially transfer that little bit that really mattered about your learning to get this other core started faster. >> Okay. >> And there's a whole bunch of work in figuring out which pieces of that learning to keep. >> Yeah. >> But if you got it right, you're looking at about a factor of two in performance gain. >> Wow! >> At the point of migration. So I do want to mention this was joint work. I did this with a couple of other researchers. My main component was looking at that thesis work, which was dividing up the program and running it on these cores. And my colleagues helped solve this problem as well. >> Wow, that sounds great! Thank you.
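The core idea of that hardware, keeping only the most relevant slice of learned state and shipping just that at migration time, can be sketched abstractly. Everything here is a hypothetical illustration: the access-count heuristic, the `budget` parameter, and the use of addresses as stand-ins for learned state are my assumptions, not the actual mechanism from the research.

```python
# Hypothetical sketch: rather than transferring everything a core has
# learned (too much data, too slow), keep a small summary of the most
# heavily used entries and transfer only those. The "most frequently
# accessed" heuristic and the fixed budget are illustrative choices.

from collections import Counter

def summarize_learning(accesses, budget):
    """Pick the `budget` most frequently used addresses -- the small
    slice of learned state worth sending to the destination core."""
    counts = Counter(accesses)
    return [addr for addr, _ in counts.most_common(budget)]

# A toy access trace: address "a" is clearly the hot one.
accesses = ["a", "b", "a", "c", "a", "b", "d", "e", "a"]
print(summarize_learning(accesses, 2))  # ['a', 'b'] -- the hot entries
```

The destination core would use this short list to pre-warm itself, so execution lands closer to the warm end of the slowdown curve instead of starting completely cold.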