0:04
Okay. So, now we're going to move off of the machine model and talk about other aspects of instruction set architectures.
To talk about what else is in instruction set architectures, well, there's the fundamental machine model: how many registers you have, and what type of register access you have. Do you have stack-based? Do you have accumulator? Do you have a register-register, or a register-memory architecture?
Also, you need to talk about the fundamental operations, the fundamental instructions, that you have.
So, let's look at classes of instructions. We start off with things like data transfer instructions. So, loads, stores, moves to control registers. This is what MIPS has.
And, in this course, we're going to be relying a lot on MIPS, the MIPS instruction set architecture, for our example cases.
But you have load, store, move to and move from control registers, with different control registers.
You have arithmetic logic unit instructions. So, things like adding, subtracting, ANDing, ORing, multiplication, division.
This is an interesting one here, Set Less Than; that's kind of a fun one. It's a comparison operator.
So, if you want to take two values and compare them to see which one's less than the other, you can use Set Less Than.
Load Upper Immediate is moving an immediate value into the upper part of a register, sort of a shift operation.
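As a rough sketch (this is an illustration, not something from the lecture), the semantics of those two MIPS instructions look roughly like this in C, with rs, rt, and imm16 standing in for the register and immediate operands:

    #include <stdint.h>

    /* Rough C sketch (not lecture material) of what these two MIPS
     * instructions compute; rs, rt, and imm16 stand in for the operands. */
    int32_t slt_result(int32_t rs, int32_t rt)   /* slt rd, rs, rt */
    {
        return (rs < rt) ? 1 : 0;
    }

    int32_t lui_result(uint16_t imm16)           /* lui rd, imm16 */
    {
        return (int32_t)((uint32_t)imm16 << 16); /* immediate lands in the upper 16 bits */
    }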
1:38
You can have control flow instructions. So, you can do branches, jumps, traps.
And one of the points I want to get across here is that between different instruction set architectures, people make different choices about which instructions to have. Some of the architectures have very complex ones, and some of the architectures have very simple ones.
You have floating point instructions: adding floating point numbers, multiplying floating point numbers, subtracting floating point numbers.
This here is a compare operation on floating point numbers. So, compare less than for doubles, or double precision floating point.
Here, we have conversion operations. So, conversion from a single precision floating point number to an integer word. These are the MIPS instructions.
You can have multimedia instructions, or what's called single instruction, multiple data. And we'll be talking about these a bunch in this course, later, when we get to data parallelism and vector units.
And, this is actually an example out of x86.
I wanted to give an example of the stranger operations that sometimes show up as fundamental operations, or fundamental instructions, in instruction set architectures. This is an example called REP MOVSB.
That's not two instructions; that's one instruction with a prefix and a space in between.
Yup. This is actually valid Intel assembly code.
And what is REP MOVSB? Well, REP MOVSB is a string operation that will actually copy one string into another string.
So, if you have some text and you want to copy it to another piece of text, you can do REP MOVSB, set up a count, and it will actually do the copy.
This is the moral equivalent of something like a string copy. So, we can do that all in one instruction.
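As a hedged sketch of what that means (this is a model, not lecture code), REP MOVSB behaves roughly like the following C loop: the count register says how many bytes to move, and each repetition copies one byte from the source pointer to the destination pointer.

    #include <stddef.h>

    /* Rough C model of REP MOVSB: copy `count` bytes from src to dst,
     * one byte per repetition of the MOVSB step. */
    void rep_movsb_model(unsigned char *dst, const unsigned char *src, size_t count)
    {
        while (count--) {
            *dst++ = *src++;
        }
    }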
So, in addition to these complex string operations, things like REP MOVSB, there were sort of old jokes about having more and more instructions, and having really complex instructions.
So, for instance, in the VAX architecture, they had instructions that could do very complex things. I think there was one that even did a Fast Fourier transform in one instruction. That's a whole Fast Fourier transform, across a huge data set, in one instruction.
So, you can see that there's a lot of choice in your classes of instructions, and the ISA architect, the instruction set architect, has to sit down and think about what should be in an instruction set versus what should be left out of an instruction set.
Another characteristic of instruction set architectures that the architect needs to think about is how you go and access memory, and what the different addressing modes are that can be used. Or, how do you get operands from memory?
So, looking at one example here, we have a register-based addressing mode. In a register-based addressing mode, we can name two registers, operate on them, and put the result in another register.
5:02
And this is a three-operand format here; x86 will have only two. But you name, say, register three and register two, add them together, and put the result into register four, for instance.
And one of the interesting things here is that this may not actually access any memory. We call these addressing modes, but this one may not actually access memory.
If you have enough register space, and your implementation, your microarchitecture, actually implements all the registers, then it won't go access memory. But it might access memory.
So, for instance, there are machines out there where you have a register-register-register operation, or register-register-register instruction, but the processor has no register file. Everything is out in main memory.
So, it has to go read the data from main memory to actually do the operation, and it just sort of caches, or keeps around, the two operands that are needed.
And that's all at the microarchitectural level. What we're talking about here is the big-A Architecture level, asking: what are the fundamental memory operations that can be done?
So, that's a register-based addressing mode.
We can also have an immediate-based addressing mode. So, here we have something like a constant of five being added to a register and put into another register. Here's our assembly code for that.
You can have displacement-based addressing.
In displacement addressing, we're going to take a register value, add it to some constant, take that and look up that location in main memory, and do some operation with, let's say, another register. It's called displacement-based because you take a register and have some displacement off of it.
You can have register indirect. This is pretty common; and actually, if you go look at something like the Itanium instruction set, they don't have displacement addressing, they only have register indirect.
So, this is similar to displacement, but you can't have a displacement. You can only go and read from a particular memory address that's stored in the register.
You can have absolute addressing. This is actually not very common on most modern-day architectures, but on older machines this was common. You take a constant, not out of a register, go look that address up in memory, and then do some operation with that.
You can have memory indirect, and this is a kind of interesting one to denote here. MIPS very much does not have this. But you could do a memory operation on a memory operation on a register.
So, what you'd have is, in a register, you'd have an address. You would take that address, look it up in main memory, and get the data, and that itself is an address. Then you'd look up in main memory again with it.
So, it's sort of a doubly indirect, register-based addressing mode, and that gets pretty fancy. If you look at something like VAX, they definitely had this.
You can have PC-relative, or program counter relative, or instruction pointer relative, addressing.
So, you can take the program counter, add some displacement, and then index memory.
This is very useful for position-independent code, or code where you don't know where it's going to be loaded. If you want to go access some data close to where the code is, you don't know exactly where the code is loaded; but because you know what instruction you're executing, you can basically index off the program counter and find memory around where you are, around where you're loaded in main memory. So, this is used for PIC code.
You can also have scaled addressing. This is something that x86 has, where you can actually take a register and add it to another register multiplied by something else.
In x86, this is called SIB: scale, index, and base mode. So, you can take a displacement, add it to some registers, and multiply.
And this is very useful if you're trying to index through an array of some size. So, if you have an array of four-byte words, you can just keep ticking up this counter here. You start off zero, one, two, three, and as this ticks up, instead of going up by a byte, you go up by four bytes at a time.
And if the data you're trying to load is four bytes long, you'll actually be able to pick up the exact elements in the array you want, versus having to do this multiplication someplace else.
Usually, these scaled operations, or scaled memory addressing modes, have a very limited sort of multiplication. You can't multiply by, let's say, seven. Usually, it's multiplication by powers of two, or a small set of powers of two, because that's easy. That's just a shift operation in base two.
10:13
And then, you can think about data types and their sizes.
So, what do I mean by data types? Well, you could have binary integer. You can think about having different types of integer data: unary encoded, binary encoded, things that are represented in different ways.
So, for instance, as you probably learned about in your computer organization class, there's one's complement versus two's complement arithmetic, and those are different data types.
So, you have binary integer data, and saying whether it's one's complement versus two's complement is pretty important.
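As a small sketch (not from the lecture), here's how -5 comes out in eight bits under the two encodings:

    #include <stdint.h>
    #include <stdio.h>

    /* Small sketch contrasting the two encodings of -5 in eight bits. */
    int main(void)
    {
        uint8_t ones = (uint8_t)~5u;          /* one's complement: flip every bit   -> 0xFA */
        uint8_t twos = (uint8_t)(~5u + 1u);   /* two's complement: flip bits, add 1 -> 0xFB */
        printf("one's complement -5: 0x%02X\n", ones);
        printf("two's complement -5: 0x%02X\n", twos);
        return 0;
    }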
You can have binary coded decimal. This is where each decimal digit is encoded with four bits, and the decimal point, if you will, is also encoded in there, between your fraction and the integer portion, the natural number portion.
11:31
So, binary coded decimal can give you very exact calculations for things like spreadsheets and business calculations.
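A minimal sketch of packed BCD (an illustration, not lecture code): each decimal digit takes one four-bit nibble, so the value 42 is stored as the byte 0x42.

    #include <stdint.h>

    /* Packed BCD sketch: one decimal digit per 4-bit nibble, for values 0..99. */
    uint8_t to_packed_bcd(unsigned value)        /* 42   -> 0x42 */
    {
        return (uint8_t)(((value / 10) << 4) | (value % 10));
    }

    unsigned from_packed_bcd(uint8_t bcd)        /* 0x42 -> 42   */
    {
        return (unsigned)(bcd >> 4) * 10u + (bcd & 0x0Fu);
    }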
You can have floating point types. And there's actually a lot of different floating point types here. There's a standardization now that's called IEEE 754, which is what's used in most modern computers.
And this is different than the Cray floating point on Cray supercomputers. They had a much wider floating point format, and they also had a different number of bits given to the mantissa versus the exponent. By doing this, the precision can be different in different ways. So, for instance, you can have a bigger range of numbers but smaller precision, or a smaller range of numbers with bigger precision, and there are different trade-offs there.
Also, Intel internally, at least in x87, had this thing they called Intel extended precision, which is 80 bits long. In IEEE 754, the biggest thing defined is a 64-bit double. But if you want even more precision in your floating point numbers, you might need 80 bits.
You could have packed vector data. This is like MMX data, where you're trying to pack the data all together and operate on it at the same time.
So, typically, with things like MMX, you need to bring the data into a packed data type, and then operate on that whole data type, which has different values in it.
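Here is a rough sketch of the packed idea in plain C (not actual MMX code): four 16-bit values share one 64-bit word, and a single operation updates all four lanes. Real SIMD hardware does the lanes in parallel; the loop just models the semantics.

    #include <stdint.h>

    /* Sketch of MMX-style packed data: four 16-bit lanes in one 64-bit value,
     * added lane by lane.  Hardware does this in parallel; the loop only
     * models the semantics. */
    uint64_t packed_add_16x4(uint64_t a, uint64_t b)
    {
        uint64_t result = 0;
        for (int lane = 0; lane < 4; lane++) {
            uint16_t x = (uint16_t)(a >> (16 * lane));
            uint16_t y = (uint16_t)(b >> (16 * lane));
            result |= (uint64_t)(uint16_t)(x + y) << (16 * lane);
        }
        return result;
    }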
And some architectures even have a special data type called addresses, which is different than a binary integer. So, some older computers actually had address registers, and the address data type was different than the data type, the binary integer type. And that was different than the floating point data type, and there were different registers and different register names for that.
And what was nice about that is, they knew that if you loaded something into the address registers, it was definitely an address. So, it had type information, and that's separate from the width.
So, let's say you have binary integer. Well, people have built machines which have eight bits, sixteen bits, 32 bits, 64 bits, all these different things as sort of the default word size.
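As a small sketch, those default word sizes correspond directly to the fixed-width integer types a C programmer uses (the variable names are just illustrative):

    #include <stdint.h>

    /* Sketch: the binary-integer widths mentioned above, as C types. */
    int8_t  byte_val  = -8;    /*  8-bit integer                                      */
    int16_t half_val  = -16;   /* 16-bit integer                                      */
    int32_t word_val  = -32;   /* 32-bit integer: the default word size on MIPS       */
    int64_t dword_val = -64;   /* 64-bit integer: the default on something like Alpha */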
And then, finally, one of the important things you need to do is come up with the encoding of the different instructions. And there's been a lot of debate on this, of whether you should have fixed-width versus variable-width instructions.
So, let's look at a couple of different ISAs and see what camp they fall into. Most RISC architectures are fixed width. So, you have MIPS, PowerPC, SPARC, and ARM falling into this category. As an example, in MIPS, which we're going to be talking a lot about in this course, every instruction is exactly four bytes long. What's nice about this is it's easy to decode, but it may not be very compact.
On the other side of this question about ISA encoding, you can have variable-length instructions, where the width of the instruction can vary widely. What's nice about this is you can have the things that are very common take up a very small amount of space. So, if you have an instruction which is, like, one byte long and it's used all the time, you can effectively do a manual Huffman encoding on your instruction set: you take the most common things and put them in the smallest amount of data.
But, if you have something that's very uncommon, you can have it take a lot of bytes. So, as an example here, on x86 an instruction can be between one and seventeen bytes. I think this has actually been updated now; if you look at x86-64, it can be between one and eighteen bytes. And it can be anything in between: one, two, three, four, all the way up to eighteen.
And the CISC architectures, the complex instruction set architectures, things like the IBM 360, x86, Motorola 68k, and VAX, these were all variable-length instruction encoding architectures.
And now we start to get into something which is a little fuzzier; there are things that sort of start to cross over. People started to build mostly fixed, or compressed, instruction set architectures. So, an example of this is something like MIPS16, which is effectively a MIPS instruction set where there are both 32-bit, or four-byte, instructions and sixteen-bit, or two-byte, compressed instructions.
16:39
So, this is like a mostly fixed architecture with sort of two different instruction sizes.
If you look at something like PowerPC, and some VLIWs, they actually have a compressed format where they will store the instructions compressed and decompress them when they end up in main memory, or end up in the caches, at least. So, you can think of some architectures where the code in main memory is small, but then when it gets to the cache, or when it comes out to the main processor, it gets expanded.
And then, there are long instruction words, where you can explicitly name multiple instructions happening at the same time. Or even very long instruction words, what are called VLIWs, which we'll be studying a bunch in this course, where you can put multiple instructions in a fixed-width bundle.
So, some good examples here are Multiflow, and the LX architecture from HP and STMicro, which shows up in printers today, mostly. TI DSPs are actually VLIW architectures, and there are a couple of other good examples.
So, just to show something complex here, of how you can end up with one to eighteen bytes, we have x86's instruction format. Fundamentally, you need an opcode, a byte's worth of opcode, but some instructions might have between one and three bytes of opcode here. And then there's special information about different addressing modes, displacements, and immediates for the different addressing modes, and those all take up more space.
They can also have prefixes, so the REP in REP MOVSB is actually a prefix, which says: repeat this operation multiple times.
You can encode all these things in a variable-width instruction format like x86. And, to give you the contrast, on something like MIPS, every instruction is exactly four bytes long and they have to fit everything into that.
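As a sketch (the field layout is the standard MIPS R-type format, but this decoder itself is illustrative, not lecture code), here is how one fixed 32-bit MIPS instruction word splits into its fields:

    #include <stdint.h>

    /* Sketch: pull the fields out of one 32-bit MIPS R-type instruction word:
     * opcode, two source registers, destination register, shift amount, and
     * function code. */
    struct rtype {
        unsigned opcode, rs, rt, rd, shamt, funct;
    };

    struct rtype decode_rtype(uint32_t insn)
    {
        struct rtype f;
        f.opcode = (insn >> 26) & 0x3F;   /* bits 31..26 */
        f.rs     = (insn >> 21) & 0x1F;   /* bits 25..21 */
        f.rt     = (insn >> 16) & 0x1F;   /* bits 20..16 */
        f.rd     = (insn >> 11) & 0x1F;   /* bits 15..11 */
        f.shamt  = (insn >>  6) & 0x1F;   /* bits 10..6  */
        f.funct  = insn & 0x3F;           /* bits 5..0   */
        return f;
    }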
So, an ISA architect, or instruction set architecture architect, has to decide the layout of the bits within the instructions, and that's usually something that is defined in the instruction set architecture.
So, to sum up some real-world instruction sets and where they fall with different numbers of operands, numbers of memory operands, data sizes, and registers, let's walk through a couple of different instruction set architectures. You've probably heard of these in passing, but you may not have actually used any of these machines. That's because some of them are embedded, or some of them aren't commonly used anymore. But they're good to know about. So, let's start off with Alpha.
Alpha was built by Digital Equipment Corporation, and it's a register-register architecture with three named operands. There are no explicit memory operands in the instruction set, and it's got 64 bits as the default data type.
When Alpha originally came out, you could actually only do 64-bit operations with it. That would sort of change later, as they figured out that might not have been the best idea.
It has 64-bit addressing, and it was mostly designed for workstations. So, big addresses, fast computers. Then we can see something like ARM.
ARM is used in my cell phone. It's an architecture that there are a lot of different implementations of, and they've licensed it to lots of different people, but it's also register-register-register, three operands.
There's a 32-bit data size, and now a 64-bit data size that has just come out. It has sixteen registers, and for the addressing, as I said, a 64-bit version came out, but it's mostly 32-bit. And it shows up in cell phones and embedded applications.
MIPS is an outgrowth of the Stanford MIPS project and was later commercialized. Register-register-register; we're going to be focusing on this mostly in this class. It's sort of similar, showing up in workstations and embedded.
SPARC is another instruction set. This is what Sun originally used, or used to use. It was an outgrowth of the RISC I and RISC II sort of architectures.
This one's interesting: it has between 24 and 32 registers, depending on how you look at it. They have this interesting idea where, as you do function calls, data gets spilled out into main memory and gets pulled back in from main memory, kind of like a stack. So, it's sort of a mixture between a stack and a register architecture.
Most of these were workstations. You can see the TI C6000, which is more for DSPs.
But then we're going to start to see some more interesting stuff down here. Let's take a look at VAX. VAX is a memory-memory architecture that has, or could have, up to three named operands, and all three of those can come from main memory.
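A minimal sketch (not from the lecture) of what that difference means: the same source-level add maps onto one memory-memory instruction on a machine in the VAX style, but onto a load/load/add/store sequence on a load-store machine like MIPS.

    #include <stdint.h>

    /* Sketch: the same add, with comments showing the rough mapping onto a
     * memory-memory machine versus a load-store machine. */
    void add_in_memory(int32_t *dst, const int32_t *a, const int32_t *b)
    {
        *dst = *a + *b;   /* VAX-style:  one add with three memory operands         */
                          /* MIPS-style: lw, lw, add, sw -- everything goes through */
                          /*             registers, four separate instructions      */
    }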
23:20
And then, also, instruction sets are many times influenced by their applications. A good example of this is, if you're building a signal processing architecture, or a digital signal processor, a DSP, you might want to add DSP instructions.
And then, finally, I want to talk about how technology from software has influenced instruction set architecture over time.
So, if we look at something like the SPARC architecture, it has what's called the register window. In the register window, what happens is, whenever you do a function call, it'll actually take eight registers and put them into memory, and then you get eight new registers. When you do a return, it takes eight registers from main memory and puts them back into your register file, and sort of swaps out the ones that were there before.
And what this was, was that at the time SPARC was made, compilers didn't know how to do register allocation.
Back then, it was viewed as, like, an open problem. Since that time, register allocation,
figuring out how to take a fixed number of registers and move data in from a stack in main memory and vice versa, can be orchestrated very effectively and very efficiently by the compiler.
But at the time, compilers were very simple. People didn't know how to do that, so they needed hardware help to do it. So, the instruction set architecture has that baked into it.
But now that we have effective register allocation, we've not seen any other register-windowed architectures come along after that. If you talk to anyone who's actually gone and implemented a SPARC instruction set architecture, a microarchitecture, they basically hate register windows; it's like the bane of that architecture. But at the time, compiler technology was not good enough.
So, applications influence it, compiler technology influences your instruction set architecture, technology influences your ISA, and ISAs have evolved over time.
Even though, as we said originally, a lot of times people want to build ISAs that don't change, so you can keep running software and have binary compatibility, at some point it might make sense to actually break compatibility and re-optimize your instruction set architecture.