0:19

So last time we ran into this really interesting problem that computing runtimes is hard, in that if you really, really want to know how long a particular program will take to run on a particular computer, it's a huge mess. It depends on knowing all kinds of fine details about how the program works, and all kinds of fine details about how the computer works: how fast it is, what kind of system architecture it is. It's a huge mess. And we don't want to go through this huge mess every single time we try to analyze an algorithm. So we need something that's maybe a little bit less precise but much easier to work with, and we're going to talk about the basic idea behind that.

1:01

And the basic idea is the following: there are lots of factors that have an effect on the final runtime, but most of them will only change the runtime by a constant. If you're running on a computer that's a hundred times faster, it will take one one-hundredth of the time, a constant multiple. If your system architecture has multiplications that take three times as long as additions, then if your program is heavy on multiplications instead of additions, it might take three times as long, but it's only a factor of three. If your memory hierarchy is arranged in a different way, you might have to do disk lookups instead of RAM lookups. And those will be a lot slower, but only by a constant multiple.

1:47

So the key idea is: if we come up with a measure of runtime complexity that ignores all of these constant multiples, where running in time n and running in time 100n are considered to be the same thing, then we don't have to worry about all of these little bitty details that affect runtime.

2:08

Of course there's a problem with this idea if you look at it by itself: runtimes of one second, one hour, and one year only differ by constant multiples. A year is just something like 30 million seconds. And so, if you don't care about factors of 30 million, you can't tell the difference between a runtime of a second and a runtime of a year. How do we get around this problem?

2:35

Well, there's a somewhat strange solution to this. We're not going to actually consider the runtimes of our programs on any particular input. We're going to look at what are known as asymptotic runtimes. These ask: how does the runtime scale with input size? As the input size n gets larger, does the runtime scale proportionally to n, or maybe to n squared? Is it exponential in n? All these things are different. And in fact they're so different that, as long as n is sufficiently large, the difference between an n runtime and an n squared runtime is going to be worse than any constant multiple.

3:18

If you've got a constant multiple of 1000, then 1000n might look pretty bad with that big number in front. But when n becomes big, it's still better than n squared.
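To see that crossover concretely, here's a small sketch (the function name `crossover` is my own, not from the lecture) that finds the point where n squared overtakes c times n:

```python
def crossover(c):
    """Smallest n at which n*n exceeds c*n, i.e. where a quadratic
    runtime overtakes a linear runtime with constant factor c."""
    n = 1
    while n * n <= c * n:
        n += 1
    return n

# Even with a constant factor of 1000, the quadratic loses past n = 1000.
print(crossover(1000))  # -> 1001
```

Past that point the gap only widens: at n = 1,000,000, 1000n is a thousand times smaller than n squared.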

3:31

And so, by sort of only caring about what happens in this sort of long

Â scale behavior, we will be able to do this without seeing these constants,

Â without having to care about these details.

3:42

And in fact, this asymptotic, large-scale behavior is actually what you care about a lot of the time, because you really want to know: what happens when I run my program on very large inputs?

3:54

And these different sorts of scaling do make a very large difference there. So suppose that we have an algorithm whose runtime is roughly proportional to n, and we want to run it on a machine that runs at about a gigahertz. How large an input can we handle such that we'll finish the computation in a second? With a runtime proportional to n, inputs of size about a billion, since a gigahertz machine performs roughly a billion operations per second.

4:24

If instead of n it's n log n, it's a little bit slower: you can only handle inputs of size about 30 million. If it runs like n squared, it's a lot worse: you can only handle inputs of size about 30,000 before it starts taking more than a second.

4:38

If the inputs are of size 2 to the n,

Â it's incredibly bad, you can only handle inputs of size about 30 in a second.

Â Inputs of size 50 already take two weeks, inputs of size 100

Â you'll never ever finish.

Â And so the difference between n and n squared and

Â 2 to the n is actually really, really significant.

Â It's often more significant than these factors of 5 or

Â 100 that you're seeing from other things.

5:07

Now, just to give you another feel for how these different types of runtimes behave, let's look at some common times that you might see. There's log n, which is much smaller than root n, which is much smaller than n, which is much smaller than n log n, which is much smaller than n squared, which is much smaller than 2 to the n.

So, if we graph all of these, you can see that the curves separate out from each other. If you just look at them at small inputs, it's maybe a little bit hard to tell which ones are bigger; there's a bit of jostling around between them. But if we extend the graph outwards a bit, it becomes much more clear. After n is about 4, 2 to the n really takes off; it just shoots up, and by the time n is 20 or 30 it leaves everyone else in the dust. N squared also keeps a pretty sizable lead over everyone else below it. N log n and n are likewise pretty well separated from the others. In this graph, root n and log n seem to be roughly equal to each other, but if you kept extending, if you let n get larger and larger, they'd very quickly differentiate themselves.

The square root of 1 million is about 1,000. The log of 1 million is about 20.
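Those two numbers are easy to check (using base-2 logarithms, an assumption; any other base only changes the log by a constant factor):

```python
import math

print(math.isqrt(1_000_000))        # square root of a million -> 1000
print(round(math.log2(1_000_000)))  # log base 2 of a million  -> 20

# The gap between root n and log n keeps widening as n grows.
for n in [10**6, 10**9, 10**12]:
    print(n, math.isqrt(n), round(math.log2(n)))
```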

And so, the further out you go, the more separated these things become from each other, and that's really the key idea behind asymptotics. We don't care so much about the constants; we care about how things scale as your inputs get very large.
