So now, let's talk about the final and most serious problem, which is that stuff

in the Internet, nodes, links, etc., is failing all the time.

You know, it's true that we just argued that the asynchronous Bellman-Ford

shortest-path routing protocol is guaranteed to converge to correct

shortest paths. But that was assuming that the network was

static, that no links were going up and down.

So one particularly simple problem in the presence of failures is what's known as

the counting to infinity problem. So I'm going to show you a particularly

contrived version of this problem just to kind of illustrate the form in the

simplest way, but it's quite easy to come up with more realistic examples.

And this problem is not just some kind of theoretical flaw in the distributed

Bellman-Ford algorithm. This problem really was observed in

practice with routing protocols deployed in the 1970s.

So imagine we just have a super simple network: vertices s, v, and t, with a directed

arc from s to v and an arc from v to t. Everything has unit cost, and we're

computing routes to the destination t. You might, if you want a more realistic

example, think about the arcs between s and v as standing in for longer paths

that have multiple hops. In any case, let's imagine that we've

successfully computed shortest paths in this basic network, so that the distance from

t to itself is zero, from v to t is one, and from s to t is two.
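To make this concrete, here is a minimal sketch, not from the lecture, of this three-node network and a simple Bellman-Ford relaxation loop that converges to exactly those distance estimates. The dictionary names and loop structure are my own illustration, not part of any real routing protocol implementation.

```python
# Hypothetical sketch: the three-node network s -> v -> t with unit-cost arcs,
# relaxed Bellman-Ford-style until the distance estimates to t converge.
INF = float("inf")

# arcs[u] lists (neighbor, cost) pairs for u's outgoing links
arcs = {"s": [("v", 1)], "v": [("t", 1)], "t": []}

# distance estimates to the destination t; only t knows its distance at first
dist = {"s": INF, "v": INF, "t": 0}

changed = True
while changed:
    changed = False
    for u, links in arcs.items():
        for w, cost in links:
            # relax: can u improve its estimate by routing through neighbor w?
            if cost + dist[w] < dist[u]:
                dist[u] = cost + dist[w]
                changed = True

print(dist)  # {'s': 2, 'v': 1, 't': 0}
```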

Now, again, the issue is that links are failing all the time on the internet.

So, at some point, this link from V to T might go down.

So, at some point, V is going to notice that its link to T has failed and it's

going to have to reset its shortest path estimate to T to be plus infinity.

And in an effort to recover connectivity to t, it makes sense for v to ask its

neighbors, do they have paths to t? And if so, how long are they?

So in particular, v will poll s, and s will say, oh yeah, I have a path to t of

distance only two. And v says, oh well, that's great.

Currently, my estimate to t is plus infinity.

I can get to s with length one, s says it can get back to t with length two.

So that gives me a path of length three to t.

Now, of course the flaw in this reasoning is that s was depending on v to get to t

and now all of a sudden, v circularly is going to use s in its mind to get all the

way back to t. So this already illustrates the dangers

of naively implementing the asynchronous Bellman Ford algorithm in the presence of

link failures. Where does the name counting to infinity

come from? Well, you can imagine that for some

implementations of this protocol, s would then say, oh boy, okay, v just revised

its shortest-path estimate to t to be three.

My next hop was to v. So if v takes two longer to get to t, then

I'm going to take two longer to get to t as well.

So at that point s updates its shortest-path distance to be four.

And now this keeps going on. At this point v says, oh well, if s is

going to take two longer to get to t, and I was depending on s,

then I have to go up by two. And then s goes up by two, and v goes up

by two, and so on, forevermore. So failures cause problems for the

convergence of distributed shortest-path routing protocols.
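The loop just described can be sketched in a few lines, assuming (my own simplification) that v and s exchange estimates in strict alternating rounds. Each exchange bumps both estimates by two, so the values grow without bound:

```python
# Hypothetical sketch of the counting-to-infinity loop after the v -> t
# link fails: v and s each treat the other as their next hop toward t,
# so every exchange of estimates bumps both distances by 2, forever.
INF = float("inf")

dist = {"s": 2, "v": 1}  # converged estimates to t before the failure
dist["v"] = INF          # v detects that its link to t is down

# v polls s, which still advertises distance 2, so v sets 1 + 2 = 3;
# then s revises to 1 + 3 = 4; and so on, round after round.
for _ in range(5):
    dist["v"] = 1 + dist["s"]  # v routes to t via s
    dist["s"] = 1 + dist["v"]  # s routes to t via v
    print(dist)
```

After five rounds the estimates have already climbed to 11 and 12, and nothing in the protocol ever stops them.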

It's not just the counting-to-infinity problem; there are others as well.

And this is a tough issue. Much blood and ink has been spilled over

the relative merits of different possible solutions for dealing with failures.

Back in the 1980s, people were largely proposing sort of hacks to the basic

protocol. I will, though, give them credit: they had

some pretty ridiculous and fun names for their hacks, like split horizon with

poison reverse. But what eventually people converged on is moving from these

so-called distance-vector protocols to what are called path-vector protocols.
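The key idea behind path-vector protocols is that a node advertises its entire path to the destination, not just a distance, so a neighbor can spot itself in the path and refuse the route. A minimal sketch of that acceptance check, with node names of my own choosing (u is a hypothetical alternative neighbor, not from the lecture's example):

```python
# Hypothetical sketch of the path-vector fix: nodes advertise full paths,
# so a node can reject any route whose path already contains itself.
def accept(advertised_path, me):
    # Accepting a path that already includes me would create a routing loop.
    return me not in advertised_path

# After the v -> t link fails, s still advertises the stale path s -> v -> t.
print(accept(["s", "v", "t"], "v"))  # False: v sees itself and rejects it
print(accept(["s", "u", "t"], "v"))  # True: a loop-free alternative is fine
```

With distance-vector advertisements, v had no way to know that s's "distance two" route ran through v itself; with the full path in hand, the loop is detected immediately.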