
Let's move on to the second sense in which the generic local search algorithm is under-determined. The main loop reads: if there is a superior neighboring solution y, then reset the current solution to be y. But there may, of course, be many legitimate choices for y. For example, in the maximum cut problem, given a cut, there may be many vertices whose switch to the opposite group yields a cut with strictly more crossing edges.

So when you have multiple superior neighboring solutions, which one should you choose? Well, this is a tricky question, and the existing theory does not give us a clear-cut answer. Indeed, it seems like the right answer is highly domain dependent. Answering the question is probably going to require some serious experimentation on your part, with the data that you're interested in.

One general point I can make: recall that when you're using local search to generate from scratch an approximate solution to an optimization problem, you want to inject randomness into the algorithm to coax it to explore the solution space and return as many different locally optimal solutions as possible over a bunch of independent runs. You remember the best of those locally optimal solutions and return it at the end of the day.

So this is a second opportunity, in addition to the starting point, where you can inject randomness into local search: if you have many possible improving values of y, choose one at random. Alternatively, you could try to be clever about which y you choose. For example, if there are multiple superior neighboring y's, you could go to the best one.

For example, in the maximum cut problem, if there are lots of different vertices whose switch to the other group would yield a superior cut, pick the vertex which increases the number of crossing edges by the largest amount. In the traveling salesman problem, among all neighboring tours with smaller cost, pick the one with the minimum cost overall. So this is a perfectly sensible rule for how to choose y. It is myopic, and you could imagine doing much more sophisticated things to decide which y to go to next. And indeed, if this is a problem you really care about, you might want to work hard to figure out which heuristics for choosing y work well in your application.
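To make the two pivot rules concrete, here is a minimal Python sketch of local search for maximum cut. The representation (an edge list plus a dictionary assigning each vertex to group 0 or 1) and the function names are my own, not something fixed by the lecture; it supports both picking a random improving vertex and greedily picking the vertex with the largest gain:

```python
import random

def cut_value(edges, side):
    """Number of edges crossing the cut; side maps each vertex to 0 or 1."""
    return sum(1 for u, v in edges if side[u] != side[v])

def gain(edges, side, v):
    """Change in crossing edges if vertex v switches to the other group."""
    internal = sum(1 for a, b in edges if v in (a, b) and side[a] == side[b])
    crossing = sum(1 for a, b in edges if v in (a, b) and side[a] != side[b])
    # Incident internal edges become crossing, and vice versa.
    return internal - crossing

def local_search_max_cut(vertices, edges, pivot="random", seed=None):
    rng = random.Random(seed)
    side = {v: rng.randint(0, 1) for v in vertices}   # random starting cut
    while True:
        improving = [v for v in vertices if gain(edges, side, v) > 0]
        if not improving:
            return side                               # locally optimal
        if pivot == "random":
            v = rng.choice(improving)                 # random superior neighbor
        else:
            v = max(improving, key=lambda u: gain(edges, side, u))  # best one
        side[v] = 1 - side[v]
```

On a triangle, for instance, every locally optimal cut crosses two of the three edges, so either pivot rule reaches a global maximum; on harder instances the two rules can end up at different local optima.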

A third question that you have to answer in order to have a precisely defined local search algorithm is: what are the neighborhoods? For many problems there is significant flexibility in how to define the neighborhoods, and theory again does not give clear-cut answers about how they should be designed. Once again, this seems like a highly domain-dependent issue. If you want to use local search on a problem of interest, it's probably going to be up to you to empirically explore which neighborhood choices lead to the best performance of the local search algorithm.

One issue that you'll likely have to grapple with is figuring out how big to make your neighborhoods. To illustrate this point, let's return to the maximum cut problem. There we defined the neighbors of a cut to be the other cuts you can reach by taking a single vertex and moving it to the other group. This means each cut has a linear number, O(n), of neighboring cuts. But it's easy to make the neighborhood bigger if we want. For example, we could define the neighbors of a cut to be those cuts you can reach by taking two vertices or fewer and switching them to the opposite groups. Now each cut is going to have a quadratic number of neighbors. More generally, of course, we could allow a single local move to do k swaps of vertices between the two sides, and then the neighborhood size would be O(n^k).

So, what are the pros and cons of enlarging your neighborhood size? Generally speaking, the bigger you make your neighborhoods, the more time you're going to have to invest searching for an improving solution in each step. For example, in the maximum cut problem, if we implement things in a straightforward way and only allow one vertex to be switched in an iteration, then we only have to search through a linear number of options to figure out whether there's an improving solution or not. On the other hand, if we allow two vertices to be switched in a given iteration, we have to search through a quadratic number of possibilities to check whether or not we're currently locally optimal. So, generally speaking, the bigger the neighborhood, the longer it's going to take to check whether you're currently locally optimal or whether there's some better solution that you're supposed to move on to.
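As a rough illustration of this trade-off, here is a sketch (in the same Python style as the earlier snippet, with helper names of my own choosing) that brute-forces the single-switch neighborhood versus the up-to-two-switch neighborhood:

```python
from itertools import combinations

def cut_value(edges, side):
    """Number of edges crossing the cut; side maps each vertex to 0 or 1."""
    return sum(1 for u, v in edges if side[u] != side[v])

def improving_one_swap(vertices, edges, side):
    """Scan the O(n) single-switch neighborhood for a strictly better cut."""
    base = cut_value(edges, side)
    for v in vertices:
        candidate = dict(side)
        candidate[v] = 1 - candidate[v]
        if cut_value(edges, candidate) > base:
            return candidate
    return None  # locally optimal for this neighborhood

def improving_two_swap(vertices, edges, side):
    """Scan the O(n^2) neighborhood that switches up to two vertices."""
    base = cut_value(edges, side)
    subsets = list(combinations(vertices, 1)) + list(combinations(vertices, 2))
    for subset in subsets:
        candidate = dict(side)
        for v in subset:
            candidate[v] = 1 - candidate[v]
        if cut_value(edges, candidate) > base:
            return candidate
    return None  # locally optimal for this larger neighborhood
```

For instance, on a four-vertex cycle, the cut that puts each adjacent pair on the same side is locally optimal for the single-switch neighborhood, but switching two vertices at once still finds a strictly better cut.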

The good news about enlarging your neighborhood size is that you're going to have only fewer local optima, and in general some of the local optima that you're pruning are going to be bad solutions that you're happy to be rid of. If you look back at the example from the previous video demonstrating that simple local search for maximum cut can be off by 50%, you'll see that if we just enlarge the neighborhoods to allow two vertices to be swapped in the same iteration, then all of a sudden, on that four-vertex example, local search would be guaranteed to produce the globally maximum cut. The bad locally optimal cuts have been pruned away. Now, even if you allow two vertices to be swapped in a given iteration, there are going to be more elaborate examples showing that local search might be off by 50%, but on many instances allowing this larger neighborhood will give you better performance from local search.

Summarizing: one high-level design decision you should be clear on in your head, before you apply the local search heuristic design paradigm to a computational problem, is how much you care about solution quality versus how much you care about the computational resources required. If you care about solution quality a lot, and you're willing either to wait or to throw a lot of hardware at the problem, that suggests using bigger neighborhoods. They're slower to search, but you're likely to get a better solution at the end of the day.

If you really want something quick and dirty, where speed matters and you don't care that much about solution quality, that suggests using simpler, smaller neighborhoods, neighborhoods that are fast to search, knowing that some of the local optima you get might not be very good. Let me reiterate: these are just guidelines, not gospel. In all computational problems, but especially with local search, the way you proceed has to be guided by the particulars of your application. So make sure you code up a bunch of different approaches, see what works, and go with that.

For our final two questions, let's suppose you've resolved the initial three and you have a fully specified local search algorithm. You've made a decision about exactly what your neighborhoods are; you've figured out the sweet spot for you between efficient searchability and the solution quality you're likely to get at the end of the day. You've made a decision about exactly how you're going to generate the initial solution. And you've made a decision about which one to go to next when you have multiple superior neighboring solutions.

Now, let's talk about what kind of performance guarantees you can expect from a local search algorithm. Let's first talk about running time, and let's begin with the modest question: is it at least the case that the local search algorithm is guaranteed to converge eventually? In many of the scenarios you are likely to come across, the answer is yes. Here's what I mean. Suppose you're dealing with a computational problem where the set of possible solutions is finite.

And moreover, suppose your local search is governed by an objective function, and the way you define a superior neighboring solution is that it's one with a better objective function value. This is exactly what we were doing in the maximum cut problem: there the space was finite, just the set of exponentially many graph cuts, and our objective function was the number of crossing edges. Similarly, for the traveling salesman problem, the space is finite; it's just the roughly n-factorial possible tours. And again, how do you decide which tour to go to next? You look for one that decreases the objective function value, the total cost of the tour. Whenever you have those two properties, finiteness and strict improvement in the objective function, local search is guaranteed to terminate. You can't cycle, because every time you iterate you get something with a strictly better objective function value, and you can't go on forever, because eventually you'll run out of the finitely many possible things to try. There is, of course, no reason to be impressed by the finite convergence of local search. After all, brute-force search equally well terminates in finite time.

This class is all about efficient algorithms that run quickly, so the real question you want to ask is: is local search guaranteed to converge in, say, polynomial time? Here, the answer is generally negative. When we studied the unweighted maximum cut problem, that was the exception that proves the rule: there, we only needed a quadratic number of iterations before finding a locally optimal solution. But, as we mentioned in passing, even if you just pass to the weighted version of the maximum cut problem, local search might need, in the worst case, an exponential number of iterations before halting with a locally optimal solution. In practice, however, the situation is rather different, with local search heuristics often finding locally optimal solutions quite quickly. We do not at present have a very satisfactory theory that explains, or predicts, the running time of local search algorithms. If you want to read more about it, you might search for the keyword "smoothed analysis."

Another nice feature of local search algorithms is that even if you're in the unlucky case where your algorithm is going to take a long time before it finds a locally optimal solution, you can always just stop it early. So when you start the search, you can say: look, if after 24 hours you haven't found me a locally optimal solution, just tell me the best solution you've found thus far.

So, in addition to running time, we want to measure the performance of a local search algorithm in terms of its solution quality. Is that going to be any good? Here the answer is definitely no. And again, the maximum cut problem was the exception that proves the rule: it's a very unusual problem in that you can prove at least some kind of performance guarantee about its locally optimal solutions. For most of the optimization problems to which you might be tempted to apply the local search design paradigm, there will be locally optimal solutions quite far from globally optimal ones. Moreover, this is not just a theoretical pathology. Local search algorithms in practice will sometimes generate extremely lousy locally optimal solutions.

So this ties back into a point we made earlier: if you're using local search not just as a post-processing improvement step, but actually to generate from scratch a hopefully near-optimal solution to an optimization problem, you don't want to run it just once, because you don't know what you're going to get. Rather, you want to run it many times, making random decisions along the way, either from a random starting point or by choosing random improving solutions to move to, so that you explore the space of all solutions as best you can. Over many executions of local search, you want to get your hands on as many different locally optimal solutions as you can; then you can remember the best one. Hopefully the best one will at least be pretty close to a globally optimal solution.

Â