0:00

>> In this module, we will be walking through a review of linear optimization using a hedging example. We are going to see how there are two kinds of optimization problems, something called a primal optimization problem and something called a dual optimization problem, and how they are related. We will also introduce the idea of Lagrangian relaxation.

Here's the hedging problem that we are interested in. I've got d assets.

So, just to fix ideas, take d to be 3 or 4, but it's some number of assets that are there in the market. The price of these assets at time 0 is a vector p in R^d: a price vector which has d components, where every component tells me the price of that particular asset. At time t equal to 1, the market outcomes are uncertain. I don't know which state the market is going to be in; it's going to be in one of m possible states.

So here's the situation: at time t equal to 0 I'm in state zero, and I could go to one of m different possible states, going from 1, 2, all the way down to m. What do I mean by a state?

For my purposes, a state simply tells me the different prices that can occur. So I'm going to characterize every state by the prices of the assets in that particular state, and I can do it in two different ways. I could either tell you the price of all the assets in any given state, or I could tell you the price of a given asset in all possible states, and we'll flip between these two ideas as we go through. We'll use the power of matrices to see what this means: sometimes it's easier to represent it one way, sometimes it's easier to represent it the other way. And understanding it both ways gives us insight into which one is more beneficial for the particular application that I'm looking at.

So, to motivate that, I'm going to define something called S_j. S sub j will be a column vector, and it will tell me the price of asset j in all possible states: so it's s_1j, s_2j, all the way down to s_mj. Here j is fixed, it's asset j, and the state index runs from 1 to m.

Now how do I represent all the assets in the market? I'm just going to take these columns and stack them side by side: S_1 refers to the column corresponding to asset one, S_2 to asset two, and so on up to S_d. If I write them out in gory detail, you end up getting this matrix. The matrix has m rows, because there are m states, and it has d columns, because there are d assets.

And what's going on here? Every row tells you the price of all the assets in a particular state. So what I've circled here are the prices in state 1: s_11, s_12, all the way up to s_1d. Generically, some row is going to be s_m1, s_m2, all the way to s_md. So every row tells me what happens to all the assets in a given state, and every column tells me what happens to a particular asset in all the different states.

Alright, so that's how I'm going to describe my market. At time 0 I know the prices; at time 1, which is when the market opens again, I don't know the prices. It's uncertain, it's going to be one of m possible states, but I know that if the state is given to me, what the prices are.
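To make this concrete, here is a small sketch in Python. The numbers, and the choice of m = 3 states and d = 2 assets, are made up for illustration; they are not from the lecture's slides:

```python
import numpy as np

# Hypothetical market data: d = 2 assets, m = 3 states.
p = np.array([1.0, 2.0])    # time-0 price of each asset

# Time-1 price matrix S: m rows (states) x d columns (assets).
# Row i holds the prices of all assets if state i occurs;
# column j holds the prices of asset j across all states (the vector S_j).
S = np.array([
    [1.1, 2.5],   # state 1
    [1.0, 2.0],   # state 2
    [0.9, 1.4],   # state 3
])

m, d = S.shape
print(m, d)        # 3 2
print(S[0])        # prices of every asset in state 1 (a row)
print(S[:, 1])     # prices of asset 2 in every state (a column)
```

The same matrix supports both readings the lecture flips between: slicing a row gives one state, slicing a column gives one asset.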

Now, you might think that this is a very simple representation of the market, and indeed it is. But it turns out that even in the most modern methods of risk management, like value at risk and conditional value at risk, people represent what happens to the market by a model which looks very much like this, except instead of talking about states, we'll talk about simulations: I'm going to simulate returns and ask, in each of these different scenarios, what happens? But essentially, the main ideas are going to be captured by this simple toy model.

Alright, I know what happens to the assets.

Now, what I want to do with these assets is to hedge an obligation, and I'm going to walk through what hedging means. So, I have an obligation. What is an obligation? It's a vector x in R^m. Why m? Because the obligation depends on the state: in a good state I might have to pay more, in a bad state I might have to pay less. So depending upon which state occurs, I'm going to pay an obligation x_i if state i occurs. And what I want to do is buy a portfolio now, a certain number of shares of the assets right now, in order to have enough money to pay my obligation.

So, I'm going to choose a portfolio theta. Theta_1 through theta_d are going to be the number of shares that I purchase of each of these assets. I'm going to allow for the possibility of short selling, so the thetas could be negative, or I can buy long, which is when the thetas are positive. And I want to choose this portfolio so that it hedges the obligation that I'm interested in hedging. So I'll step you through what hedging means, and then we'll get to the linear optimization problem that we end up with.

Okay! So at time 0, I'm going to buy a position theta in R^d. Why d? Because there are d different assets. So theta_j is the number of shares of asset j that I purchase, where j goes from 1 through d. What's the cost of this purchase? The cost of the position theta is simply the price of every asset times the number of shares that I purchase, summed over all of the assets. So it's p_j times theta_j, summed from j equal to 1 through d. It's the inner product of the vector p with the vector theta, and we know that inner products are nothing but p transpose times theta. What happens at time t equal to 1?

So, if a state i occurs, then I'm going to liquidate my position, and when I liquidate the position, I'm going to sell the assets. And what's going to happen? In state i, the price of asset j is s_ij, and I hold theta_j shares of this asset, so by selling asset j I get s_ij times theta_j. But I'm going to liquidate the entire portfolio, so I'm going to sum over j from 1 through d, and that's the amount of money, the payoff, that I'm going to get in that state. It's just s_ij times theta_j, summed from j equal to 1 through d; that gives me y_i.

And if you just think about it for a moment, if I stack up all the payoffs as a vector, calling the vector y to be y_1, y_2, all the way up to y_m, the payoffs in the m states, this is nothing but the matrix S times the vector theta. Just to remind you, this is my matrix S; it has the prices for each state as rows. So the payoff in the first state would be the first row times the vector theta, and the payoff in the second state would be the second row times theta. And that's exactly what is being done here: s_ij, as j goes from 1 through d, is a row.

You take the i'th row, multiply it by the vector, and you get the payoff in the i'th state. And therefore, the vector y is simply the matrix S times theta.

Now I want to look at this in a different way. I want to think of this not as multiplying row by row, but column by column. So, instead of looking at this matrix S row by row as we have done so far, I'm going to put it in columns. I've got the column for the first asset, the column for the second asset, up to the column for the d'th asset, and that's going to multiply theta_1, theta_2, up to theta_d. And you can see that this multiplication is nothing but the sum of theta_j times S_j, with j summed from 1 through d (that should not be an n on the slide, but a d; it goes from j equals 1 through d). Interpreting it this way, you end up getting a different interpretation of what is going on.

Now, the payoffs y are nothing but linear combinations of the columns.
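The row view and the column view of the payoff y = S theta can be checked against each other numerically. This is a sketch with made-up numbers, not data from the lecture:

```python
import numpy as np

# Hypothetical market: m = 3 states, d = 2 assets (illustrative numbers).
S = np.array([
    [1.1, 2.5],
    [1.0, 2.0],
    [0.9, 1.4],
])
theta = np.array([4.0, -1.0])   # long 4 shares of asset 1, short 1 of asset 2

# Row view: the payoff in state i is the i'th row of S dotted with theta.
y_rows = np.array([S[i] @ theta for i in range(S.shape[0])])

# Column view: y is a linear combination of the columns S_j,
# weighted by the number of shares theta_j.
y_cols = sum(theta[j] * S[:, j] for j in range(S.shape[1]))

print(y_rows)
print(np.allclose(y_rows, y_cols))   # True: same vector either way
```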

So I can generate a particular payoff y with some portfolio exactly when y is in the range of the matrix S. Remember, we had introduced this concept in the module on matrices: a vector y belongs to the range of S if y = S theta for some theta in R^d; the range is the set of all vectors S theta as theta varies over R^d. And remember, we had also introduced the notion of rank. We had said that the rank tells me how rich that space is: how many different vectors can I generate in the range? So if the rank of the matrix S is equal to m, meaning all possible payoffs can be generated, then I can hedge everything. If, on the other hand, the rank of the matrix S is less than m, that means there are certain payoff vectors that cannot be generated, because they cannot be produced by taking the matrix S and multiplying it by some theta. That's a concept that's going to play a role in the course: we're going to talk about complete markets and incomplete markets, and that has to do with the rank of this matrix.
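A quick way to see this distinction is to compare the rank of S with the number of states m. Again, the matrices here are made-up illustrations:

```python
import numpy as np

# m = 3 states, d = 2 assets: the rank is at most 2 < 3, so this market
# cannot generate every payoff vector in R^3 (an incomplete market).
S_incomplete = np.array([
    [1.1, 2.5],
    [1.0, 2.0],
    [0.9, 1.4],
])

# m = 2 states, d = 2 assets with independent columns: rank 2 = m, so
# every payoff in R^2 is reachable as S @ theta (a complete market).
S_complete = np.array([
    [1.1, 2.5],
    [0.9, 1.4],
])

print(np.linalg.matrix_rank(S_incomplete))  # 2, less than m = 3
print(np.linalg.matrix_rank(S_complete))    # 2, equal to m = 2
```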

Before we get there, here's a simpler notion. We'll say that a payoff y hedges x if y is greater than or equal to x, component by component: the payoff that I've generated using my portfolio is at least the payoff that I need to give at time 1. Alright, now comes our first optimization problem. What's the optimization problem?

I want to minimize the price of the portfolio such that it hedges my obligation: minimize p transpose theta, such that S theta is greater than or equal to x. And what are some features of this optimization problem? The objective! What I'm trying to minimize (or maximize) is a linear function of theta. The constraints! The constraints on the thetas that I can choose are again linear: S theta must be greater than or equal to x, a linear inequality. So any optimization problem which has a linear objective function, either minimization or maximization, it doesn't matter, and in which all the constraints are either linear inequalities, as is the case here, or linear equalities, we will call a linear optimization problem, or a linear program.
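This hedging LP can be handed to an off-the-shelf solver. Here is a minimal sketch using scipy.optimize.linprog, assuming SciPy is available; the prices and the obligation are made-up numbers, not the lecture's. Note that linprog minimizes c @ x subject to A_ub @ x <= b_ub, so the constraint S theta >= x is rewritten as -S theta <= -x, and the default non-negative bounds are opened up so that short selling is allowed:

```python
import numpy as np
from scipy.optimize import linprog

p = np.array([1.0, 2.0])         # time-0 prices (illustrative)
S = np.array([                   # time-1 prices: 3 states x 2 assets
    [1.1, 2.5],
    [1.0, 2.0],
    [0.9, 1.4],
])
x = np.array([3.0, 2.0, 1.0])    # obligation to cover in each state

# minimize p @ theta  subject to  S @ theta >= x, theta free (shorts allowed)
res = linprog(c=p, A_ub=-S, b_ub=-x, bounds=[(None, None)] * 2)

theta = res.x
print(res.status == 0)                  # True: solved to optimality
print(np.all(S @ theta >= x - 1e-8))    # True: the payoff hedges x
print(p @ theta)                        # minimal cost of the hedge
```

The optimal theta here is short one asset and long the other: exactly the kind of portfolio the short-selling discussion above allows.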

It turns out that linear programs are very rich: there's a rich theory about them, and you can do a lot of interesting things with them. You can model lots of problems, you can solve them very efficiently, and you can get a lot of interpretation out of them. So the one thing that we're going to focus on in linear optimization, and in the interpretation of linear optimization, is the notion of duality. And what do I mean by the notion of duality? For every linear program, I can write another linear program which is intimately connected to it, and this connection is called duality. So I've got a linear program here: minimize over x, c transpose x, subject to Ax greater than or equal to b.

Here's another linear program: maximize over u, b transpose u, subject to A transpose u equal to c and u greater than or equal to 0. Min goes to max. Now, for the purposes of this course, you will not be responsible for how I generated the dual linear program from the primal linear program, the second linear program from the first; we will give you that in the course.

The only thing that you're going to be responsible for is understanding the relationship, and we'll emphasize this again during the course. Here are some of the interesting results that come from this duality concept. The first is something called weak duality. The picture that I want you to keep in mind for this minimization problem is this: here's my value p. For all feasible values, for all x's such that Ax is greater than or equal to b, I get some numbers on this side. Why do I get greater? Because I'm trying to minimize, so the lowest possible number is p.

I've got another number d, which is the value of this second linear program down here. For all u's that are feasible, meaning they satisfy A transpose u equal to c and u greater than or equal to 0, I'll get values that are less than this d. Why? Because this is a maximization problem: the largest possible value of b transpose u is going to be equal to d.

The first theorem says that, in fact, this picture that I'm drawing is correct: p is going to be greater than or equal to d. There's no a priori reason for it to be that way; they could have crossed. But the nice thing is that if you construct the dual linear program correctly, and in the next slide I'm going to show you a simple example (again, let me emphasize, you're not responsible for knowing it, just for walking through the exercise and understanding how I'm going to use the duality), then the primal and the dual are intimately connected: the optimal value of the primal, p, is greater than or equal to the optimal value of the dual, d.

And because this is true, you end up getting a very useful chain of inequalities. We know that c transpose x is greater than or equal to p. Why? Because I'm trying to minimize c transpose x, so for any feasible x, anything that satisfies the inequality Ax greater than or equal to b, I'll get a value greater than or equal to p. The second piece is also true: d is greater than or equal to b transpose u, because I'm trying to maximize b transpose u. And the inner part, p greater than or equal to d, comes from weak duality. So now this gives you a very interesting test: if I can find an x and a u such that c transpose x is close to b transpose u, then I know that x must be very close to optimal, and u must also be very close to optimal for the dual. Why?

Because c transpose x is greater than or equal to p, which is greater than or equal to d, which is greater than or equal to b transpose u: if the two endpoints are very close, it must be that p is also very close to c transpose x, which means x is close to optimal; similarly, b transpose u must be close to d, which means u is close to optimal. You can go one step further and say that when either p or d is finite, either the primal value is finite or the dual value is finite, then in fact they must be equal. And finally, the reason why we call them dual is that if you go from the primal to the dual, and then take the dual of the dual, you get back the primal; that's why they're called dual linear programs.
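The relationship between the two problems can be checked numerically by solving both sides with an LP solver. This sketch assumes SciPy is available and uses made-up data; since linprog always minimizes, the dual's maximization is handled by negating its objective:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data (not the lecture's): primal is  min c'x  s.t.  Ax >= b.
A = np.array([
    [1.1, 2.5],
    [1.0, 2.0],
    [0.9, 1.4],
])
b = np.array([3.0, 2.0, 1.0])
c = np.array([1.0, 2.0])

# Primal: rewrite Ax >= b as -Ax <= -b; x is free, so open the bounds.
primal = linprog(c=c, A_ub=-A, b_ub=-b, bounds=[(None, None)] * 2)

# Dual:  max b'u  s.t.  A'u = c, u >= 0; negate b because linprog minimizes.
dual = linprog(c=-b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * 3)

p_star = primal.fun      # optimal primal value
d_star = -dual.fun       # optimal dual value (undo the negation)
print(p_star, d_star)
print(abs(p_star - d_star) < 1e-6)   # True: the two optimal values agree
```

Both problems here have finite optimal values, so the agreement of p_star and d_star is exactly the strong duality the lecture describes.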

Going a little bit further, here's another pair of primal-dual linear programs that we'll be using in the course: minimize over x, c transpose x, subject to Ax equal to b, is equal to maximize over u, b transpose u, subject to A transpose u equal to c. And this equality, I'm putting it there to emphasize the fact that there is strong duality between them. But keep in mind that this equality holds only if you can show that either of them, p or d, is finite. So if p or d is not equal to plus infinity or minus infinity, then these two values are the same, and in that sense they are equal.

So, for the last piece in this module, we're going to walk through how to construct a dual. It's a very general concept called Lagrangian relaxation, and we'll use it in the next module, on nonlinear programming, as well. So here's the problem. The primal problem is: minimize c transpose x, subject to Ax greater than or equal to b. Now I'm going to take a vector u which is greater than or equal to zero.

So u is component-wise greater than or equal to 0, and remember, Ax is going to be greater than or equal to b, so Ax minus b is component-wise greater than or equal to 0. If you take one vector which is component-wise greater than or equal to 0 and multiply it with another vector that is component-wise greater than or equal to 0, you end up getting a number which is greater than or equal to 0. I'm subtracting that number, so I'm getting a quantity which is less than or equal to c transpose x. So this problem, which has a changed objective function involving the vector u, is going to have a value which is going to be less than or equal to p.

Now, b transpose u does not involve the decision variable x; the minimization is going on over x, and this term does not involve x, so I pull it out. I end up getting a new problem: b transpose u plus the minimum over x of (c minus A transpose u) transpose x. And what happened to the constraint? I threw it away! Why did I throw it away? It's complicated; I don't know how to deal with constraints, so I threw it away. But I'm guaranteed that if I throw away these constraints, the set over which I can optimize, the x's that I can choose, becomes larger, and therefore this minimum only becomes smaller.

So I end up getting that this quantity is going to be smaller than the previous line. Now, because I don't have any constraints, I have a very simple problem. I've got some vector, let's call it d, and I've got d transpose x. So I want to minimize d transpose x, and I can choose my x to be anything. So what am I going to do? I'm going to choose my x in the direction of minus d, going off to infinity.

Here's my d, here's my x! If I multiply d and x together, I get a very large negative number. So if I have a vector d which is not equal to 0, I can make this optimization problem equal to minus infinity, and that's what this says. If d is not equal to 0 (and just to emphasize, d is equal to c minus A transpose u), you get minus infinity. If, in fact, d is equal to 0, which means that c is equal to A transpose u, then I can't do anything over here: this vector is equal to 0, and if I multiply it with any other vector, I'll get 0. So you end up getting that this minimization problem has a value equal to 0, and p is greater than or equal to b transpose u. Now, u is arbitrary; the only thing that I needed, which is missing over here, is that u must be greater than or equal to zero. So now you have that p must be greater than or equal to the maximum of b transpose u, which is the value here, provided A transpose u is equal to c and u is greater than or equal to 0.
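The chain of bounds just described can be written out compactly; this is the same argument as above, in symbols:

```latex
\begin{aligned}
p \;=\; \min_{Ax \ge b} c^{\top}x
  \;&\ge\; \min_{Ax \ge b}\left[\, c^{\top}x - u^{\top}(Ax - b) \,\right]
  && \text{(dualize: } u \ge 0,\ Ax - b \ge 0\text{)}\\
  &\ge\; \min_{x}\left[\, c^{\top}x - u^{\top}(Ax - b) \,\right]
  && \text{(relax: drop the constraint)}\\
  &=\; b^{\top}u \;+\; \min_{x}\,(c - A^{\top}u)^{\top}x
  \;=\;
  \begin{cases}
    b^{\top}u & \text{if } A^{\top}u = c,\\
    -\infty   & \text{otherwise.}
  \end{cases}
\end{aligned}
```

Maximizing this lower bound over u greater than or equal to 0 with A transpose u equal to c yields exactly the dual, and hence weak duality.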

That immediately gives you weak duality; a little bit more work gives you strong duality. So here's the connection: minimize c transpose x subject to Ax greater than or equal to b, is equal to maximize b transpose u subject to A transpose u equal to c and u greater than or equal to 0. That's what we derived over here. What we did was dualize the constraints, and this word will show up sometimes during the course. Dualize means I take the constraints and multiply them by a variable which has a particular sign: we had the constraint Ax minus b, and I multiplied it by a vector u which is greater than or equal to 0; that's dualization. And then I relaxed the constraint: I had this constraint Ax greater than or equal to b, I didn't like it, it was too complicated, I threw it away. I relaxed it! And by doing dualization and relaxation, you end up getting something called a Lagrangian relaxation, which gives you duals, and gives you some very nice properties that we'll explore more in the course.
