Also, the space requirement is also O of nm.
This is just because we need to store a 2D table.
So for spell checking programs, where we compute the edit distance
between two short words, the space requirement is not an issue at all.
It might, at the same time, be an issue in biological
obligations where we compute the edit distance between
sequences consisting of thousands or even millions of symbols.
So in this case,
storing this table would require us to allocate gigabytes of memory.
The good news at the same time is that it is not too difficult to
tweak our algorithm so that to reduce the quadratic space
requirement to just linear space requirement.
For this, note the following property.
When we filled in the table, for example, row by row, so
each time when we compute the value of a particular cell,
it totally depends on the values in this row, namely,
from the left neighbor and also from two values in the previous row,
namely, from the diagonal neighbor and from the top neighbor.
This suggests that instead of storing the whole table,
it is enough actually to store just two rows,
the current one that we are currently filling in, and the previous one.
So instead of storing all n rows, it is enough to store just two rows.
And the same trick actually applies to columns.
We can fill in the table column by column.
And instead of storing all possible columns,
we store just the two current columns, okay?
So this allows us to reduce the space requirement from O(nm) to O(min{n,m}),
so from quadratic to linear, okay?
It should be noted, however, the same time that for
reconstructing a solution, we actually need the whole table.
Because we need to trace the path from this corner of the table to that corner.
So it is not so easy to reconstruct the solution without storing the whole table.
But it is known how to do this, and
the corresponding algorithm was designed by Hirschberg.
So you can Google for Hirschberg algorithm, and
let me just give you the underlying idea.
In order to find an optimal alignment using only a linear amount of space,
we're going to do the following.
We're going to compute the edit distance between the corresponding two strings
in the same fashion as we did in our previous part.
At the same time when computing this, we're going also to compute the value k,
such that in an optimal alignment, the first half of
the word a is aligned to the first k symbols of the word b, okay?
Which also means that the second half is aligned to the last
m minus k symbols of b.
So we will first compute this value of k.
And then we will make two recursive calls, in order to compute the optimal
alignment for the first half of a and for the second half of a.
So in a sense, this is a nice combination of dynamic programming and
divide and conquer techniques.
So in general, it gets an algorithm working in, more or less, the same time,
so O(nm), and that uses only a linear amount of space, and
also is able, not only to compute the edit distance between two strings,
but even an optimal alignment, okay?
So, the last remark is that our algorithm can be
also easily transformed into an algorithm that computes
the way that edit distance between two strings.
So in this case, we do not read all aberrations as a unit cost operation.
So we might want to assign different cost to different operations.
For example, different costs to insertions, deletions, and
different costs to substitutions of some periods of samples.
So why we might want to use this?
Well, first of all, in spelling, in spell checking applications, it is natural
to assume that some substitutions are more probable than some others.
This of course depends on our keyboard layout.
Also in computational biology,
it is also the case that time substitutions are more likely than others.
Unfortunately enough,
our recurrence relations can be easily tweaked to reflect this fact.
Namely, instead of adding just one for insertions, deletions, and
substitutions, we need to add the cost of the corresponding deletions, the cost
of the corresponding substitution, or the cost of the corresponding insertion, okay?
So this concludes the part on the edit distance problem.
In the next part, we are going to design algorithms for
the well known knapsack problem.
This is a classical problem in combinatorial optimization, then for
the chain metrics multiplication problem.
And we will then review the whole part on dynamic programming.