>> That was our first step. >> Now, we get a lot of these volumes.

>> So, we basically go into our three-dimensional data, that we got from

the tomograms. >> So we have the

three-dimensional data. >> We go and look for all the areas

with the virus, and we segment out those spikes, or envelopes.

>> Sometimes that's done by hand. >> Basically, somebody goes and places

a box around each one of them. >> Sometimes you can do that

automatically if you have a model of what, more or less, you expect.

>> Then you travel around your image with that model.

>> And you compute a distance at every place.

>> And if the distance is small, you say, oh,

>> it looks like there is one of these envelopes here.

>> It's very similar to what we did with non-local means.

>> In non-local means, we were looking for similar patches.

>> Now, the patch is a model that you have, and with it, you go around.
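As an illustration of the semi-automatic picking just described, here is a minimal sketch, assuming a sum-of-squared-differences distance and a toy volume. The names `match_template_3d` and `detect`, the 3x3x3 model, and the threshold value are all hypothetical choices for this example, not the lecturer's implementation.

```python
import numpy as np

def match_template_3d(volume, model):
    """Return the sum of squared differences between `model` and every
    same-sized sub-volume of `volume` (brute force, one offset at a time)."""
    mz, my, mx = model.shape
    out_shape = tuple(v - m + 1 for v, m in zip(volume.shape, model.shape))
    distances = np.empty(out_shape)
    for z, y, x in np.ndindex(*out_shape):
        patch = volume[z:z + mz, y:y + my, x:x + mx]
        distances[z, y, x] = np.sum((patch - model) ** 2)
    return distances

def detect(volume, model, threshold):
    """Report every offset where the distance to the model is small."""
    distances = match_template_3d(volume, model)
    return [tuple(int(i) for i in idx)
            for idx in np.argwhere(distances < threshold)]

# Toy data (assumption): a noisy volume with one copy of the model planted in it.
rng = np.random.default_rng(0)
model = np.ones((3, 3, 3))
volume = 0.05 * rng.standard_normal((12, 12, 12))
volume[4:7, 5:8, 6:9] += model

hits = detect(volume, model, threshold=1.0)
print(hits)
```

The triple loop is deliberately naive; it only shows the idea of traveling around the volume with the model and thresholding the distance.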

>> So, either one of those ways, either manual or with this semi-automatic, you

get these volumes. >> These volumes can be like

100x100x100 pixels representing, basically, the envelope inside this spike

that we have. >> Now remember, these are noisy,

they're not all the same. >> We need to group them, we need to

align them, and that's what we're going to basically do.

>> We are going to align them first, and they're still very noisy.

>> So the alignment is not going to be great because you're trying to align

things that are very, very noisy. >> But it's a good step.

>> You can actually, if you want, align only those for which you

basically put a very, very high threshold, a very high level of

confidence, on the alignment. >> And you say, I'm going to keep

only those where, after I align, the distance is almost zero.

>> They're almost identical. >> So then I have more confidence that

I have a good alignment.

>> Once again, alignment is: you start and just rotate them until the distance

that we showed in the previous slides becomes the smallest possible.
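A minimal sketch of this rotate-until-the-distance-is-smallest idea follows. It makes a strong simplifying assumption: only the four 90-degree rotations about one axis are tried (via `np.rot90`), whereas a real search covers many angles in three dimensions. The function names are hypothetical.

```python
import numpy as np

def distance(a, b):
    """Sum of squared differences between two volumes of equal shape."""
    return float(np.sum((a - b) ** 2))

def align_by_rotation(moving, reference):
    """Try each candidate rotation of `moving` and keep the one whose
    distance to `reference` is smallest."""
    best_k, best_d = 0, np.inf
    for k in range(4):  # assumption: quarter turns about a single axis only
        d = distance(np.rot90(moving, k, axes=(1, 2)), reference)
        if d < best_d:
            best_k, best_d = k, d
    return np.rot90(moving, best_k, axes=(1, 2)), best_k, best_d

# Toy data (assumption): a rotated, slightly noisy copy of a random volume.
rng = np.random.default_rng(1)
reference = rng.random((8, 8, 8))
moving = np.rot90(reference, 3, axes=(1, 2)) + 0.01 * rng.standard_normal((8, 8, 8))

aligned, k, d = align_by_rotation(moving, reference)
print(k, d)
```

Here the copy was turned by three quarter turns, so one more quarter turn brings it back onto the reference.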

>> Once you have them aligned, then classification becomes a bit easier.

>> So you say okay now I have them all aligned.

>> I have them all in a common space. >> Now I want to group them to try to

only put in the same group those envelopes that basically have the same

three-dimensional shape. >> And there are many ways of doing

this classification or clustering that we need there.

>> But let me just illustrate to you one example.

>> So you have all of them aligned. >> And now you have the distance, the

smallest distance, the one that, basically, the best alignment rotation

gave you. >> So you align them, you

rotate them, you align them, you have the distance.

>> And now, basically, how do you group them?

>> You group all of them. >> So you pick one, and you say, hey,

bring me all my friends. >> Who are my friends? Those that have a very

small distance to me. >> That's one group.

>> Then you pick another one. >> And say, bring me all my friends.

>> And who are the friends? Those that have a very small distance to it.
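The "bring me all my friends" grouping can be sketched as a greedy pass over a matrix of pairwise distances. This is only one of the many possible clustering schemes mentioned above; the function name `group_friends`, the threshold, and the scalar stand-ins for volumes are assumptions made for the example.

```python
import numpy as np

def group_friends(dist, threshold):
    """Greedy grouping: pick an unassigned item, then put every still
    unassigned item within `threshold` of it into the same group."""
    n = dist.shape[0]
    labels = -np.ones(n, dtype=int)  # -1 means "not grouped yet"
    next_label = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        friends = (dist[seed] < threshold) & (labels == -1)
        labels[friends] = next_label  # the seed is its own friend (distance 0)
        next_label += 1
    return labels

# Toy data (assumption): six items described by one number each, standing in
# for the post-alignment distances between volumes.
values = np.array([0.0, 0.1, 5.0, 5.2, 0.05, 9.9])
dist = np.abs(values[:, None] - values[None, :])

labels = group_friends(dist, threshold=0.5)
print(labels.tolist())
```

The result depends on the order in which seeds are picked, which is one reason the procedure is repeated after averaging.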

>> So you basically group them. >> Once you group,

if you did everything right, you can basically average them.

>> And we know that the average would reduce the noise.
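A small numerical check of that claim, assuming independent zero-mean Gaussian noise and perfectly aligned copies (both are idealizations): averaging N copies should shrink the noise standard deviation by roughly the square root of N.

```python
import numpy as np

rng = np.random.default_rng(2)
clean = rng.random((16, 16, 16))   # stand-in for the true shape (assumption)
n_copies, sigma = 100, 1.0

# 100 aligned copies of the same volume, each with its own noise.
copies = clean + sigma * rng.standard_normal((n_copies, 16, 16, 16))
average = copies.mean(axis=0)

# Root-mean-square error of one copy versus the average of all copies.
err_single = float(np.sqrt(np.mean((copies[0] - clean) ** 2)))
err_average = float(np.sqrt(np.mean((average - clean) ** 2)))
print(err_single, err_average)
```

With 100 copies the error of the average comes out at about one tenth of the error of a single copy.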

>> That's what we talked about before. >> We explained it when we were doing

image denoising. >> So you can kind of denoise them a

tiny bit. >> And after you denoise them a tiny

bit, you can say, hey, let's try to align them again.

>> Now they are a bit cleaner images.

>> So we are going to align them again. >> We are going to cluster again.

>> We are going to align them again. >> And we are going to cluster them

again. >> And you repeat that for a number of

iterations until you're satisfied. >> Either that things are not changing

too much. >> Or basically you say, this is the

best I can do, so I'm going to stick with that, and that's a process.
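The align, cluster, average, repeat loop can be sketched as below. This is a simplified stand-in, not the lecturer's pipeline: it assumes exactly two classes, quarter-turn rotations about one axis, and nearest-class-average assignment (a k-means-like scheme). All function names and the toy data are hypothetical.

```python
import numpy as np

def align(moving, reference):
    """Best of four quarter-turn rotations, with its squared distance."""
    candidates = [np.rot90(moving, k, axes=(1, 2)) for k in range(4)]
    dists = [float(np.sum((c - reference) ** 2)) for c in candidates]
    best = int(np.argmin(dists))
    return candidates[best], dists[best]

def refine(volumes, max_iter=10):
    """Alternate alignment and two-class grouping until labels stop changing."""
    # Initial references: the first volume and the one farthest from it.
    refs = [volumes[0]]
    d0 = [align(v, refs[0])[1] for v in volumes]
    refs.append(volumes[int(np.argmax(d0))])
    labels = None
    for _ in range(max_iter):
        aligned, new_labels = [], []
        for v in volumes:
            results = [align(v, r) for r in refs]
            c = int(np.argmin([d for _, d in results]))
            aligned.append(results[c][0])
            new_labels.append(c)
        if new_labels == labels:  # nothing changed: stop iterating
            break
        labels = new_labels
        # Average each group; the averages become the next references.
        refs = [np.mean([a for a, l in zip(aligned, labels) if l == c], axis=0)
                for c in range(2)]
    return labels, refs

# Toy data (assumption): two random shapes, ten rotated noisy copies of each.
rng = np.random.default_rng(4)
shape_a, shape_b = rng.random((8, 8, 8)), rng.random((8, 8, 8))
volumes = []
for clean in (shape_a, shape_b):
    for _ in range(10):
        k = int(rng.integers(4))
        noise = 0.2 * rng.standard_normal((8, 8, 8))
        volumes.append(np.rot90(clean, k, axes=(1, 2)) + noise)

labels, refs = refine(volumes)
print(labels)
```

The stopping rule here is the "things are not changing too much" criterion in its strictest form: the loop ends when no volume changes group.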

>> We define a distance, then alignment, clustering, alignment,

clustering, and the big thing here, once again, and we're going to come back again and

again to that, is that we have a lot of these >> small boxes.

>> Now, I want to reiterate to you, all this is done in three dimensions.

>> These 100x100x100 cubes, we have like 4,000 of them.

>> Sometimes we have 10, 20,000 of them.

>> So, a lot of computations. >> We need to align everybody with

everybody. >> We need to compare everybody to

everybody. >> We need to do that many, many times.
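To get a feeling for the size of the problem, here is a short count of the pairwise comparisons, assuming each comparison touches every voxel once. It covers a single candidate rotation and a single iteration only, so the real cost is much larger. The function name is hypothetical.

```python
def pairwise_cost(n_volumes, side=100):
    """Number of distinct pairs, and voxel operations for one comparison each."""
    pairs = n_volumes * (n_volumes - 1) // 2
    voxels_per_volume = side ** 3
    return pairs, pairs * voxels_per_volume

for n in (4_000, 20_000):
    pairs, voxel_ops = pairwise_cost(n)
    print(f"{n} volumes: {pairs:,} pairs, {voxel_ops:.3e} voxel operations")
```

For 4,000 cubes of 100x100x100 this is about 8 million pairs and about 8 trillion voxel operations per rotation tried.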

>> You can speed this up using, basically, different tricks.

>> One, for this particular case, is going to the Fourier domain.

>> Normally these things are implemented on GPUs, with very, very

efficient implementations, to make them run in hours and not in weeks.
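One way to see why the Fourier domain helps is the simplest case, sketched below: the squared distance between two volumes at every circular shift, obtained from one FFT-based cross-correlation instead of one loop per shift. The lecture refers to the rotational search; using translations here is an assumption made to keep the example short, and `distance_all_shifts` is a hypothetical name.

```python
import numpy as np

def distance_all_shifts(a, b):
    """Squared distance between `a` and `b` for every circular shift at once.

    Entry s equals sum_x (a[x] - b[x + s])**2, computed through
    ||a||^2 + ||b||^2 - 2 * (cross-correlation of a and b)."""
    corr = np.fft.ifftn(np.fft.fftn(b) * np.conj(np.fft.fftn(a))).real
    return np.sum(a ** 2) + np.sum(b ** 2) - 2.0 * corr

# Toy data (assumption): `b` is `a` circularly shifted by a known amount.
rng = np.random.default_rng(3)
a = rng.random((16, 16, 16))
shift = (3, 5, 2)
b = np.roll(a, shift, axis=(0, 1, 2))

dist = distance_all_shifts(a, b)
best = tuple(int(i) for i in np.unravel_index(np.argmin(dist), dist.shape))
print(best, float(dist[best]))
```

The three FFTs replace 16x16x16 separate distance computations, and the same kind of saving is what makes the alignment search tractable on thousands of volumes.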

>> So the most challenging part,

computationally, is this alignment part, because we have to basically try all

these angles, and that's normally done in the Fourier domain

for these types of examples, as we said, because of the missing wedge and

because of the speed. >> Now, you have an algorithm.

>> You've developed this image processing algorithm, and you're

going to use it to detect the three-dimensional shape of the envelope

in the HIV virus. >> But then you say to yourself, how do

I know if my algorithm works? Nobody has ever seen the envelope at the resolution

that I want to see it. >> So, how can I validate? And this is

very important when you're talking about medical

imaging, >> because you don't want to make

a mistake and mislead, let's say, a complete vaccine development because you

made a computational mistake. >> So, what you do in most image

processing, but in particular when you're talking about medical imaging, is you go

slow. >> First, you start from what are

called phantoms. >> You create three-dimensional shapes

that look like viruses from your prior knowledge.

>> You add noise. >> You project them.

>> You make missing wedges. >> So you make them as if they were

acquired in the microscope. >> And you run your algorithm on that.

>> You know the ground truth, because you created it.

>> You hope >> to reconstruct it.

>> And until you finish that, you don't move up.

>> So, you solve this. >> You say, great.

>> I put different structures. >> Phantoms I created.

>> I rotate, add noise, add missing wedge.

>> I got them back. >> I'm happy.
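A phantom experiment of this kind might be set up as in the sketch below. Everything in it is an assumption for illustration: the ball-with-a-bump shape, the quarter-turn rotation, the noise level, and a simplified missing wedge modeled as a Fourier mask that drops frequencies more than 60 degrees away from the kx axis in the (kx, kz) plane.

```python
import numpy as np

def make_phantom(size=32):
    """A ball with a small off-center bump: a crude virus-like shape."""
    z, y, x = np.mgrid[:size, :size, :size] - (size - 1) / 2.0
    ball = (x ** 2 + y ** 2 + z ** 2) <= (size / 4.0) ** 2
    bump = ((x - size / 4.0) ** 2 + y ** 2 + z ** 2) <= (size / 10.0) ** 2
    return (ball | bump).astype(float)

def apply_missing_wedge(volume, max_tilt_deg=60.0):
    """Zero the Fourier coefficients inside a simplified missing wedge."""
    kz, ky, kx = np.meshgrid(*(np.fft.fftfreq(n) for n in volume.shape),
                             indexing="ij")
    keep = np.abs(kz) <= np.abs(kx) * np.tan(np.deg2rad(max_tilt_deg))
    return np.fft.ifftn(np.fft.fftn(volume) * keep).real, keep

def simulate_acquisition(phantom, quarter_turns, noise_sigma, rng):
    """Rotate, add noise, add the missing wedge; also return the ground truth."""
    truth = np.rot90(phantom, quarter_turns, axes=(1, 2))
    noisy = truth + noise_sigma * rng.standard_normal(phantom.shape)
    observed, keep = apply_missing_wedge(noisy)
    return truth, observed, keep

rng = np.random.default_rng(5)
phantom = make_phantom()
truth, observed, keep = simulate_acquisition(phantom, quarter_turns=1,
                                             noise_sigma=0.5, rng=rng)
print(f"fraction of Fourier coefficients kept: {keep.mean():.2f}")
```

Because `truth` is returned alongside `observed`, whatever the algorithm reconstructs from the simulated data can be compared directly against the shape that was put in.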

>> The next step, because it's almost impossible to do a perfect simulation

>> of everything that is going on in the microscope, is you take particles whose

structure, for some reason, we know. >> Maybe because they are larger than

HIV, maybe because they are more rigid.

>> And they could be acquired with other technologies.

>> And one of them is what's called GroEL.

>> So, you go into your microscope, you put GroEL, you run your algorithm.

>> You get the 3D shape that everybody knows is the right shape.

>> You say, okay, this is as much validation as I can do.

>> Now, I'm ready to do HIV. >> Even here, you can do a lot of

validation. >> For example.

>> You can take these 4000 samples and drop 500 of them.

>> Do your computations. >> Then drop a different 500 at random.

>> Do your computations. >> And you expect to get very similar

results. >> So, still here you have to do a lot

of validation. >> But this is the regular progression:

phantoms, known data, and then you go into the unknown,

>> which is HIV. >> Remember, we are looking for these

tiny spikes here, which are, we're talking about, basically, angstroms