And let's start with the question of how to assess the resolution of a single particle reconstruction. The basic answer is that we'll do the experiment twice and simply compare the results as a function of spatial frequency: how well do they match? Now, in fact, when you do a single particle project, you record a large number of particle images, okay? So let's say these are images of individual particles: there's image number one, there's two, there's three, four, etc., etc., up to image N. The first thing we can do is split these into two equal groups, and split them randomly. And so you get a set of N over 2 images over here, and we're going to call this set number one. And then we get a second set over here of the other images, also N over 2 total images, and let's call this set number two. Well, then we take the first set of images and we use it to generate a three-dimensional reconstruction of our particle. And then we take the second set of images and use it to generate an independent reconstruction. And these two have to be fully independent so that there's no reference bias, for instance, that's causing both of them to look the same in the end. They need to be independently produced. And finally, we can calculate the three-dimensional Fourier transform of each of these separate reconstructions, with all of its amplitudes and phases. And now we can ask: how well does this reconstruction match this reconstruction as a function of spatial frequency? And we'll plot that. So we'll draw an axis in here, which will be spatial frequency. And this axis will be, let's call it for now, similarity. And we'll consider regions in the transforms of these reconstructions. We'll consider shells. So a shell of amplitudes and phases here will be compared to the corresponding shell of amplitudes and phases here. And we'll do this pixel by pixel. And so we start at the center, at low resolution, and we compare those amplitudes and phases to these amplitudes and phases.
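The random half-split described above can be sketched in a few lines of Python. This is a minimal illustration only; the stack size and random seed are arbitrary choices of mine, not values from the lecture:

```python
import numpy as np

# Hypothetical particle stack: split image indices randomly into two equal halves
rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility
n_particles = 10000              # assumed stack size, for illustration only
indices = rng.permutation(n_particles)
half1 = indices[: n_particles // 2]   # "set number one"
half2 = indices[n_particles // 2 :]   # "set number two"
```

Each half would then be reconstructed independently, with no information shared between the two, so that any agreement between them at a given spatial frequency is meaningful.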
Probably they'll match pretty well. And so it'll start at high similarity. Then as we move out into larger and larger shells and compare them, the similarity will likely fall. And so here, as we get to a larger shell, it will continue to fall. And eventually we'll be comparing a shell of amplitudes and phases at very high resolution. And at that point the similarity will probably fall to insignificance, meaning that those two reconstructions are no longer reflecting the same features at that high resolution. Now, the measure of similarity that is most commonly used is called the Fourier shell correlation. And this is defined as the sum over a particular shell, meaning a shell in reciprocal space, of the amplitudes and phases of one reconstruction in a dot product with the amplitudes and phases of the second reconstruction. And so to do the dot product, you multiply by the complex conjugate, and this is summed for all the pixels over the shell. So, for instance, we'll start with one pixel and its amplitude and phase and compare it to the corresponding pixel in the other reconstruction, and then move on to the next pixel and the next pixel, all within a particular shell. And we'll consider just the real part of that dot product, and then divide this by the sum over the shell of basically the power in that shell of the first reconstruction, times the sum over the shell of the power in the second reconstruction, and of the whole denominator we take the square root. In other words, FSC = Re(sum of F1 times F2*) divided by sqrt((sum of |F1|^2) times (sum of |F2|^2)), where the sums run over the pixels of the shell. And let's look in a little bit more detail at what this formula is really doing. So let's imagine that we were to plot the amplitude and phase of this pixel and the amplitude and phase of this pixel on an Argand diagram, representing them as complex numbers. So now this is the real axis and this is the imaginary axis. And so we would have one vector here that represented the amplitude and phase of that pixel in reconstruction number one. And it has an amplitude, its length, and a phase, the angle here.
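The shell-by-shell comparison just described can be sketched with NumPy. This is a simplified illustration assuming cubic volumes of equal size; the function and variable names are my own, not from any particular software package:

```python
import numpy as np

def fourier_shell_correlation(vol1, vol2):
    """FSC between two equally sized cubic volumes, one value per shell:
    Re(sum F1*conj(F2)) / sqrt(sum |F1|^2 * sum |F2|^2), per shell."""
    n = vol1.shape[0]
    f1 = np.fft.fftshift(np.fft.fftn(vol1))
    f2 = np.fft.fftshift(np.fft.fftn(vol2))
    # distance of each voxel from the transform origin, rounded to a shell index
    coords = np.indices(vol1.shape) - n // 2
    r = np.sqrt((coords ** 2).sum(axis=0)).round().astype(int)
    fsc = []
    for shell in range(n // 2):
        m = r == shell
        num = np.real((f1[m] * np.conj(f2[m])).sum())
        den = np.sqrt((np.abs(f1[m]) ** 2).sum() * (np.abs(f2[m]) ** 2).sum())
        fsc.append(num / den if den > 0 else 0.0)
    return np.array(fsc)
```

Comparing a volume against itself gives an FSC of 1 in every shell, which matches the expectation that identical half reconstructions are perfectly correlated at all resolutions.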
But we can also represent that as a real component and an imaginary component, so that F1 = a1 + i b1. And then the amplitude and phase of this pixel, here I'll color it blue and then draw this in blue, let's say this is over here. This is F2. And we'll decompose it also into a real component, a2, and an imaginary component, b2, so that F2 = a2 + i b2. Now the FSC, or the contribution to the FSC of that single particular pixel in the two reconstructions, becomes (a1 a2 + b1 b2) divided by the square root of (a1 squared plus b1 squared) times (a2 squared plus b2 squared). Now in the very special case, so we'll say if F1 is equal to F2, then the FSC reduces to (a squared plus b squared) divided by the square root of (a squared plus b squared) squared, which of course is just one. And so when the two pixels are the same, if we now replace similarity with Fourier shell correlation, the highest correlation possible is one. And high correlations are typically seen at very low resolutions. But as one moves out away from the origin in reciprocal space to higher and higher resolutions, when you consider pixels far away from the origin and compare them to their counterparts in the other reconstruction, in this situation F1 might be as shown here, but then F2 might be very different. Imagine if the F2 pixel was all the way over here, with a phase 90 degrees away from F1. In this situation, now a2 is actually negative and b2 is as it was. And if a2 is negative with respect to a1, then the numerator here, a1 times a2, will actually be a negative number added to b1 times b2 as a positive number, and so the numerator reduces to close to zero. And so you see the FSC curves typically start high and then they fall to lower values near zero at higher resolution. So looking back on the formula for the FSC, we recognize the bottom as basically the product of the power in the two different reconstructions within that shell.
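For a single pixel, the expression above reduces to the cosine of the phase difference between F1 and F2, which is why a matching pixel contributes 1 and a pixel 90 degrees out of phase contributes essentially nothing. A quick numerical check (a toy illustration; the particular complex number is arbitrary):

```python
import numpy as np

def pixel_fsc(F1, F2):
    # (a1*a2 + b1*b2) / sqrt((a1^2 + b1^2)(a2^2 + b2^2)),
    # written with complex arithmetic: Re(F1 * conj(F2)) / (|F1| |F2|)
    return np.real(F1 * np.conj(F2)) / (abs(F1) * abs(F2))

F1 = 3 + 4j
print(pixel_fsc(F1, F1))        # identical pixels -> 1.0
print(pixel_fsc(F1, F1 * 1j))   # 90-degree phase shift -> 0.0
```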
And the numerator is the dot product between them, which represents how much overlap there is. So we see the whole formula can be understood as a measure of what part of the total power at a particular resolution in reciprocal space really matches between the two half reconstructions. If they match very well, then the features at that resolution must be true, but if they don't match, then they're not reliable. So here's an example from an actual publication from this year. And the FSC is plotted from a high value of 1 down to zero, and resolution is plotted here from zero up to near-atomic resolution at the extreme end here. And the calculated FSC started high and, as expected, fell to lower values at high resolution. And so if you want to assign a specific number to the resolution of a single particle reconstruction, one can ask at what resolution the Fourier shell correlation passes below the level that would be expected if there were no more reliable correlation between the two half reconstructions. And in this case, for this asymmetric object, that value is a Fourier shell correlation of 0.143. And so where it crossed that line, 0.143, was 3.4 angstroms, and that's what was reported in the paper as the resolution of that reconstruction. Now, while the Fourier shell correlation gives an estimate of the global resolution of the entire reconstruction, ResMap is a method to estimate local resolution within a reconstruction. So here in this figure from the paper describing ResMap is shown a single particle reconstruction in 3D, and then here is a slice through that reconstruction; it's of a ribosome with a 40S and a 60S component. And then ResMap looks at each pixel and asks the question: what is the most rapidly varying wave passing through this pixel whose signal is above the noise? And so at each pixel, you can assign a resolution here.
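Reading off the resolution where an FSC curve crosses the 0.143 threshold can be sketched like this. The helper is my own simple linear interpolation, not from any published package; spatial frequency is assumed to be in inverse angstroms, so the return value comes out in angstroms:

```python
def resolution_at_threshold(spatial_freq, fsc, threshold=0.143):
    """Return 1/k where the FSC first drops below the threshold,
    linearly interpolating between the two bracketing samples."""
    for i in range(1, len(fsc)):
        if fsc[i] < threshold <= fsc[i - 1]:
            # linear interpolation between the bracketing points
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            k = spatial_freq[i - 1] + frac * (spatial_freq[i] - spatial_freq[i - 1])
            return 1.0 / k
    return None  # the curve never crossed the threshold
```

For example, a curve falling from 0.5 at k = 0.2 to 0.1 at k = 0.3 crosses 0.143 a little before k = 0.29, i.e. at roughly 3.5 angstrom resolution.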
In this case, it's from 4.5 to 7 angstrom resolution, meaning: what is the highest resolution wave that is reliably detected in that region of the reconstruction? And so it's shown as a heat map, where this end of the scale is low resolution, and here the red pixels are something around 7 angstrom resolution. But here in the core of the particle and in the large subunit, the resolution reaches to more like 4.5 angstrom resolution. So you can see different parts of the reconstruction have different resolutions, and this is not unexpected, because certain parts of an object are likely to be more flexible. And other parts, for instance within the dense core of large pieces of the complex, will be dominant in the alignment, and so their resolution should come out a little higher. A second example here that we'll show is of an icosahedral virus; here's a section out of the 3D reconstruction. And then here's a single slice through that reconstruction and the resolution heat map, showing that the core of the capsid had very high resolution, around 6 angstrom resolution. But the spikes at the tip, which are likely to be more flexible, had lower resolutions. Now let's talk about the factors that limit the resolution of single particle reconstructions. There are a lot of factors, all the way from the nature of the particles, to the quality of the images, to our ability to average them, to the number of particle images included, but one of the most important factors is often conformational flexibility. Now, to illustrate that, let's consider this study published in Nature in 2003 of the dynein motor. So dynein is a motor that can attach to some cargo inside the cell and also to microtubules, and it moves up and down the microtubule pushing the cargo.
So when these authors prepared particles and negatively stained them on grids and then took all the images and produced class averages, they found that if they aligned the class averages, for instance, on the dynein motor portion, they could put the class averages in some kind of temporal sequence showing that the stem here, which goes from the motor up to the microtubule, assumes a continuous range of positions with respect to the motor. Likewise, they could arrange the class averages to show that the stalk also exhibited a range of positions with respect to the motor. And so clearly this is a machine that has continuous conformational states rather than a small number of discrete conformational states. And so single particle reconstruction will be limited in resolution by how many times each precise state can be imaged. Nevertheless, this also illustrates one of the great powers of single particle analysis, which is to reveal the conformational flexibility of an object of interest. Another factor that can limit resolution is if the particle of interest exhibits preferred orientations on the grid. There are a number of reasons why this might happen. First of all, if your grid has a continuous carbon surface, there could be some part of the particle that has an affinity for that carbon. And so as the particles engage the carbon, they always stick in one preferred orientation. On the other hand, maybe the particle has a surface that likes the air-water interface, so between the water and the air, it sticks to that surface in a preferred orientation. In any case, this makes single particle reconstruction more difficult, because you want to have the full range of views, and if you don't, these preferred orientations lead to anisotropic resolution. So one of the ways you can check what kinds of orientations were present in your data set is to plot them.
So here's an example from JMB in 2000, where the authors were studying a particle with such symmetry that just this triangular region of the sphere of views was unique. So this is the asymmetric triangle of possible views of that particle, and for each of the particles that they had imaged, they plot the point of view from which that projection was recorded, and they plot those here in red. And as you can see, there is a preferred orientation. Lots of the particles lay on the grid such that they were imaged from the top, more or less down the axis, or at least closer to down the axis than to a side view. So in a case such as this, one must be careful that the extra images recorded from this perspective are not overweighted compared to the views that were more sparsely represented. Instead, the information coming from all the different views has to be equally weighted. And so, among all the resolution limitations, we just talked about how particle heterogeneity limits resolution. And we've talked previously about how image quality limits resolution. And just to refresh your memory, we talked about how the contrast transfer function of images, as a function of spatial frequency, oscillates, and is also damped at higher spatial frequency by envelopes. We talked about two envelopes specifically. We talked about an envelope due to partial spatial coherence of the beam. And remember, the point there was that if we had a specimen and we were producing an image, some of the electrons would be coming straight down the column, producing their part of the image right there. But other electrons would be coming at an angle; they would produce an image of the sample over here. And this one would produce an image of the sample over here, for instance. And so the result of it would be a smeared image. And so the low resolution terms are still present with substantial power, but the high resolution terms are damped out by the smearing due to partial spatial coherence of the electron beam.
We talked about another envelope, an envelope due to partial temporal coherence. And in that case the issue was, given a sample, that some of the electrons would be focused more strongly than others because they had different energies. And because of that, there isn't a single plane here with a nicely focused image. Instead, no matter which plane you choose to make conjugate to your detector, you'll get a blurred image, because each electron contributes a slightly differently focused image. So that was the envelope due to temporal coherence. And again, it affects the low resolution terms much less than the high resolution terms. And the total envelope is the product of these two envelopes. And so I call that image quality. In addition to these envelopes of coherence, we've also described now beam-induced specimen movement. And something we haven't talked about yet, but which is important, is the quality of the camera, the ability of the camera to capture the details that are present. Similarly to the contrast transfer function, a camera can be characterized by what's called a modulation transfer function, or MTF, which again describes, as a function of spatial frequency, how much of the signal present at each spatial frequency is actually captured by the camera. Cameras also have distortions and aberrations which can degrade the image quality. Now, I'd like to introduce another major category of resolution limitations, which is alignment precision. And let me point out that if we had a picture of a particle, and let me draw it kind of in cross section, that we were trying to align with another particle, and we couldn't tell exactly where the center of that particle was. And we were aligning it with, say, another image of that particle. And so we align all these together, and in fact what we get is a lower resolution, smeared image of our true particle.
So alignment errors here produce a smearing effect, and it turns out that that smearing effect can be described mathematically just like an envelope function that degrades the high resolution information more than the low resolution information. And so, looking back at a CTF function: without envelopes, the CTF would oscillate fully from plus one to minus one. There are a number of envelopes that will degrade high resolution information. Let me, again, put in the envelope due to partial spatial coherence. There's an envelope due to partial temporal coherence. And now I'm going to add another envelope, due to translational errors in the alignment, in the precision to which the images can be aligned as they're being averaged. So I'll introduce that as an envelope due to translation errors. In very similar fashion, you can imagine that if we can't precisely determine exactly which orientation we're imaging each particle from, so we have errors in those estimates, then when we merge those particle images in 3D, we will degrade high resolution details because of rotation errors. And these also can be described as an envelope. So there's an envelope due to rotation errors. Likewise, if some of our images have a slightly different magnification than others, and we can't detect that precisely: if we average an image of a particle with 1% higher magnification with another particle from another image, say, in a different corner of the image, then the high resolution terms will be degraded again. So there's an envelope due to magnification errors. Further, if we look at a sample in cross section: here's the water and here are individual particles present in the vitreous ice. And we image it from above, of course, and determine a defocus for that image. We might determine, as the defocus, some average defocus value through the center of the ice.
And while that defocus might be a very good estimate for this particle, this particle could have a higher defocus and this particle could have a lower defocus. And so there's a defocus spread here, what's called a delta Z, within the system. And if we CTF correct the images of each of these particles with that single defocus representing the middle, this will introduce errors just like the errors that arose from electrons having subtly different energies, so that each electron focused at a different position, and the net image was a smear of images with different focus. The exact same principle applies if the particles are at different heights within the ice and the defocus of each particle is not determined precisely. And just like the envelope due to temporal coherence, if one particle truly has a CTF as shown here in purple, and we try to estimate its defocus and we are wrong by a little bit, then when we CTF correct, we use a CTF, say, like this, which is not exactly what was actually present in the microscope. You see that it doesn't have much impact at low resolution, but at higher resolutions we're CTF correcting it wrongly, and the net result is another envelope function. So let me describe that envelope function as an envelope due to defocus errors. And it can be shown that these envelopes due to translation, rotation, and magnification errors fall off as e to the minus some constant times spatial frequency squared. The envelope due to defocus errors is just like the envelope due to temporal coherence, and it falls off as e to the minus some constant times spatial frequency to the fourth power. And so here, just like the envelope due to temporal coherence, the envelope due to defocus errors falls off more steeply, because it goes as spatial frequency to the fourth power. And so these are the major envelopes that arise due to errors in the precision of the alignment of particle images.
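The two fall-off forms just mentioned can be written down directly. The constants B and C below are hypothetical placeholders, since their actual values depend on the error magnitudes in a given data set:

```python
import numpy as np

def gaussian_envelope(k, B):
    # translation, rotation, and magnification errors: exp(-B * k^2)
    return np.exp(-B * k ** 2)

def defocus_envelope(k, C):
    # defocus errors (like temporal coherence): exp(-C * k^4)
    return np.exp(-C * k ** 4)

k = np.linspace(0.0, 0.5, 6)        # spatial frequency, e.g. in 1/angstrom
print(gaussian_envelope(k, 100.0))   # falls off as k^2 in the exponent
print(defocus_envelope(k, 1600.0))   # falls off much more steeply at high k
```

Both envelopes start at 1 at the origin, but for the same nominal damping the k-to-the-fourth form collapses much faster at high spatial frequency, which is the point made above about defocus errors.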
And of course the fundamental resolution limitation is radiation damage. And the way we overcome radiation damage is to image more particles. And so the number of particles averaged is another fundamental resolution limitation. Now, the geometry of 3D reconstruction dictates that the number of views that are necessary is equal to pi times the diameter of the particle of interest times the spatial frequency that will be obtained. But to estimate the number of images that will actually be required, that has all the same terms as the number of views, but then we add another term: the signal-to-noise ratio desired at that spatial frequency k, divided by the signal-to-noise ratio present in an image. And this term is squared, because in order to increase the signal-to-noise ratio in an average by a factor of two, say, one has to average four times as many images together. So this term is squared. Now we can further elaborate that equation: the number of images required is pi times the diameter of the particle, times the spatial frequency, or the resolution, that is going to be obtained, times the squared ratio of the signal-to-noise ratio desired at that spatial frequency to the signal-to-noise ratio present in the image. But now we can expand this last term: the signal-to-noise ratio present in the image is that of an ideal image damped by all the envelope functions. So we can write that there's an envelope function due to image quality. And by this I mean the envelope due to image quality would be the product of the envelope due to partial spatial coherence, times the envelope due to partial temporal coherence, times envelopes due to other things, such as beam-induced specimen movement, the MTF of the detector, etc. There's also an envelope here due to alignment precision. So here the envelope due to alignment precision is the product of all these individual envelopes: an envelope due to translational errors, an envelope due to rotational errors, an envelope due to magnification errors, and an envelope due to defocus determination errors.
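As a sketch, the image-count estimate just described, N = pi times D times k times (SNR desired / SNR in the image) squared, with hypothetical numbers (the 300 angstrom diameter and the SNR values are made up purely for illustration):

```python
import math

def n_views(D, k):
    # geometric minimum number of distinct views: pi * D * k
    return math.pi * D * k

def n_images(D, k, snr_desired, snr_image):
    # ratio is squared because the SNR of an average grows as sqrt(N)
    return n_views(D, k) * (snr_desired / snr_image) ** 2

# hypothetical 300 angstrom particle reconstructed to 4 angstrom (k = 0.25 1/angstrom)
print(n_views(300, 0.25))               # about 236 distinct views
print(n_images(300, 0.25, 1.0, 0.05))   # envelopes shrink snr_image -> 400x more images
```

With these made-up numbers, damping the per-image SNR by the envelopes from 1.0 down to 0.05 inflates the requirement from a few hundred views to roughly a hundred thousand images, which is why the envelopes matter so much in practice.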
And I'm running out of space here on the screen, but you could also describe the effects of particle heterogeneity as an envelope, because the effects are worse at higher resolution than at lower resolution. And this entire term then is squared. Now of course these envelopes are related, in that, for instance, if your particles are inhomogeneous, then it's going to be much harder to align them; they'll have bigger alignment errors. If your image quality is degraded, then the alignment errors get bigger. And the effects of all the envelopes multiply to degrade the signal-to-noise ratio that you're recovering in your average. And the only way to overcome that is by recording more and more and more images. Early single particle reconstructions used thousands of images. As the field developed, and importantly, as computer power increased, in the 90s say, reconstructions began to be produced using millions of particle images. One study I'm aware of actually involved 10 to the 7th individual particle images. Fortunately for the field, image quality has skyrocketed in recent years due to better quality microscopes, better detectors, better understanding of beam-induced specimen movement, better grids, etc. And because of this, the most recent single particle reconstructions that are reaching near-atomic resolution and allowing models to be built involve, again, just say 10 to the 4th to 10 to the 6th particles, which with automatic data collection software can be recorded in just a day. And a recent trend in the field is the realization that an important way to get to high resolution is not just to add more and more images, but to effectively sort out only those images that belong together, that represent a coherent class with the same conformation, and high quality images.
And so you get to a higher resolution with a smaller number of good images of a homogeneous population than you do by piling on millions and millions of lower quality images, or of images of particles in subtly different conformational states. But another bottom-line, practical take-home message from this kind of analysis is that we can look at what kinds of alignment errors are tolerable if one still hopes to generate a near-atomic resolution reconstruction. And we can conclude that the standard deviation of translational errors, the precision to which we can detect the center of each particle, has got to be better than approximately one angstrom in order to preserve the high resolution detail that will be needed. The standard deviation of rotational errors, how finely you can estimate the orientation of each particle, needs to be on the order of one degree or better. It depends on the size of the particle, of course, but that's a target precision that one needs to reach. And the standard deviation of defocus error that still allows a near-atomic resolution reconstruction needs to be less than about 200 angstroms. But with high quality images, these targets have now been achieved in many cases, yielding near-atomic resolution reconstructions. So let's assume that in your project you have achieved a reconstruction with near-atomic resolution. If that's the case, then you can move on to build a model of your object. Now, in many cases this begins by fitting a known structure into the reconstruction. And there are various software packages to do this; some popular ones are Situs, Chimera, Modeller, or Sculptor. And as long as your reconstruction has enough detail to show you unambiguously how to fit a high resolution structure in there, for instance a crystal structure, then this is pretty straightforward. Sometimes, however, it's observed that the known structure doesn't fit exactly into the reconstruction.
And in this case, people have used molecular dynamics flexible fitting in order to predict how the known structure differed from the one that was imaged. To illustrate that process, I'll show this movie representing the molecular dynamics flexible fitting that was done in the Klaus Schulten group to fit a model of a ribosome into this reconstruction. So the movie begins with the reconstruction, in which many secondary structural elements are clearly visible. Then the authors rigid-body dock a crystal structure of a ribosome into the density. In many places it fits very well, but in other places the match was not so good. So they run a molecular dynamics simulation, introducing a new force term that favors conformations in which the structure moves into the areas of high density. So with this new force term, it can essentially push the crystal structure into the map in a way that it would be most comfortable in terms of the forces that exist between the atoms. And this is what is meant by molecular dynamics flexible fitting. But of course, some of the most exciting situations will be those in which there is no known structure that can either be docked or molecular dynamics flexible fit into the reconstruction, and in this case you may have the opportunity to build the new structure de novo. And this process typically follows the steps of first predicting the secondary structures that are present, then searching for those secondary structures in the map, followed by finding a pathway that best connects those secondary structures, the path of the backbone. And finally, recognizing bulky side chains to anchor the sequence in a particular place. Now, to illustrate this process of finding secondary structures and building models, let's look at an example that was published by Matt Baker et al. in Structure in 2012.
It's part of his software that he named Pathwalker, and they started with a single particle reconstruction of the rotavirus at just better than four angstrom resolution. And so here you can clearly see some alpha helices, and the pitch is evident. You can see other regions that may be beta strands. And so it's a very high resolution reconstruction. So they begin by taking the whole reconstruction of the asymmetric unit and start by segmenting a particular part of it that they're interested in. Next, they do secondary structure element detection. So they look through that reconstruction and, for instance, using cross correlation, you can look for any tubes of density or something that would match a beta sheet. So they find the secondary structures that are visible inside the reconstruction. And then they can take the known, the predicted, secondary structure elements in the sequence and try to match which ones would match up with the elements that are visible in the reconstruction. One way to do that is to populate the map with a large number of what are called here pseudoatoms, basically just density centers. And then the computer can analyze many, many, many different possible paths of how you could connect those density centers, and it can compare them with the known secondary structural elements in the structure of interest, plus those that are detected actually in the reconstruction itself, to produce an initial model of how the alpha carbon backbone might be connected through those densities. And this leads to a model of the backbone trace and all the secondary structural elements present. And at this point, one needs to find particular bulky side chains, on the alpha helices or elsewhere, that can anchor the primary sequence into the map at that specific location. To illustrate that, let's look again at the results of this study of an icosahedral virus published in 2010 that went to very high resolution.
And in a map such as this, in addition to the alpha carbon backbone trace that can be found through these densities, there are very clear side chain densities emerging from it. For instance, this arginine and even a valine are visible. Here, a tryptophan makes a very nice big side chain density that can be recognized. A phenylalanine here, and a tyrosine. And so when you can see these bulky side chains in the map, it anchors the sequence into the right location. And once an initial model is built, it can be further refined through, for instance, simulated annealing, to better match stereochemical constraints. So that's how you generate a model de novo from a cryoEM reconstruction.