Okay, so now we'll start our theme three, pre-processing. By now you should have a pretty good idea about downloading data, playing with it, and doing some basic transformations, and we will now move into something a little more complicated: pre-processing. Pre-processing is very important in itself, but it is also important for people who do not necessarily do pre-processing themselves. For example, somebody who analyzes data and sees certain types of problems in it may want to know how the data were obtained and what the pre-processing scheme was, so that they can trace back the potential problems associated with it. To understand a little more about pre-processing, we will design our presentation around pre-processing pipelines: we'll describe what they are, give the definition, and then go into more detail about the individual steps of pre-processing. Pre-processing pipelines can be extremely complex, but here we'll focus on the simplest possible pipeline and try to understand the basic steps. Every pipeline may combine other operations, including inhomogeneity correction, skull stripping, registration, and intensity normalization, and these steps can be applied in different sequences. The first lecture covers the definition and some details of the pre-processing pipeline. We will start by giving the basic definitions, the basic components, and the pipeline tools. So what exactly is an imaging pre-processing pipeline? It is a collection of transformations from the data as it typically comes from the scanner to a particular form that is usable for analytic purposes.
There are many, many different types of pipelines, and they can vary by person, by lab, and by location, so it's really important to understand the structure of a pipeline and whether pipelines are comparable across these variables. Just to get a basic understanding of where we are and what we would like to cover here, we define image pre-processing in several steps, in this case four distinct steps. The first one is inhomogeneity correction; we'll also discuss spatial interpolation, skull stripping, and spatial registration. We'll not go over spatial interpolation in much depth, because it will be folded into registration, but we'll cover the other three components in some detail. So a pipeline is the choice of a particular sequence of image pre-processing steps. One of the reasons we started this course is that we think the pipeline should be scriptable and reproducible. At the end of the day, there should be the raw data and also a pipeline available with the raw data, together with the transformed data, in such a way that if somebody else later wants to make the same transformations, they should be able to apply the exact same code to the exact same data and obtain exactly the same results. This seems like a basic requirement, but it's something that's not done that often, and I think it should be done more often in imaging research. We actually believe that it is very important, in addition to what is currently published in papers, to also publish the pipeline. The pipeline scripts can be long, but there is no problem with just putting the scripts in a file, posting them on a website, and linking to that website. These are some basic requirements that I think should be embedded more and more in papers and research, so that not only is it easier for others to reproduce the results.
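The "scriptable pipeline" idea can be sketched very simply in R: represent the pipeline as an ordered list of functions and apply them in sequence. The step functions below are stubs standing in for real operations (for example, DICOM-to-NIfTI conversion via oro.dicom, an N3-style bias correction, a brain extraction tool); the function names are illustrative, not a real API.

```r
# Placeholder steps: each would be a real pre-processing operation in practice.
inhomogeneity_correct <- function(img) img   # stub: bias-field correction
skull_strip           <- function(img) img   # stub: brain extraction
register_to_template  <- function(img) img   # stub: spatial registration

# The pipeline is just an ordered list of steps, so the whole procedure
# can be saved in a script, shared alongside the raw data, and re-run verbatim.
pipeline <- list(inhomogeneity_correct, skull_strip, register_to_template)

# Apply each step in turn to the image.
run_pipeline <- function(img, steps) Reduce(function(x, f) f(x), steps, init = img)

img <- matrix(1:9, nrow = 3)                 # stand-in for an image volume
out <- run_pipeline(img, pipeline)
```

Because the sequence of steps is explicit data, two labs running the same list on the same input must get the same output, which is exactly the reproducibility requirement described above.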
It also makes it easier for the laboratory itself to reproduce its own results, and it reduces the amount of work when responsibility passes from one person to another for doing the analytics and building the various pipelines. We think that current publication standards could be improved, and one basic way to improve them, as we said, is to make the actual pipelines publicly available on the same platform. This goes back to our original ideas about neurohacking: we want to do analysis with pipelines on a single platform that is as simple and as easy as possible to understand and reproduce. So what exactly is a basic pre-processing pipeline? It starts with the files from the scanner, the DICOM files, which are then transformed into NIfTI files. We have gone through that quite extensively, and we'll talk a little more about how to do it in this lecture. The next step is N3 correction, or more generally inhomogeneity correction; we call it N3 here because N3 is one of the popular methods, but there are many other ways of doing inhomogeneity correction. Then there is skull stripping, which is removing the skull and leaving only the brain tissue. And there are also different problems related to coregistration, registration to a template, and so on. That last part is a little more complicated; we'll touch on it in this theme, but we'll develop it a lot more and go into detail in theme four, in terms of registration, coregistration, and the various types of approaches that can be used. In terms of the first part, an essential part of the pre-processing pipeline, there is the question of how to go from DICOM to NIfTI files. We have already covered this in previous lectures, but there are two basic ways we currently do it. One of them is to use dcm2nii in the MRIcron software.
The other way, which we actually prefer, is to use the oro.dicom package in R. Why? Because, as I said, we prefer to stay on the same platform; we prefer to be in R, and it makes things easier not to go outside it. However, should you want to use something outside of R, dcm2nii is one of the standards for this type of conversion. Inhomogeneity correction: we would like to get a better idea of what inhomogeneity correction is. As I said, there is N3 correction, there is N4 correction, and so on, and many software packages implement their own inhomogeneity correction. So there are many different types of inhomogeneity correction, but what exactly is inhomogeneity, and what exactly is the correction? Let's build a little intuition about what that means. Here on the left side you have an axial image of the brain, an image that was not inhomogeneity corrected. If we look at it a little more carefully, we see that in the lower left part of the image, the image is actually lighter: the intensities are higher, the numbers are larger in the left bottom part of the brain. However, as you move from the left bottom part of the image to the upper right, the intensities become darker; the shades of gray become darker. This is something we don't want to see in an image, and we would like to correct it. It has nothing to do with the biology of the tissues or the reality of the brain, so we would like to make sure that intensities are actually comparable. A little more precisely: if you look at the gray matter in the lower part of the brain, the gray matter close to the skull is actually lighter than the white matter in the top part of the brain, and this can create huge problems in cases where we would like to use intensities to do segmentation, for example to differentiate between white matter and gray matter.
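To make the idea concrete, here is a small base-R simulation (not from the lecture) of the kind of smooth, multiplicative bias field described above: a toy "tissue" image with constant intensity is multiplied by a slowly varying field, and the same tissue ends up with different intensities in different corners of the image. All numbers here are made up for illustration.

```r
n <- 64
true_img <- matrix(100, n, n)         # uniform intensity for one tissue class
true_img[20:40, 20:40] <- 150         # a patch of a second, brighter tissue class

# A slowly varying multiplicative field: bright in one corner, dark in the
# opposite corner, mimicking the gradient visible in the example axial slice.
bias <- outer(1:n, 1:n, function(i, j) 1.3 - 0.6 * (i + j) / (2 * n))

observed <- true_img * bias           # what the "scanner" gives us

# The same tissue class now has location-dependent intensity:
mean(observed[1:10, 1:10])            # one corner: brighter
mean(observed[55:64, 55:64])          # opposite corner: darker
```

This is exactly the problem for intensity-based segmentation: in `observed`, one tissue class spans a wide range of values depending on where it sits, even though in `true_img` it is a single number.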
There is just no way to differentiate the gray matter in these two parts of the brain, simply because its intensities overlap with other tissues' intensities, for example white matter. On the right side of this image you see the bias field. What does that mean? It is a representation of what one sees in the left part of the image. More precisely, it shows that in the lower left corner of the image the intensities are higher, represented by the whiter areas in the left bottom corner of the right image. As one moves up the image, the bias field gets darker, meaning that there is a contrast between the top and the bottom parts of the image. These are not the only types of inhomogeneities that can be seen in an image; there are many, many different types. In general, though, inhomogeneity is slowly changing underlying noise that probably comes from the scanner, likely because the magnetic field originally used to obtain the image was inhomogeneous, which led to the inhomogeneity in the image. A standard way of addressing this problem is to do some sort of smoothing, and here are some ways of looking more specifically into how to correct it. A simple check is to run an aggressive smoother over the image. You have already seen how to use smoothers in this short course, so you can simply increase the kernel size to a level you are comfortable with, and then look over the images to see whether there is indeed inhomogeneity. The definition is a little complex, but once you go through it carefully you will find it quite useful. The take-away definition of homogeneity is that the distribution of intensities within a tissue class should not depend on the spatial location of that tissue.
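The "aggressive smoother" idea above can be sketched in base R: heavily smooth the image so that local detail is averaged away and mostly the slowly varying field remains, then divide it out. This is only an intuition-building sketch on a simulated image; real pipelines use dedicated methods such as N3 or N4, which are much more careful about not smoothing away genuine tissue contrast.

```r
# Crude box smoother: each voxel becomes the mean of a (2k+1)x(2k+1) window.
box_smooth <- function(img, k = 15) {
  n <- nrow(img); m <- ncol(img)
  out <- matrix(NA_real_, n, m)
  for (i in 1:n) for (j in 1:m) {
    rows <- max(1, i - k):min(n, i + k)
    cols <- max(1, j - k):min(m, j + k)
    out[i, j] <- mean(img[rows, cols])   # local average = rough field estimate
  }
  out
}

# Simulated image with a multiplicative, slowly varying bias (made up numbers).
n <- 48
img      <- matrix(100, n, n)
bias     <- outer(1:n, 1:n, function(i, j) 1.3 - 0.6 * (i + j) / (2 * n))
observed <- img * bias

# Estimate the field with an aggressive smoother, then divide it out,
# rescaling so overall brightness is preserved.
field_hat <- box_smooth(observed, k = 15)
corrected <- observed / field_hat * mean(field_hat)

# Opposite corners should now have much closer mean intensities:
abs(mean(observed [1:8, 1:8]) - mean(observed [41:48, 41:48]))
abs(mean(corrected[1:8, 1:8]) - mean(corrected[41:48, 41:48]))
```

After correction, the corner-to-corner intensity gap shrinks substantially, which is exactly the homogeneity criterion stated above: the intensity distribution of a tissue class should not depend on where in the image that tissue sits.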
More specifically, to provide some intuition: you don't want the white matter intensity in the superior part of the brain to be different from the white matter intensity in the inferior part of the brain, and the same goes for gray matter in the left versus the right part of the brain. We know that, even though we call it white or gray matter, white and gray matter are not represented by a single number; they are represented by distributions of numbers, and we would like the distributions in the various parts of the brain to match each other. The other component that we would like to talk about in this lecture is skull stripping. Skull stripping is exactly what it says: stripping the skull from the image. We simply try to get rid of any non-brain tissue and be left only with the brain. Here is an example of what skull stripping looks like. In the left part of the image, image A, you see a brain with the skull on, and image B is the same image with the skull removed. So why and when would we be interested in skull stripping? There are many different reasons for skull stripping, though sometimes it is not necessary; it depends on the problem one works on. One case where skull stripping is crucial is, for example, when somebody wants to know the brain size of a particular person. Another case may be when we register to a template that doesn't have the skull on. And there are certain procedures that work much better on an image that was skull stripped; some segmentation approaches, for example, actually work better on a skull-stripped image.
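Mechanically, skull stripping boils down to two parts: estimating a binary brain mask (the hard part, done by tools such as FSL's BET) and applying that mask to the image (just elementwise multiplication). The sketch below uses a made-up mask on a toy array purely to show the second part and why the mask also gives a simple brain-size measure.

```r
# Toy "image": nonzero intensities everywhere, including "skull" voxels.
img <- matrix(10, 8, 8)

# Hypothetical binary brain mask (in practice estimated by a tool like BET):
# 1 = brain voxel, 0 = non-brain voxel.
brain_mask <- matrix(0L, 8, 8)
brain_mask[3:6, 3:6] <- 1L

# Applying the mask zeroes out all non-brain voxels, like image B in the slide.
stripped <- img * brain_mask

# The mask also directly gives a brain-size measure in voxels
# (multiply by the voxel volume from the image header to get a real volume).
brain_volume_voxels <- sum(brain_mask)
```

This also makes clear why skull stripping helps downstream steps: after masking, every nonzero voxel is brain tissue, so intensity histograms and segmentations are no longer contaminated by skull and scalp values.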