Hello. I want to extend a very warm welcome to all of you to the Fundamentals of Digital Image and Video Processing. My name is Aggelos Katsaggelos. I'm a professor here at Northwestern University in the Department of Electrical Engineering and Computer Science. I'm very glad to meet all of you in this new, modern way of communication, and I'm very excited about teaching this class. It clearly represents a major departure from the traditional way of teaching a class. I've been teaching the material we'll be covering in this class for almost 30 years now, here at Northwestern and around the world. In all cases, however, except maybe a few where there was a live feed of the lecture I was giving, the students and I were all physically in the same room. Such a situation clearly allows for interactivity. We're all familiar with real-time interaction, whereby students can stop me and ask questions, one question brings up another by another student, and based on the questions I start discussing other aspects of the material, and so on and so forth. These are the positive aspects of face-to-face interaction. A potentially negative aspect, however, is that it represents synchronous learning. The underlying assumption is that we are all supposed to learn at the same rate. This is hardly the case. Each of us learns at our own rate, based on our background and so many other factors. This is where asynchronous learning comes in. You have the videos of my lectures available; you can slow them down or speed them up, and you can go back and listen to parts of them multiple times until something is clear. You can skip a segment altogether if you already know the material. You can learn at your own pace, at a time and place suitable to you. Some of you will be in sunny places and some in cold places with snow. Some of us learn better first thing in the morning, while others learn better late at night. And since most probably there are students in all of the world's time zones, the course will be active 24/7. Now, regarding the material, there is probably little need to sell the topic of image and video processing. We are all surrounded by countless applications which make use of images and videos and require knowledge of the material we will be covering in this class. We will be covering, for example, the topic of image and video compression. If compression technology were not successful, you would not be able to watch this video and we would not be able to have this course materialize. You can also go back and watch the promotional video for the course, where we talk about other applications. In any case, I assume you've registered for the course because you either have some background in the topic and you want to enhance it (no pun intended, since we'll be talking about image enhancement in the course), or you know that the class covers important material that will serve you well in the future, or you have a specific problem in mind and you want to find out if the material we'll be covering here will help you solve it, or you are simply curious to find out what image and video processing is all about. My hope and expectation is that, no matter which of these cases you belong to, and of course there may be other cases, upon completion of the course you will all have met your goals. This is not an easy task, and I don't say it lightly.
It's not an easy task because we will be drawing material from different areas or topics within science and engineering. Unlike a class in an engineering or computer science department, where the background of the students is rather uniform, in this case it's safe to assume that your backgrounds vary quite a bit. As we'll see, we'll rely on background from one-dimensional signal processing, which we'll extend to two dimensions, as well as calculus, linear algebra, probability, random variables and random processes, and optimization. However, an important point I want to emphasize here is that the course has been carefully designed to take into account this highly varying background of the students. On the one hand, I'll first try to describe the background information we'll be needing before we use it. On the other hand, I'll try to emphasize the intuition behind each and every processing step we'll be taking. So even if you don't have a background in the topics I just mentioned, I believe you'll leave the course with some basic understanding of the various technologies, even if not all the mathematical details are clear to you, and also with some useful algorithms you can make use of right away. For those of you with strong backgrounds in the topics I mentioned, I still believe you will find the material informative and challenging. Regarding the logistics of the course, all the information you need is on the webpage of the course, under announcements and the various tabs. Regarding the material to be covered in the class, we'll start smoothly this week with an introduction. Then we'll spend two weeks on fundamental signals and the fundamentals of two-dimensional linear and spatially invariant systems, both in the spatial and frequency domains. We'll then spend a week on motion estimation and color, two topics we'll be making use of throughout the course. And then we'll look at important applications, namely enhancement, recovery, compression, and segmentation. We'll finish the course by describing recent advances in the field, which have to do with the concept of sparsity. So, let's start this week, smoothly again, by parsing the title of the course and defining what a digital signal is and what constitutes processing of such a signal. With respect to the definition of a digital signal, we'll describe an end-to-end system with input and output in the analog world, but with the processing performed in the digital world. This is the topic of this segment. We'll discuss the important concepts of sampling and quantization, which allow us to discretize a signal in space and time as well as in amplitude; it is actually sampling, not quantization, that allows us to move back and forth between the continuous and the discrete. So, let's start, and my best wishes to all of you for a successful course. Let us first define what a signal is. A general definition is that it is a function that contains information: either one-dimensional, x(t) for example, where t is the independent variable, or two-dimensional, x(t1, t2), where t1 and t2 are the independent variables. It tells us something useful about the behavior or the nature of some phenomenon of interest. In the physical world, we can say that any quantity that changes with respect to time and/or with respect to space is potentially a signal. So the speed of a moving car represents a signal. The pressure that I apply on the gas pedal of the car is a signal. If I go and measure the height of the buildings in a city, then this becomes a two-dimensional signal.
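To make this concrete, here is a minimal sketch in Python with NumPy of how such signals can be represented once measured; the speed function and the height values are made up purely for illustration.

```python
import numpy as np

# A one-dimensional signal x(t): e.g., the speed of a moving car as a
# function of time t (a made-up function, purely for illustration).
def speed(t):
    return 20.0 + 5.0 * np.sin(0.1 * t)   # km/h

# A two-dimensional signal x(t1, t2): e.g., building height as a
# function of location. Measured on a grid, it is naturally stored as
# a 2-D array, one height per location (toy 3x3 city block, meters).
heights = np.array([[12.0, 30.0, 18.0],
                    [45.0,  8.0, 22.0],
                    [10.0, 27.0, 33.0]])

print(speed(10.0))     # the 1-D signal evaluated at t = 10
print(heights[1, 2])   # the 2-D signal at location (t1, t2) = (1, 2)
```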
It tells us, at each location, what the height is, and so on and so forth. Another example here is the price of oil over the years, where the independent variable is time. The horizontal axis here is time in years and the vertical axis is the price of oil in dollars and cents. It goes up, as we all know and expect. Similarly, here is an EKG signal, an electrocardiogram. Again, here is time on the horizontal axis and the amplitude of the signal on the vertical axis. A signal like this shows the health of somebody's heart. Signals and systems play a very important role in many areas of science and technology, from communications, to aeronautics and astronautics, to circuit design, to biomedical engineering, and so on. Clearly, in this class, images and videos are the signals of interest we'll be dealing with. We have analog and digital signals. To put them in some perspective, let's look at a speech generation, transmission, processing, and perception system like the one depicted in this figure. So speech is generated by the pressure provided by the lungs, which becomes sound at the glottis in the larynx, and is then converted into vowels and consonants by the vocal tract. Speech is an acoustic signal that is transmitted through sound waves utilizing the vibration of the air molecules. So here we have an acoustic signal. Such a signal reaches a microphone here, which is a transducer: it converts one form of energy, acoustical energy, into another form of energy, electrical energy. So here I have an electrical signal. Both of these signals are analog, or continuous, signals. If I take a kind of toy example here, such a signal, let's say, looks like this, where here is the time axis and here x(t) is the amplitude of the signal. This is an analog signal, because the independent variable t, as well as the amplitude x(t), is continuous-valued. Now I want to process such a speech signal by a digital computer, and this is really the objective of any digital signal processing class. Digital computers only understand zeros and ones, that is, digital signals, and therefore I need to convert my analog signal into a digital one, which is the function of this A-to-D converter box. In converting an analog signal to a digital one, one has to take two steps. The first step is to sample the signal, or discretize it in the time domain. This means that I look at the values of the signal at equally spaced points: at time 0, at time capital T, at times 2T, 3T, 4T, minus T, and so on. So now I have the values of the signal at this time instance, this time instance, this time instance, and so on. I have converted my signal into a discrete-time signal through the process which is called sampling. The second step is to discretize the amplitude of the signal. This means that I have only certain values available to represent the amplitude x(t). So let's assume that the values I have available are this one, this one, this one, and this one; let's call them 0, delta, 2 delta, 3 delta. This means that this value here at 0 will be represented by this value, the second value will be represented by this value, similarly, this one will be represented by this value, this one will be represented by 0, and this one will also be represented by this value, okay? So through this process, which is called quantization, I now have a discrete amplitude of the signal, right? So the value at 3T here is the quantized value of x at 3T, right?
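Here is a minimal sketch of these two steps in Python with NumPy; the sinusoid, the sampling period T, and the quantizer step delta are toy choices for illustration, not values from the lecture.

```python
import numpy as np

# Toy "analog" signal x(t); any continuous function would do.
def x(t):
    return np.sin(2 * np.pi * 3 * t)          # a 3 Hz sinusoid

# Step 1 -- sampling: keep only the values at equally spaced instants
# ..., -T, 0, T, 2T, 3T, ...; this gives the discrete-time signal x(nT).
T = 0.01                                       # sampling period (toy choice)
n = np.arange(0, 100)                          # sample indices
samples = x(n * T)

# Step 2 -- quantization: represent each amplitude by the nearest
# available level, here the integer multiples of delta
# (..., -delta, 0, delta, 2*delta, ...).
delta = 0.25                                   # quantizer step (toy choice)
quantized = delta * np.round(samples / delta)
# The index n together with `quantized` is now a digital signal.
```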
Actually, here we will also have the same value, the quantized value of x at 4T, while the actual value here is x(4T), right? After quantization, it becomes the blue value. So clearly, quantization introduces an error. After both sampling and quantization, I end up with a signal that has both its independent variable and its amplitude represented by these grid values, and such a signal is called a digital signal. Okay, so this signal is the input to the computer, which is going to process it, and then at the other end I follow, somehow, the reverse order. In other words, the signal is turned from digital to analog through the converter shown here. Then this analog signal reaches another transducer, also called an actuator, that will turn the current into a vibration and therefore into another acoustic signal, which is going to be perceived by the human auditory system. So here again I have an electrical signal, and here, again, an acoustic signal. The objective of a digital signal processing class is to focus on the central part here, right? To focus on the techniques that allow us to manipulate digital signals to perform certain tasks that depend on the application. So we see here that we have two worlds: the analog world here and here, and the digital world, or the digital domain, here in the middle, which is of our primary interest. A similar situation to the previous one is depicted by this block diagram, which shows the generation, recording, processing, and perception of an image. So electromagnetic energy in the visible part of the spectrum leaves the Sun, is reflected by the object, travels through the air, and reaches a sensor here, a transducer, which is an analog camera that, through photochemistry, converts light energy into chemical changes on the film. Some of you might be too young to remember analog cameras that used film, but there are still some around. In the brighter parts of the image we have a higher concentration of silver grains, and this film, after it's processed, a negative or black-and-white film, represents an analog image. So again, my objective here is to process the image by a digital computer, and therefore I have to convert my analog image to a digital one through this A-to-D block. In this particular case, I can use a densitometer, which measures the concentration of the silver halide grains on the film, or a scanner, to end up with a digital image. Of course, nowadays digital systems are prevalent, which means that these two units here are grouped together and become the digital camera that we are all familiar with. So I should label this one here as an analog camera. The digital image is then processed by the computer, and again, this is the main objective of this class: we'll talk about techniques that allow us to process digital images. We want to take the processed image, and in many cases we want this image to be seen by a human viewer. So the reverse path is followed, whereby the digital image is converted into an analog one, and this signal will then feed a monitor, an analog monitor such as a CRT monitor, that will convert the image into an electromagnetic wave that is going to reach the human visual system, the eye of the observer. Again, nowadays digital monitors, or flat panels, are prevalent, so an LCD here will be fed by the digital signal directly; a video adapter plays the role of the D-to-A converter.
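Coming back for a moment to the point that quantization introduces an error: as a quick numerical check (assuming, as in the sketch above, a uniform quantizer with step delta that rounds to the nearest integer multiple of delta), the error magnitude never exceeds delta/2.

```python
import numpy as np

# Quantization error: the difference between the true amplitude and
# the level chosen to represent it. For a uniform quantizer with step
# delta that rounds to the nearest level, |error| <= delta / 2.
delta = 0.25
amplitudes = np.random.uniform(-1.0, 1.0, size=1000)   # toy amplitudes
quantized = delta * np.round(amplitudes / delta)
print(np.max(np.abs(amplitudes - quantized)))          # <= 0.125
```

This bound is why the choice of quantizer step is a trade-off between the accuracy of the representation and the number of levels, and hence bits, we must spend.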
So in this class we'll say a few things about sampling and quantization of images, which are represented by this A-to-D block, and which could actually be resampling or requantization, as we'll see. We'll say a few things about properties of the visual system. But by and large, our main focus is on this center block, where the input is a digital image and the output is another, processed, digital image. To demonstrate sampling, or better yet resampling, let's consider this digital image. It's a 256 by 256 image, which means it has 256 rows and 256 columns, and it's an eight-bit image, which means that each element here, which is called a pixel (from picture element), takes one of 2 to the 8th, or 256, values, okay? So I'll take this image and I will down-sample it by a factor of eight in both the horizontal and the vertical directions. So out of every eight samples horizontally I'll keep one, and I'll do the same in the vertical direction. In other words, if you consider here an eight by eight block, I will only keep this value and throw away the rest: I keep one and throw away 63, okay? If I do this, then I end up with a 32 by 32 image, the tiny image shown here. If I bring it back to the 256 by 256 dimension for visualization purposes, then this is how the image looks. So you see that, due to this down-sampling, you get these blocking artifacts and these so-called jagged edges, right? The edges are not straight anymore, but jagged. Of course, to up-sample the image I did something very simple: I used the so-called zero-order hold. This means that in each eight by eight block I only had this one value, the other 63 values were missing, and I simply gave the other 63 values this same value, so each eight by eight block has the same value. And this is apparent by looking at this image: this is an eight by eight block, for example, and it has a constant value, a constant intensity. Similarly, the idea of requantization is depicted here. Over here is the eight-bit-per-pixel image we started with, so here we have 2 to the 8th, or 256, different gray values. Right here, I have only 2 to the 4th, equal to 16, values to represent the different intensities. And over here, I go down to 2 to the 2nd, that is, four different values. So clearly you see the so-called contouring effects, which are visible even here, right? These are artificial contours, or boundaries, that have nothing to do with the original image. So you clearly save bits, going from eight bits to four to two bits to represent the intensity, but at the same time you introduce errors, the so-called quantization errors.
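Here is a minimal NumPy sketch of both experiments; the random `img` is only a stand-in for the 256 by 256, eight-bit image shown in the lecture, so that the sketch is self-contained and runnable.

```python
import numpy as np

# Stand-in for the 256x256, 8-bit grayscale image from the lecture.
img = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)

# Down-sampling by 8 in each direction: out of every 8x8 block keep
# one pixel and throw away the other 63, leaving a 32x32 image.
small = img[::8, ::8]

# Zero-order hold back to 256x256: repeat each kept pixel over an 8x8
# block, so every block is constant. This is what produces the
# blocking artifacts and the jagged edges.
zoh = np.repeat(np.repeat(small, 8, axis=0), 8, axis=1)

# Requantization: keep only the top 4 (or 2) bits of each pixel, i.e.
# 16 (or 4) gray levels instead of 256. The coarser the levels, the
# more severe the artificial contours.
img4 = (img >> 4) << 4   # 4 bits per pixel, 16 levels
img2 = (img >> 6) << 6   # 2 bits per pixel, 4 levels
```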