0:23

So we'll be covering several concepts that are considered to be the basics in computer science.

Typically these are concepts that computer science students learn in the first couple of years of a computer science undergraduate degree program.

So for those of you who are already familiar with this material, this could be a refresher, or you could just skip it if you already know this stuff.

In any case, you can always use this orientation lecture as a reference to keep coming back to if you encounter a term during the regular lectures that doesn't make sense to you, or that you'd like to be reminded about.

So we'll discuss a few topics.

We'll discuss some basic data structures, and what processes are.

We'll discuss a little bit about computer architecture, and then a little bit of math with big O notation, basic probability, and a few other basic topics.

So let's get started here.

So basic data structures: we'll discuss two different data structures.

The first one is a queue.

A queue is essentially a first-in first-out data structure.

The elements of the queue could be of any data type; in this example here the elements are integers, so in this example right now the queue has five elements: 3, 5, 8, 0, 7.

This is an ordered list of elements.

When you remove an element from the queue, you remove it from the head of the queue.

In this case, the head is the element 3.

When you insert a new element, you insert it at the tail of the queue, which means that if I insert a new element, say 1, it's going to get inserted right after 7 in this queue.

Â 1:56

So removal always comes from the head, while insertion always happens at the tail.

So for instance, given this particular queue consisting of five elements, if you dequeue or remove an item, then that item would be 3.

At that point in time, the queue will only consist of 5, 8, 0, and 7, with the head pointing to 5.

The next item that you dequeue or remove will be 5.

After that, the queue will consist of only 8, 0, and 7.

Then the next item that you dequeue will be 8, and so on, and so forth.

Dequeues, or removals, as well as insertions can occur concurrently.

So you can remove an element, but at the same time, also insert an element at the tail.

So that's a queue, which is a first-in first-out data structure.
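This first-in first-out behavior can be sketched in Python using `collections.deque`, with the same five elements from the example (the variable names here are just for illustration):

```python
from collections import deque

# The queue from the example: head is 3, tail is 7.
q = deque([3, 5, 8, 0, 7])

# Insertion (enqueue) always happens at the tail.
q.append(1)          # queue is now 3, 5, 8, 0, 7, 1

# Removal (dequeue) always happens at the head.
head = q.popleft()   # returns 3; queue is now 5, 8, 0, 7, 1
```

A plain Python list would also work, but removing from the front of a list is slow; `deque` supports fast insertion and removal at both ends.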

A different data structure is the stack, which is a first-in last-out data structure.

Imagine a stack of dishes that you place on your table: the dish that you put on the top, the one that you added last, will be the first one that you're going to remove, all right, and that's essentially a stack.

So a remove, also known as a pop, happens from the top, and an insert, or a push, also happens at the top.

So, in this case, if you push a new element onto this particular stack, say 9, then 9 will go above 3.

After that, if you pop or remove an element, that's going to be the last element that you added, which is 9 in this case.

The next pop after that will get 3, because 9 was just removed, and 3 is now the top of the stack.

After you remove 3, the next pop or removal will get you 5, and so on, and so forth.

After that, you get 8, and then 0, and 7.

So once again, a removal always returns you the last item that was added, and an insertion or push adds an item right at the top of the stack.

So these two data structures, the queue and the stack, are used very widely in computer science, and in fact, we'll be using the notion of a stack right away when we discuss processes soon in this particular orientation lecture.
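The stack from the example can likewise be sketched in Python, where a plain list serves as a stack (the bottom-to-top ordering below is an assumption matching the pops described above):

```python
# The stack from the example, listed bottom to top: 7, 0, 8, 5, 3,
# so 3 is currently on top.
stack = [7, 0, 8, 5, 3]

stack.append(9)    # push: 9 goes above 3
top = stack.pop()  # pop returns 9, the last element added
nxt = stack.pop()  # pop returns 3, which became the new top
```

Python's `list.append` and `list.pop` both operate on the end of the list, which is exactly the top-of-stack discipline described above.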

What is a process?

A process is essentially a program in action. For instance, you might write a program in a language like C, C++, Java, Python, or Ruby, or [INAUDIBLE], whatever it is; it doesn't matter what the language is. When you execute that program, say from the command line or from a GUI, that executing program is called a process.

For instance, here I have a C-like program, which has a function main, which in turn calls a method f1, which in turn calls a method f2.

This is shown pictorially here: main calling f1, which calls f2.

When you instantiate this program and execute it, it becomes a process, which here in this case is labeled as P1.

So the code part shown here is only the program itself; the process consists of several other components.

The code typically doesn't change, but the other components might change.

What are these other components?

Let's look at them one by one.

First of all there is a program counter, which is an index into the code, which says where the program is right now: what is the exact line number within the code where the program is executing right now.

Next you have a stack, which is used by the functions to pass arguments and return values.

For instance, when the main function calls f1, any arguments that it needs to pass to f1 are passed via the stack, by pushing an element on top of the stack.

f1 can then pop that element off and obtain the arguments that are passed to it.

Similarly, when f1 calls f2, it pushes on top of the stack the arguments that it has for f2, and f2 can retrieve these arguments by popping the top of the stack.

When f2 is done, it can return its return values to f1, also by pushing the return values on top of the stack, so that when f1 resumes, it can pop the top of the stack to obtain the return values from f2.

Similarly, f1 returns its return values to main by pushing them on top of the stack.
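This argument-and-return-value discipline can be simulated with an explicit stack. The sketch below is only a toy illustration, not how a real compiler lays out stack frames, and the particular arithmetic inside f1 and f2 is made up:

```python
# Toy simulation of main -> f1 -> f2 passing arguments and
# return values through one shared, explicit stack.
stack = []

def f2():
    x = stack.pop()          # f2 pops the argument f1 pushed
    stack.append(x * 2)      # ...and pushes its return value

def f1():
    a = stack.pop()          # f1 pops the argument main pushed
    stack.append(a + 1)      # push the argument for f2
    f2()
    r = stack.pop()          # pop f2's return value
    stack.append(r + a)      # push f1's return value for main

def main():
    stack.append(10)         # push the argument for f1
    f1()
    return stack.pop()       # pop f1's return value

result = main()              # (10 + 1) * 2 + 10 = 32
```

When main returns, the stack is empty again, mirroring how each call's portion of the real stack is popped once the call completes.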

Â 5:42

Now the functions themselves might declare variables; some of these might be local variables, and local variables are also typically maintained on top of the stack.

I'm not showing them here, and this is the reason why, when a function or method has completed its execution, any local variables that are declared inside it are then gone, because that part of the stack has been popped.

In addition, functions might also create new memory, for instance by calling malloc or new.

In this case, I'm showing one such variable x, and such variables are stored in a separate data structure associated with the process, known as the heap.

Also associated with the process are registers, which typically refer to the recently accessed variables in the process itself.

So, for instance, this variable x is present in the heap.

Such new variables need to be explicitly freed in most cases; however, some programming languages have automatic garbage collection.

Â 6:36

Now the reason we are discussing the process here is, really the main reason is that, when we talk of distributed systems, we're going to talk about multiple processes communicating with each other, and we'll mostly focus on process-to-process communication, and then later on we will also talk about how procedure calls or function calls can cross process boundaries.

In addition, the other reason we're discussing the process here is that when we take a snapshot of a process, when I say a process takes its own snapshot, this essentially includes everything that I'm showing here on the slide, and maybe a few more elements.

It includes the code, the program counter, the current status of the stack, the current status of the heap, and the registers, and then a few other elements.

Essentially this information is enough to resume this process from here onwards.

Speaking of computer architecture, let's discuss a simplified version of computer architecture.

Computer architectures today can be very complicated, but typically when we think of how a process executes on a computer architecture, we can think of this simplified view.

So there's the CPU, which executes the actual instructions that are present in your code.

There are also a bunch of registers that are co-located with the CPU.

These are small pieces of memory that can be accessed very quickly by the CPU.

Typically there's only a small number of registers, typically not more than a few tens of registers.

There is also a cache, which is a slightly larger memory than the collection of registers, and the larger a piece of memory is, the slower it is to access it.

So accessing the cache is slower than accessing the registers, but accessing the cache is still pretty fast.

Â 8:04

Then beyond the cache there is the main memory, or the Random Access Memory, or the RAM, which is even larger than the cache; therefore, it's slower than the cache, but still it's pretty fast.

And then, finally, there is the hard disk, or if you like, a solid state drive, a flash drive, or whatever, which has a lot more storage than the main memory.

But of course, it's much slower.

So as you go up from disk to memory, to cache, to registers, the speed increases, while the total amount of storage space that you have decreases.

So when you write a program in a language such as C++, or Java, or C, and you compile it, it gets compiled to low-level machine instructions.

These machine instructions might be specific to the architecture of the machine on which you're running, or they might be for a virtual machine, like JVM code.

In any case, these low-level machine instructions are the executable version of your program, and this is stored in the file system on your disk.

When your program starts executing, when it becomes a process, the CPU loads instructions in batches, or, in a more simplified way, it loads instructions one by one.

Â 12:17

Once again, when we say order N squared or order N, notice that we do not include the constant c in the definition itself.

This is because the constant is irrelevant; as long as there is some fixed constant, that is good enough for us as far as Big O notation is concerned.

So we do not state the constants in Big O notation.

So let's see a couple of examples.

So first, I give you an unsorted list of, say, integers, and I ask you to search for a specific element in the list.

This essentially means that you have to iterate through the list one item at a time, and search for the element by trying to match it with each element already present in the list.

What is the worst case performance?

That's what we need for the Big O notation, right?

So the worst case performance happens when either the element is not there at all in the list, so you return a false, or the element is the very last one in the list, the very last one that you matched against.

Therefore, the worst case performance involves N operations, and the number of operations involved is less than c times N, where c = 2, because the number of operations is N, where N is the size of the list.

Okay, so consider the time as well as the number of operations, assuming each operation takes one unit of time.

Both the time taken for this algorithm, which iterates through the list, as well as the number of operations, are order N.

That's why we say order N over here.
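The worst-case argument can be sketched in Python by counting the comparisons the search actually performs (the counter and the sample list are just for illustration):

```python
def linear_search(items, target):
    """Scan the list one element at a time, counting comparisons.
    Worst case: N comparisons, so the search is O(N)."""
    ops = 0
    for item in items:
        ops += 1              # one match attempt per element
        if item == target:
            return True, ops
    return False, ops

# Target is the very last element: takes N = 5 comparisons.
found, ops_last = linear_search([3, 5, 8, 0, 7], 7)

# Target is absent: also takes N = 5 comparisons before returning false.
found2, ops_absent = linear_search([3, 5, 8, 0, 7], 4)
```

Both worst cases touch every one of the N elements, which is exactly why the bound is order N rather than some smaller function of N.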

Â 15:52

Okay, so now probability: any event has a probability of happening.

So let me give you an example.

Suppose you wake up at a random hour of the day. Say you just wake up at some random hour of the day, say you're jet-lagged or something, and I ask you: what is the probability of the event that the time is between 10 AM and 11 AM?

Let's try to calculate this.

Well, there are 24 hours in a day, so you have a set that consists of 24 elements, one for each hour: 12 AM, 1 AM, 2 AM, all the way through 10 AM, 11 AM, up to 11 PM.

And when I say an element like 1 AM, I mean the hour between 1 AM and 1:59 AM.

Similarly, 10 AM means between 10 AM and 10:59 AM.

Out of this set of 24 elements, you pick one at random.

And the probability that you're going to pick 10 AM is essentially 1 divided by 24, because you're picking one element at random, and there are 24 elements in the set.

The probability that you pick that one special element, 10 AM, is 1 over 24, because you have to be lucky to pick that 10 AM, okay?

So if you wake up at a random hour of the day, the probability that the time is between 10 AM and 11 AM is 1 over 24.

Â 17:07

Now there are multiple events, and you might need to calculate the probability of conjunctions or disjunctions of these events.

I'll describe what these are soon.

So say E1 is one event, and E2 is another event, and E1 and E2 are independent of each other.

This means that E1 does not influence E2, and E2 does not influence E1.

Then the probability that E1 and E2 both happen is essentially the multiplication of these probabilities.

So you take the probability of E1, multiply it by the probability of E2, and that gives you the probability that both E1 and E2 are true.

Let's discuss an example of this.

Suppose you have three shirts, colored blue, green, and red.

And you also wake up at a random hour and blindly pick a shirt: you put your hand into your closet and you blindly pick out a shirt.

What is the probability that you woke up between 10 AM and 11 AM, and that you picked the green shirt to wear?

Well, the probability of the first event, that you woke up between 10 and 11, is essentially 1 over 24, like we calculated on the previous slide.

The probability that you picked the green shirt is essentially one out of three, because there are three colors of shirts, you have three shirts, and you want to pick the green one.

You have to be lucky to pick the green one, so that's 1 over 3.

And so the probability that both these events are true, that you woke up between 10 and 11 and that you wore the green shirt, is the multiplication of the probabilities of those events.

So that's 1 over 24 times 1 over 3, and that gives you 1 over 72.

One of the things that you have to be careful about here is that if E1 and E2 are not independent, meaning that they influence each other in some way, then you cannot multiply the probabilities with each other, okay?

So for instance, if the shirts that you have in your closet change from one hour to another, because say someone in your house changes them around, then you can't necessarily multiply these probabilities as they are here.

A different thing that you may need to do with two event probabilities is to OR them.

So if I give you two events E1 and E2, and I ask you the probability of either E1 or E2 happening, then you can say that the probability of E1 or E2 is the probability of E1 plus the probability of E2 minus the probability of E1 and E2.

Okay, so this last term is the intersection probability here.

If you do not know the probability of E1 and E2, this last term over here, then you can write the inequality: the probability of E1 or E2 is less than or equal to the sum of the constituent probabilities.
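Both rules can be checked with exact arithmetic using Python's `fractions` module, reusing the wake-up-hour and green-shirt numbers from above:

```python
from fractions import Fraction

p_hour = Fraction(1, 24)   # waking between 10 and 11 AM
p_green = Fraction(1, 3)   # picking the green shirt out of three

# Conjunction of independent events: multiply the probabilities.
p_both = p_hour * p_green                  # 1/72

# Disjunction: P(E1 or E2) = P(E1) + P(E2) - P(E1 and E2).
p_either = p_hour + p_green - p_both

# Without the intersection term, the sum is only an upper bound.
assert p_either <= p_hour + p_green
```

Using `Fraction` instead of floats keeps the probabilities exact, so 1/24 times 1/3 really is 1/72 rather than a rounded decimal.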

Â 20:39

Now, the IP address might refer to an actual web server, a unique web server that actually hosts that content, so that now your client browser can send a TCP and HTTP request to that server.

Or it might refer to an indirect server, or even a group of servers, such as a content distribution network, such as the one from Akamai.

The funny thing about naming systems is that they can be chained.

So the ID or the address obtained from one naming system can be given as a name to another naming system, and that can return a different address, and so on and so forth.

And you can keep doing this until you reach your final object itself.

Graphs: so when we say graphs in computer science, we don't necessarily mean plots which have X and Y axes.

A graph is essentially a network.

So here I'm showing you a graph of some cities: Amsterdam, Boston, Chicago, Delhi, and Edmonton, and the travel times between those cities by air, rounded up to the nearest hour.

Okay, so travelling from Amsterdam to Boston takes about six hours.

Travelling from Chicago to Delhi takes about 14 hours.

Say these are the flights that are available on some particular

Â 21:58

These all have names in the world of graphs.

So each city would be a node or a vertex, each connection between a pair of nodes is called an edge, and the value along an edge is called the edge weight.

Also, when I say that A is adjacent to B, this means that A has an edge that connects it to B directly.

Typically an edge goes between two nodes or two vertices; an edge does not cross three nodes.

So here, for instance, I have one, two, three, four, five vertices.

And how many edges do I have?

I have an A-E edge, an A-B edge, a B-E edge, a B-C edge, and a C-D edge.

So I also have five different edges, each with its own edge weight.
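A common way to represent such a weighted graph in code is an adjacency list. In the sketch below, only the Amsterdam-Boston (6) and Chicago-Delhi (14) weights come from the lecture; the remaining weights are made-up placeholders, since the slide isn't reproduced here:

```python
# The five-city graph as an adjacency list with edge weights in
# flight hours. Edges: A-E, A-B, B-E, B-C, C-D, as in the lecture.
graph = {
    "Amsterdam": {"Boston": 6, "Edmonton": 9},
    "Boston":    {"Amsterdam": 6, "Edmonton": 5, "Chicago": 3},
    "Chicago":   {"Boston": 3, "Delhi": 14},
    "Delhi":     {"Chicago": 14},
    "Edmonton":  {"Amsterdam": 9, "Boston": 5},
}

# Amsterdam is adjacent to Boston: a direct edge connects them.
assert "Boston" in graph["Amsterdam"]

# Each undirected edge is stored twice (once per endpoint),
# so halve the total neighbor count to get the number of edges.
num_edges = sum(len(nbrs) for nbrs in graph.values()) // 2   # 5 edges
```

Storing each undirected edge under both endpoints makes the adjacency check a simple dictionary lookup, at the cost of recording every weight twice.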
