So before we go into graph analytics, the first question we want to ask is, what is a graph? Let's see. This is a plot of the sales of some items against time and it gives you a nice visual representation of the data, but is it a graph? While you think about it, let's try another one. This is another common visual representation of data. If you go to Google and type pie graph, you will see many results that look like this. But is this a graph? In our research, this is not a graph. We call it a chart. Yes, you guessed it right this time. This too is a chart, and not a graph. So why do people call them graphs, then? In a sense, they're abbreviating. A chart depicts what's called a graph of a function. Let me explain. Look at the two column table on the right. The first column has information about product category with values like furniture, office supplies, and technology. The second column represents another set of values called total containing 1724, 4610, and 2065. Now we can define our mapping option. Which means a correspondence from Product Category to Total. So we map Furniture to the value 1,724, Office Supplies to the value 4,610, and so forth. If we visually portray this, we can represent it like a bar chart or a pie chart. This is why both the previous diagram and this diagram are charts and not graphs. Graph analytics has its basis in a brand of mathematics called graph theory. What's more interesting, however, is that graph theory was born out of a very practical problem. The problem started in a very old city in Prussia which is now in Russia called Konigsberg. Even if we look at in in Google Maps it would sort of look like this. One of the interesting features of Konigsberg is that it has two islands and these islands are connected by seven bridges. Back in 1736, the city wanted to create a walkway, and the criteria was, this walkway would traverse all seven bridges such that somebody wanting to go from one part of the city to another can cross a bridge only once. Now this is sort of an urban planning problem right? Well in fact, it required a mathematician to solve the problem. The mathematician named Euler shown on the left looked at this and figured out that you really cannot create such a walk. Why? He said, it cannot be done, because there are an odd number of bridges, and proved it mathematically. From this problem, the whole field called graph theory emerged. On the right, you can see Edsger Dijkstra, a well-known computer scientist who has developed graph algorithms, one of which we will study later in the course. His work has had far further impact on both the theoretical computer science and practical applications. What's the difference between the mathematics view and the computer science view? Let's try to define the mathematical view of graphs. We start with a set of vertices. Here we have a set of six nodes or vertices. Now, I'll add another set. I'll call this the set of edges. In our diagram there are only four edges, but there's something special about these edges. Each edge is just not an ordinary atomic object. An edge like e1, actually, is one term from v and then a second term from v. So this edge, e1, goes from v1 to v5. Pictorially, we can draw and arrow from v1 to v5. Now if I had said v5 to v1, the arrow would be reversed. So what do we have so far? We have a set of vertices and a set of edges. That is the mathematical definition of a graph. What about the computer scientist's definition? Of course a computer scientist needs to adhere to the mathematical definition, but they want to represent and manipulate the same information. So they need a data structure. In other words, something they can operate on. So what kind of operations would they do? Well, with a graph, they can say add edge, or add a new vertex, find the nearest neighbors of the vertex where the term neighbor refers to the nodes connected to the vertex. We said a computer scientist needs to represent graphs using data structures. Here is one that you should recognize. It's a matrix, called the adjacency matrix of a graph. Both the rows and columns of this matrix represent notes. If I go from v1 one along the row, I see that there's one at v3, which means there is an edge from v1 to v3. Similarly there is another from v1 to v5. Let's look at some operation. Let's say I first want to add an edge from v3 to v2. What should I do? I'll start from the row v3, go up to the column v2, and add a 1 in that set. So I've added an edge, which is an operation on the matrix. Another operation. I want to get the neighbors a v3. This will be a little more complicated. I look at the row v3 and paint that row and I'll also look at the column v3 and paint that column. If we go down the row of v3, we'll get v2, so v2 is neighbor. And if we go down the column of v3, we'll get v1 and v6. So v1 and v6 have alternators. What's the difference? Since the matrix has the From along the rows and the Tos along the columns, following the row v3 gives us the edges outgoing from v3. And following the column v3 gives us the edges coming into v3. Is this the only representation of the graph? No it's not. Here is another one, in this representation there are two kinds of data objects. No data, which are the blue rectangles and edge data which are the triangles. To get a sense of this representation, look at the note v1 and follow the top yellow line. It'll reach v3. So this link structure directly captures the graph diagram we created before. Now, start from v1 again and follow the blue link and it will reach e1 and then v3. So that tells you that e1 is an edge object between v1 and v3. Next, let's stand on the edge triangle e1, and follow the red dashed line to get to e2, which is the next edge from the same starting node. Is this possibly too complex? Yes it is. However, as we'll see down the road, many graph database systems are using this kind of data structure internally, so that the database operations become more efficient.