The most primitive path analytics question one can ask is to find the best path from node one to node two. What does best mean? Well, that depends on the actual needs of the application. But in general, to specify the best path, we need to define when one path is better than another. This is usually expressed in terms of an optimization problem. Where we need to minimize and maximize our subfunction subject to constraints. What kind of constraints? Two common criteria for graphs are, inclusion and exclusion conditions. Inclusion criteria may specify which nodes we have to include in the path. And exclusion criteria specifies which nodes and edges should be excluded from the path. In addition, one specify a preference criteria that act as a softer or less strict constraint. For example, we would like to minimize highways on my trip. Or avoid condition. These are soft because although the users would like to have them enforced completely, it is all right if they are not fully enforced. A good practical use case occurs when I'm trying to drive to work in the morning. Ideally, I would like to take a path having the shortest distance from my home. For example, node I, to my workplace, which is node B in the graph. But I have to drop off my son at school. So my path must include his school, the J here. However, I would like to avoid roads around the new construction that's happening about five miles from my workplace, like the node E in the graph. Because there is usually a huge traffic delay around that construction site. I could also add a preference criteria like I don't prefer to drive on the highway. But for this discussion, we'll skip the preference idea. Too complicated? Okay, let's start with a simpler problem. To start with, let's drop the constraints and look at the problem with just the optimization part of the problem, having a single variable. In our case, that variable is the sum of edge weights from the source, that is the starting node I, to the target, which is B. This problem is handled by all mapping and road direction software. Here is a Google map screenshot, in which I am trying to go from my home in North San Diego, to a collision center in a nearby city. Google Maps shows three different routes, and highlights one as a preferred solution. You should readily see that the real shortest path of 26.6 miles will take the longest time at the time of the day when I was looking at the map. So this means the weights here are not raw distances but estimated travel time. You should also notice the blue, red, and orange segments in the preferred path are presented by Google. The orange and red street segments clearly represent congestion areas and therefore have higher weight than the blue segments. Therefore, the weights of the street segments are not really static but change with many other factors, like weather or the time of the day. This is why the least weight path problem is an important problem to solve for the benefit of the commuter. A widely applied algorithm that is applied for shortest path problems is called Dijkstra's algorithm. Originally, Dijkstra considered a variant of the problem where the task is to find the shortest path from a single source node to all other nodes. We'll go through the algorithm here. However, there are many good online resources including tutorials and YouTube videos describing the algorithm. For our discussion, we'll confine ourselves to the case where both the source and the target nodes are known in advance.