[MUSIC] Formulating an analytics problem. I'm going back to problem formulation, back to where we started in session one, and for good reason. Most of the data that we are going to use here will not be primary in form. It will be digital, it will be secondary. Let's see this in action.
Here's an example. The background: the Earth's population is slated to rise from 7.3 billion people now to 9.7 billion people by 2050. Average incomes will also rise, which basically means that people will want to eat more and eat better. That means more meat, which means more resources needed to produce it. So the question then becomes: how can the world be fed in the future without putting irreparable strain on the Earth's soil and oceans? The FAO, which is the Food and Agriculture Organization, part of the UN, says in its 2009 report that by 2050 agricultural production will have to rise by something like 70% to meet projected demand.
There is no way out of this other than the industrialization of agriculture. Most of the land suitable for farming is already being farmed, which basically means this growth has to come from higher yields. Higher yields have already happened. A lot of agriculture has already undergone yield-enhancing shifts; think about the Green Revolution in the 60s and 70s. To go beyond that, we are now going to leverage technology and analytics. Let's see this.
Consider a farmer's challenges, the sources of risk or uncertainty. Take a minute and type what you think the farmer's challenges are. You're a farmer. What are the sources of risk, of uncertainty, that you face? Just write it out in a minute or two and then we will proceed.
So there are a number of risks, and what I have here is an incomplete list. The biggest one is probably the weather. Then there are the soil's moisture levels, the soil's nutrient content, the competition to crops from weeds. These are all risks and uncertainties.
There is the threat from pests and from diseases, and the cost of taking action should something like this happen. All of them will, in some sense, weigh on the farmer's mind.
So think of the farmer's problem as a matrix with rows and columns. The inputs are the variables; they are the columns. The rows are individual plants. We will see this in action shortly.
Two questions arise at this point. One, how do you cost-effectively populate this matrix? If I have a matrix where each row is a single plant, and each column is something like the weather, the soil's moisture level, the soil's nutrient content, and the number of weeds around the plant, I can actually populate the entire matrix. It's going to be time consuming, so the question is how to do it cost-effectively. And two, once it is populated, how do you analyze the data to optimize yields and maximize profit? You populate the matrix, and data analytics will do the rest.
Populating the matrix: how would you cost-effectively populate it with data on, let's say, inter-plant distances, weed growth, and soil conditions? What I want you to do is take a minute and think about it. What is the easiest, best, fastest, cheapest way to fill this matrix up? Type a one-line answer and then we'll proceed.
Well, there are two ways this can be done. The first is to take the aerial route: drones or planes flying over the fields. And when they do, well, this is what it looks like, in some sense, when a drone or plane flies over. You can photograph the fields in high definition or, even better, you could use something called multispectral analysis. You have cameras that see beyond the visible spectrum, both into the infrared and into the ultraviolet. What do they do? They figure out crop density, they figure out crop health, they figure out soil conditions. I can tell you the moisture content at the surface by using multispectral analysis. Why?
Because wet soil tends to reflect a different wavelength than dry soil. Can I identify weeds? Yes, because their leaves reflect a different wavelength than that of the crop. I can do all of that with one overpass, one fly-by. You can also do accurate contour mapping. I'll maybe not go there, but all of that is possible.
Alternatively, you could take the land route. Basically, you have these machines that do sowing, seeding, harvesting and so on, and they can be GPS enabled. In fact, at John Deere all their machines are GPS enabled. They are basically an information company too. You can control inter-seed distance while sowing. You can vary the water and fertilizer supply based on local soil conditions, based on contours, based on all of that. And you can, in some sense, record harvested quantities. I'll come to that. That is important, because that becomes the Y variable: the yield map for the whole field.
I was talking about machine vision, multispectral images and so on. Yes, advances in machine vision and image analysis have made this possible. Here's the link with digital media: all this data is digital in form. The applications range from geomapping to imaging using satellite data, and there are enormous possibilities there. Once the data sets are ready from image analysis, you can just open the analytics toolbox.
Here is an example of just how good machine vision has gotten of late. That is the error rate, and you can see it coming down, down, down until, in 2015, it crossed a great milestone: it came below the human error rate in image analysis. From here on, it just gets better for the machines.
Consider an example of what the data sets might look like now. I have soil conductivity. I have crop yield. So what might they look like?
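As a rough sketch of the idea, and not anything shown in the lecture itself, here is how two aligned field maps could be flattened into the kind of matrix described above, with one row per grid cell, soil conductivity as an input column, and yield as the Y variable. Every name here (`maps_to_rows`, `fit_line`) is hypothetical, the numbers are synthetic and purely illustrative, and the single-input least-squares fit is just one simple choice of model, assumed for the example.

```python
# Hypothetical 3x4 field maps, one value per grid cell.
# All numbers are synthetic and for illustration only.
soil_conductivity = [
    [10.0, 12.0, 14.0, 16.0],
    [11.0, 13.0, 15.0, 17.0],
    [12.0, 14.0, 16.0, 18.0],
]
crop_yield = [
    [7.1, 7.9, 9.0, 10.1],
    [7.4, 8.6, 9.4, 10.5],
    [8.1, 9.0, 9.9, 11.1],
]


def maps_to_rows(feature_maps, outcome_map):
    """Flatten aligned grids into rows: one row per cell, one column per map."""
    rows = []
    for r, outcome_row in enumerate(outcome_map):
        for c, y in enumerate(outcome_row):
            row = {"row": r, "col": c}
            for name, grid in feature_maps.items():
                row[name] = grid[r][c]
            row["yield"] = y  # the Y variable
            rows.append(row)
    return rows


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single input)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


rows = maps_to_rows({"conductivity": soil_conductivity}, crop_yield)

# Train on every other cell and keep the rest as a hold-out sample.
train, holdout = rows[::2], rows[1::2]

intercept, slope = fit_line(
    [r["conductivity"] for r in train],
    [r["yield"] for r in train],
)

# Mean absolute error of the fitted line on cells it has never seen.
holdout_mae = sum(
    abs(r["yield"] - (intercept + slope * r["conductivity"])) for r in holdout
) / len(holdout)

print(f"{len(rows)} rows, slope={slope:.3f}, hold-out MAE={holdout_mae:.3f}")
```

In a real setting the maps would come out of image analysis or GPS-enabled machinery rather than being typed in, and there would be many more columns than one.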
On the slide it's hard to make out, but you put a machine to work on this and it will be able to give you different matrices, with individual columns populated based on these kinds of pictures: a set of inputs and a set of outcomes. So go back to what we had and how analytics works. I have a set of outcomes. I have a set of inputs. I train my model, I train the machine. We calibrate it, we test it on hold-out samples, and after that, on unknown, fresh samples, we are able to get it to predict. That, in some sense, sums up everything that we've done thus far.
So we've actually come to the end of this introductory course. The terms business analytics and digital media have wide usage. They mean different things to different people. We defined them in a simple way for our own purposes. We started with problem formulation in session one. We then moved on to the toolscape, basically two basic approaches to solving the problems we formulated; that came about in session two. In session three, if you remember, we applied those approaches to customer analytics problems. And finally, in session four, we put everything into a digital, social and non-social context. With that, this course comes to an end. [MUSIC]