Hello I'm Professor Brian Bushee, welcome back. In this video, we're going to let the data speak, literally let the data speak. We're gonna throw out any notion of trying to model discretionary earnings or unusual trends and ratios and instead just look at the distribution of the numbers. There's something called Benford's Law, which reports how frequently certain numbers appear in naturally generated data sets. But that means that in artificially generated data sets, for example, financial statements that have been manipulated through earnings management, the numbers won't appear in the same frequency as Benford's Law. And so we can look for deviations from Benford's Law to find financial statements that look fishy. It's pretty off-the-wall stuff but it's gaining prominence as a fraud detection tool, so let's take a look at how it works. Okay, for our last model we're gonna throw away all notions of trying to model any discretionary earnings or fraud prediction ratios. And instead we're just gonna let the data speak, to tell us whether it's true or whether it's been manipulated. And we're gonna do this with something called Benford's Law. So, back in 1881, Simon Newcomb determined that the probability that a number has a first digit, d, is given by this equation here. Now, apparently he discovered this because back then you had to use books of logarithms to do complex calculations. And he noticed that the pages in the ones were much more worn than the pages in the nines. So, he got the idea that ones must appear more often than nines as leading digits, and what he found is that they seem to follow this distribution. In 1938, Frank Benford found a large number of naturally occurring data sets follow this pattern. So, you find this in the surface area of rivers, molecular weights, death rates, street addresses and the numbers contained in an issue of Reader's Digest. 
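For reference, the equation on the slide is commonly written as P(d) = log10(1 + 1/d), where d is a leading digit from 1 through 9. Here is a minimal Python sketch of it. The function name benford_expected is just an illustrative choice, not something from the lecture.

```python
import math

def benford_expected(d):
    """Expected share of numbers whose first digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

# Print the expected share for each possible leading digit.
for d in range(1, 10):
    print(d, f"{benford_expected(d):.1%}")
```

The nine shares add up to 1, with 1 coming out at roughly 30% and 9 at roughly 5%.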
So, as so often happens, the person who discovers something doesn't get the credit; the person who popularizes it does. So this became known as Benford's Law instead of Newcomb's Law. And since then it's been used to find irregularities in published scientific studies, fraudulent election data in Iran, suspicious macroeconomic data from Greece, and tax return misreporting. So in each of these cases the distribution of the numbers in what you're looking at doesn't meet this equation first discovered by Simon Newcomb and popularized by Frank Benford. And we're gonna use this distribution to look for irregularities in financial statements. >> This is, perhaps, the silliest thing I have ever heard in my life. And I am no spring chicken! >> What could possibly be the explanation for such an odd law of nature? >> So, Benford's Law is what's called a phenomenological law, which means that it describes a phenomenon that occurs in nature, but we have no idea why it occurs. In fact, I found a quote where a leading scientist had said it continues to defy attempts at an easy derivation. So, if the scientists don't know a good explanation, then I certainly can't come up with one. But it seems to work in a lot of unconstrained, naturally occurring data sets. And just because we don't understand why it happens doesn't mean that we can't use it to our benefit. So, let's go on. Here's what the expected distribution of leading digits is under Benford's Law. So you'd expect to see a 1 as the first digit about 30% of the time. Whereas a 9 would be a first digit only about 5% of the time. Notice that 0 doesn't appear here, so if you're working with a decimal number like 0.05, we would consider 5 to be the leading digit. Let's look at an example to show how this can detect fraud. Here's the distribution of leading digits from 215 months of returns for the Fairfield Sentry Fund, which was a fund that invested only with Bernie Madoff.
And if you haven't heard of Bernie Madoff, maybe you should Google him. It's sort of an interesting read. But he ran this thing called a Ponzi scheme, which was a fraudulent investment fund where he basically would pay off investors from new money that he would attract. So, as long as he could attract new money, he could pay back investors. But if the new money ever stopped, then the Ponzi scheme would collapse. Anyway, if you look at the distribution of the first digit of the returns, you find that there were way too many 1's in the first digits, not enough 9's. And so the more you see a discrepancy from Benford's Law, the more likely it is you're looking at numbers that have been made up, as opposed to numbers that represent real returns earned by a real fund. >> Doesn't this just indicate that Bernie Madoff was not that good at committing fraud? I mean, wouldn't an expert Ponzi schemer remember to make up returns that conformed to the Benford distribution? Duh! >> That raises an excellent point, that as Benford's Law becomes a more popular fraud detection technique, eventually the fraudsters will catch on. And they will make sure to commit their frauds in a way where the financial statement numbers conform to the Benford distribution. This happens with any kind of fraud detection tool where the more that the fraudsters know about it, the more they can try to make their numbers meet the expected level of the model. But nobody's found a way to manipulate all of the tools that we've looked at in this course, which means that even though Benford's Law may work in a lot of cases, you don't wanna rely on it exclusively. Because managers can't manipulate their numbers to make everything look good, they're gonna trip up somewhere, and that's why you gotta look at multiple tools. The approach that we're gonna use to apply Benford's Law to financial statements was laid out in a paper by Dan Amiram, Zahn Bozanic, and Ethan Rouen in 2014.
What they did was aggregate all the financial statement data they could get by industry and by year, looking from 2000 to 2011. And they found that the leading digits in financial statements tend to follow Benford's distribution. In fact, 83% of firms' financial statements conform with this distribution. So when it doesn't conform, when there are larger discrepancies between the distribution of numbers in the financial statement and the Benford distribution, they also found larger Modified Jones Model discretionary accruals and higher Beneish M-Scores. So the discrepancies did seem to line up with other tools that we've looked at to detect earnings management. So here's how we're gonna calculate the discrepancy from Benford's Law. First, we need a large number of financial statement items. There's no theory on how many are needed but more is better. So one good way to do this is to basically take all the numbers in the balance sheet, income statement, cash flow statement and run this test on them. It doesn't matter what the numbers are, it just matters that we get these numbers. Then we need to count the number of each leading digit. So, we can use an Excel formula where we take the leftmost digit. Now, you notice I've got an absolute value here. That's because with negative numbers it would grab the minus sign, so we need to make it positive. You also need to multiply it by 1,000 or some large power of 10 before taking the digits, so that will help you with the decimals. Cuz we don't want zero as the leading digit, we want the first non-zero digit. Once we've got all the leading digits from all the numbers, we wanna count them up with an Excel formula to get the number of times each digit, one through nine, appears as a leading digit. Then we can compare the actual frequency of these leading digits to the expected distribution under Benford's Law.
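The spreadsheet steps above can also be sketched in Python. This is not the Excel formula from the lecture: as an assumption, it swaps in Python's decimal module to find the first non-zero digit, so there is no need to multiply by a power of 10. The names leading_digit and digit_counts are hypothetical, and the value -42.0 is a made-up number added only to show the handling of negatives.

```python
from collections import Counter
from decimal import Decimal

def leading_digit(value):
    """Return the first non-zero digit of a number, or None if it is zero."""
    # Decimal stores the sign separately and drops leading zeros,
    # so -0.51 and 0.51 both give the digit tuple (5, 1).
    digits = Decimal(str(value)).as_tuple().digits
    for d in digits:
        if d != 0:
            return d
    return None

def digit_counts(values):
    """Count how often each digit 1 through 9 appears as a leading digit."""
    tally = Counter(leading_digit(v) for v in values)
    return [tally[d] for d in range(1, 10)]

# 129.30, 315.46 and 0.51 are figures quoted in the lecture; -42.0 is illustrative.
sample = [129.30, 315.46, 0.51, -42.0]
print([leading_digit(v) for v in sample])
print(digit_counts(sample))
```

Zero values are skipped in the counts because they have no leading digit.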
And we'll test whether it seems to be different, using something called the Kolmogorov-Smirnov statistic, not sure how to pronounce that, which is this sort of ugly formula here. In this formula, AD is the actual frequency of the leading digit, so AD1 is the actual frequency where 1 is the first digit, AD2 is where 2 is the first digit. ED is the theoretical frequency, or what's expected under the Benford distribution. And basically, what this statistic is doing is looking for the point with the biggest cumulative difference between the two distributions. Then to see if that's statistically significant, if it's consistent with earnings management, you wanna see if this statistic is greater than 1.36 divided by the square root of P, where P is the total number of leading digits used. So, this is something where the cutoff we're gonna use is a function of the number of leading digits. >> Sweetie, y'all can't just give us the Kolmogorov-Smirnov statistic and expect us to know what the lamb y'all are talking about. >> I completely agree, let's take a look at an example of how this works. So, to illustrate how to calculate this, I've put together a spreadsheet with two companies, Dog Donut and Beagle Bagel, two leading snack food, breakfast food companies for dogs. In the first tab for Dog Donut I've got three years' worth of their financial statements: balance sheet, income statement, and cash flow statement. Then what I need to do is use the formula to find the leading digit for each number. So, for 129.30, the leading digit's 1, for 315.46 the leading digit is 3. One that I want to show you is down here. There's a number that's 0.51, we can't have 0 as our leading digit, so we wanna pull the 5 as the leading digit. So we got all the leading digits from all these numbers and then we want to count everything up. So, I've got a formula where I count up the number of times 1 is the leading digit, the number of times 2 is the leading digit, then we add that up.
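The statistic and cutoff described above can be sketched in Python as follows. This is a rough sketch, assuming the statistic is the largest absolute gap between the cumulative actual frequencies (AD) and the cumulative expected frequencies (ED), with the 1.36 divided by square root of P cutoff from the lecture. The function names are hypothetical, and the expected frequencies use the standard form of Benford's Law, log10(1 + 1/d).

```python
import math

def benford_expected(d):
    """Expected share of numbers whose first digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

def ks_statistic(counts):
    """Largest cumulative gap between actual and expected frequencies.

    counts is a list of nine counts, for leading digits 1 through 9.
    """
    total = sum(counts)
    cum_actual = 0.0
    cum_expected = 0.0
    largest = 0.0
    for d, count in enumerate(counts, start=1):
        cum_actual += count / total          # AD1 + ... + ADd
        cum_expected += benford_expected(d)  # ED1 + ... + EDd
        largest = max(largest, abs(cum_actual - cum_expected))
    return largest

def ks_cutoff(total):
    """Cutoff from the lecture: 1.36 divided by the square root of P."""
    return 1.36 / math.sqrt(total)
```

Under these assumptions, a set of statements would look suspicious when ks_statistic(counts) is greater than ks_cutoff(sum(counts)).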
So, there were 122 leading digits. We can divide each count by the total to see that in this example, 1's were the leading digit 31% of the time, 2's about 14% of the time and so on. Then I've got the expected distribution based on Benford's Law. And what I calculate is what's called the cumulative difference. So, for number one it's simple, it's just the difference between the actual and expected distribution for one. For two, the cumulative difference means I have to add up the actual distribution for one and two, and compare that to the expected distribution for one and two, and I find the difference is 2.6%. For three, I add up 31%, 13.9%, 20.5%, and compare that to 30.1, 17.6, 12.5, and they're off by about 5%, and so forth and so on. So, the KS statistic is the maximum difference, which would be 5.4%. The cutoff is this formula of 1.36 divided by the square root of 122, which is the number of items. We can see that the KS statistic is way below the cutoff. So, we have no concerns in this case that there's manipulation based on a deviation from Benford's Law. It conforms to Benford's Law pretty closely. >> Why do you combine all three years into one statistic? Can't you do this for each individual year? >> Excellent point. This is definitely a technique that works much better the more numbers that you have. It doesn't work as well at detecting fraud if you only have a very small number of numbers to take a look at. So, that's why using three years of numbers, which is sort of the maximum amount of numbers you can get in a financial statement, is gonna be more powerful. Also, remember we've seen examples where if you manipulated one year, it also affects the other year? So, it's easier to pick up frauds if you find these deviations across multiple years as opposed to one unusual year. And definitely if you dig through this example, it's a case where you wouldn't clearly find fraud looking at individual years.
But looking at all three years together, it does seem like these were artificially generated numbers consistent with earnings management. Now, we can do the same thing for Beagle Bagel. So, we put in all the financial statement numbers we can find for the last three years, calculate all the leading digits, then count up the number of each of the leading digits to get the actual distribution compared to the expected distribution. And we end up with a KS statistic of 16.2%, that's based on the cumulative difference between the number of ones, twos, and threes actually observed versus what Benford's Law would say. That 16.2% is greater than our cutoff of 12.3%. So, in this case we would be suspicious that there was manipulation going on with Beagle Bagel because their financial statements don't meet this Benford distribution. And we looked at one of their close competitors where their financial statements do meet it. So we can't attribute this to some kind of industry effect; instead it looks like they may have done some things to manipulate their financial statements. >> Professor, I want to enthusiastically and sincerely thank you for giving us 359 tools for analyzing financial statements. Now, I have just one more small favor to ask. Which doggone tool should I use? >> You're welcome, I've been happy to do this. And I will ignore the sarcasm in saying 359 tools. So which ones should you use? Well, you should use all of them. [LAUGH] So one thing that I hope has definitely come across in these videos is that there is no one tool that will pick up all forms of earnings management. If there were one tool that always worked, I certainly wouldn't tell you about it. I would keep it to myself and get rich so I wouldn't have to sit here making these videos. Also, as we talked about earlier, if there were one tool that always seemed to work, the fraudsters would find out about it and do their manipulation in a way that the tool wouldn't pick it up.
So that's why you need to look at a big range of tools to find these kinds of problems. The big data approaches of this week are good starting points. So if you've got a large number of companies to look at, you run the Benford's test or the fraud prediction test, figure out which five or six look suspicious, and then dive in deeper to look at the other tools to see if there's a problem. Because one thing I'm pretty sure about is that the more tools that suggest there's manipulation, the more likely it is that you've found a company that's manipulated their earnings. That wraps our look at Benford's Law and it also wraps our week on big data approaches to try to detect earnings management or fraud. I hope some of these tools come in handy for you in the future and help you identify some firms that may have fishy financial statements, so that you can stay as far away from them as possible. And I really hope that you enjoyed all this material and I want to thank you for watching. >> See you next video.