This is the fourth of four optional modules in which I'll share a demo edit of a student's work from a previous course. Watching this demo edit will help you synthesize some of the lessons that you've been learning in the first four weeks of this course, and it will also help you prepare for the peer edits that you'll be doing in a few weeks. Alright. So, now that you've read the piece through a few times, you can see that this is a piece about what they're calling big data analytics. And the author is talking about essentially a review paper. So, that's a little bit different than the other papers that I edited for you, because this is not talking solely about original research; it's talking about a review paper. So, it's probably a little bit easier to understand for a general audience, simply because it's a little bit more high level; it's not as technical. So, the author did a nice job of setting this up. Their language is generally pretty clear throughout, and I'm just going to reorganize things a little bit. So, one of the things I'm going to do with this paper is notice the introductory paragraph: it starts right in on the name of the paper and the authors of the paper. I mean, that's fine, but I think we can start with a little bit more of an interesting introduction. So, we get in the introductory paragraph that it's a review paper. They define analytics in this way, and later they give this nice example of how analytics is used. So, I think in the introduction, because you want to draw the reader in, let's just start by illustrating exactly what we mean by big data analytics. And the author has that in their paper. So, we're just going to move a few things around so that we start with an illustration of what big data analytics is, rather than a definition. Illustrations are always more interesting to the reader than just a kind of general definition.
So, we get this nice statement down in the second paragraph, "in our digital lives we generate huge amounts of data," and that's the crux of big data analytics. So, let's just start with that. Let's pull that up and move it right to the beginning. So, "in our digital lives," and we're not going to want to start with that parenthesis there. I don't think we need that, so let's just start with, "in our digital lives, we generate huge amounts of data." And I see there's a colon here: "social relationships, purchasing behavior, watching of videos, etc." I'm going to change that to "video watching," just to be parallel here. And I don't like the "etc."; it's a little bit too informal, so we'll just drop it and say "video watching." I think the reader can infer that there might be other types of data that we're generating, but those are three examples. So, then we get, "big data analytics aims to construct the big picture from the minutiae of our digital lives." Now there's repetition here, "digital lives," "digital life," so that might be telling you that we actually don't need to say this again. And that's a little bit of a general way of saying what big data analytics is, right? It constructs a picture from the minutiae of our digital life. Well, rather than saying something general, let's just say very specifically what it is that the companies are doing with the data. So, I'm going to define big data analytics by just saying what it is that the companies are doing, rather than trying to give some kind of general definition. So what if we just jumped in with, "companies are analyzing these data"? We know this is about analyzing data; we get this down here, "in the world of analyzing data." And then, what are the data being used for? Well, the authors define analytics as a term that refers to any data-driven decision. So the data are being analyzed in order to drive decisions. So let's just say that.
"Companies are analyzing these data and using them to drive decisions." Now we've essentially defined big data analytics. So, now we can put that at the end: "a practice called big data analytics." So in case the reader doesn't know what we mean by big data analytics, we've now defined it. But notice I didn't say, "let me define big data analytics for you." I said what it was, and then I put at the end that that's what the term is called. Because it's always more interesting to say what something is; the term itself is less interesting to people, so let's tack that on at the end. So, "companies are analyzing these data and using them to drive decisions, a practice called big data analytics." So now you know that this paper is going to be about big data analytics, but you have a better sense of what exactly big data analytics is. And then we have this nice example about this online company Zynga. So, let's just move that up to the first paragraph. It's always good to use a really concrete example, as they have here. So let's say, "for example, the online company Zynga studies how its audience plays the game and uses that data effectively to modify the games." So now we get a really good example of exactly how this works. So now we kind of have a big-picture statement of what exactly we're talking about. Everybody's generating data on Facebook, on Amazon, and all our online behavior is generating data for companies. Companies are then using that data to drive decisions. We call that big data analytics, and here's a nice concrete example of a company doing that. And now we can dive into the details of this particular review article. So, "in a recent work on interactions with big data analytics." We probably don't need to say "authors" here; we can just say "Daniel Fisher et al." I might change that to "and colleagues." "Daniel Fisher and colleagues talk about interesting developments in the world of analyzing data."
Well, I think that is a little bit vague, so let's just jump right into what it is they're talking about. We don't even need to give that introductory part. I've also, by the way, already included the material about what the term analytics means, and that data-driven decision idea is already in there. So we can cut that, and rather than say what the paper does twice, we can just jump right into it here: "in a recent work on interactions with big data analytics, Daniel Fisher and colleagues review the state of the field of big data analytics by interviewing 60 pioneering analysts." And then, we've got this repetition, "analysts in the field." They had said "the state of the practice," probably because they didn't want to repeat "in the field." I like "state of the field" better, but let's just get rid of this "in this field" and call them "big data analysts." I assume that they're big data analysts if they're pioneers in this field. So, we're going to jump right into that, and then, what did they do? So, "the paper": I'm going to say "the authors discuss" rather than "the paper discusses"; that's a minor stylistic thing. "The authors discuss," and we don't need the "about" here, "discuss the definition of big data, contemporary ways of analyzing data, challenges peculiar to big data and proposes a five-step workflow." Now, notice that when I read that out loud, you probably heard the non-parallelism in there. So that sentence isn't parallel: we get discuss the definition, discuss challenges, discuss contemporary ways, and then we get this shift where there's a new verb introduced, "and proposes." So, it doesn't quite work; we're going to have to set that "proposes" off as a new sentence. I'm going to set it off with a semicolon to make that parallel. So, just watch out for parallelism, as we've talked about. So, let's make this a list.
"The authors discuss the definition of big data, contemporary ways of analyzing data, and challenges peculiar to big data." Wrap that list up and then start with the next idea about proposing. So, "they also propose a five-step workflow." And then we get "type of an approach to analyzing big data." Well, we don't need any of that, right? So, we just say, "they propose a five-step workflow for analyzing big data." We don't need that "type of approach," right? And I'm actually going to end that paragraph right there. So this gives us an overview of what this review article accomplishes. They discuss things like the definition and the challenges, and they propose a five-step workflow. So then I'm going to set this next idea, this parallel with old age mainframe computing, off as a new paragraph. So, this has a nice idea: "the authors draw a refreshing parallel." That's a nice way of putting it. "To the old-age," that needs a hyphen there, "to the old-age mainframe computing where the work would be submitted to massive systems and the results would be obtained after a period of time." Now, I'm going to point out that these couple of sentences have some passive voice in them, so I'm going to try to put this back into the active voice to make it a little bit more lively. To put this in the active voice we would say, "the authors draw a refreshing parallel to the old-age mainframe computing." And then I'm going to say who submitted the work. So, we'll say, "where analysts." Notice that "would be submitted" is passive voice. "Where analysts submitted the work to massive systems and had to wait for a period of time to obtain results." Now, it would be nice to be specific: what type of period of time are we talking about? For those of us who don't really remember the old-age mainframe computing, was it hours? Was it days? In general, how long?
So I might just say to the author, put something concrete in there if it's possible: "and had to wait for hours, days, weeks to obtain results." So I'm going to just highlight that so that I can ask the author, well, what is it? Can you give a more specific time frame here if it's possible? Then we get, "big data analytics, argue the authors, is very similar." Well, actually you don't need to say that, because we've already said it's a parallel. So, you don't need to repeat the fact that it's similar; that's what you mean by parallel. So we can just jump right into, "with big data analytics, the analyses." Again, I'm going to turn this from passive voice back into the active voice. So, "the analyses require huge computing power." I'm also trying to make this sentence parallel with the last sentence: "require huge computing power, so scientists must submit the results." Say, to a supercomputer? I'm adding this detail, but I would assume that they're submitting them to a supercomputer and that's why they're having to wait, right? "Must submit the results to a supercomputer and again wait for a period of time," so, "and wait for the results." So this sentence is now parallel to the first one. So again, we have to submit things to a supercomputer and we have to wait for the results, sort of like we used to submit things to mainframe computers and wait for the results. And then I would make this last thought just its own sentence. "The end user computers," so we can just start with, "end user computers are only used for viewing the results and not for processing." I'm going to change that just slightly to be a little more direct, so, "end user computers display but do not process the results." And then, I might ask the author to add just a little bit of detail here. So, that's a really nice parallel, but what are the implications of that? Can we learn something from recognizing that it has this parallel with the old-age mainframe computers? So I'm going to ask the author to add something here.
So, what implications does this parallel have for big data analytics? That's kind of a cute parallel, but if it's just cute and it doesn't have any implications, then it's not that important. So there must be some implications of that. What implications does that parallel have, given that the author of this paper has spent a whole paragraph or a few sentences describing it? So what are the implications of that? Add that in there. Then we get to the next paragraph, and the author jumps into this five-step workflow. So, here they say, "a pivotal contribution of the paper is the generalization of how big data analytics can be approached." Well, now, as we already said up here, they propose a five-step workflow. I'm going to put the word "pivotal" back up in the second paragraph to emphasize the significance of the work, so, "they also propose a pivotal five-step workflow for analyzing big data." And then I think we can get rid of all of this. So, we've already alluded to the fact that this is a major contribution of the paper. We put it right up front. We've emphasized that it's significant by using the word "pivotal." So, I'm just going to delete that whole thing. I don't think we need anything out of that. So, we're going to say, how about, "the authors propose a general five-step approach." I'm going to put the idea of "general" in there, that it can be used across many different problems, "for big data analytics." So, the author here originally had something like, "here are the five steps suggested by the authors." But I'm going to flip that around and say the authors propose these five steps, and then list the steps. So, I'm going to have the authors propose them rather than have the steps be suggested by the authors. So, we can get rid of that little bit. And now we have this list. So, "the authors propose a general five-step approach for big data analytics," colon, and here are the steps.
So, "acquiring data, choosing the right architecture for analyzing the acquired data, feeding the data for the chosen architecture, coding and debugging, and fine-tuning." There are the five steps. So, notice how we've just streamlined this paragraph a little bit. "This five-step process repeats itself as many times as necessary until meaningful results are obtained." Now, I'm actually going to put the passive voice here, "this five-step process is repeated." The reason is that I don't think that the process repeats itself. Is it self-sustaining? I'm not sure. It kind of seems like there must be somebody involved in repeating those steps. So, maybe the author should clarify a little here. They could put this in the active voice: "the scientist repeats the five-step process as many times as necessary." I'm going to leave it in the passive voice because I'm not sure who the subject is there. But I think, unless the five-step process is really self-sustaining, we have to assume that a scientist is involved. And then, we get, "the paper cautions the skill gap in bringing the right proportion of scientific flavor in models created by business users." I didn't quite understand that last sentence. So I'm kind of guessing here, but what I'm thinking the author meant to say is, the paper cautions that many business users have some skill gaps? That is, many businesses wouldn't be able to do these five steps; they "currently lack many of the skills to perform this workflow," something like that. So there's a skill gap here. I'm getting that idea, so I think that the paper is cautioning that many business users may not have all the skills that are needed. And I'm wondering if the author here can add something: they propose, you know, what, to address this skill gap? So this is a review paper. I'm assuming that the authors of this review paper maybe said something about how we might go about addressing that skill gap.
So, I might suggest that the author add something in there: "they propose XX to address this skill gap." What did they propose to address the skill gap? Now, I want to point one thing out here. We get the very introductory paragraph, and then this second paragraph here, which introduces the review paper, where we are told that the authors discuss the definition of big data, contemporary ways to analyze data, and these challenges, and that they also propose this pivotal five-step workflow. So in the body of this paper we get the five-step workflow, but we don't get any of these other things: the definition of big data, the challenges, and so on. So I'm going to actually ask the author to add a paragraph here that addresses the things that we've been told are in this review paper. So, maybe you don't need to define big data, but tell me something about the challenges and the contemporary ways of analyzing data. Have a paragraph that addresses that, because we've been set up to expect that the review paper contains this, so give me something about that and then give me the five-step approach. So we need another paragraph here that adds those details. And then the final paragraph is reading really nicely. I'm going to just tweak a few things here. So we get this idea of the significance of big data analytics. So again the author returns to the practical applications. So, this is really nice. I'm just going to change a few words here so we can just jump right in: "the potential." I think "the potential of big data analytics" is sufficient here. I'm going to say "is vast," and then a semicolon, "for example, companies can," since this is companies mostly.
"Companies can design more user-friendly interfaces," you need a hyphen there, "enrich customer experience by analyzing the ways customers use the product, and understand health care spending." And then again, these are all examples, so we don't really need the "etc." there; that's a little bit too informal. So, "the potential of big data analytics is vast; for example," here are all the things companies can do with these data. And then we get this nice thought at the end: "the limitation is only our human ability to think creatively and harness the exploding world of data." That's really nice language. The only little thing I'm going to tweak here is that there's something slightly funny about this, because it says the limitation is our ability to think creatively. So, it's not our ability to think creatively that's the limitation here; it's sort of the bounds, the limits, of our ability to think creatively. So, this was a little tricky to set up. I rearranged the sentence just slightly, and probably the author in the revision can come up with an even better way of putting this. But you have to change this just slightly so it doesn't sound like human creativity is the limit. So, I changed it; I moved things around. I took this "harness the exploding world of data" and moved it to the beginning of the sentence to set this up: "in harnessing the exploding world of data." Again, that's a really nice, active, descriptive clause there. Then something like "we are constrained," because you don't want to use the word "limit" twice, so, "we are constrained only by the limits of our human ability to think creatively," something like that. And I think again you can probably do a little better in improving that last sentence, but it had a really nice idea and it just needs to be brought out a little bit more. So we get our final version here. I'll just move that down a little bit.
Now we've got this kind of nice opening that says exactly how data analytics is being used and gives some nice examples. I would ask the author to fill in a few things: to say exactly how long here, to be as specific as possible; to tell me what the implications of this parallel are; to add a paragraph that gets into this list of things that they said the authors tackled in the review article; and maybe to add a little bit about whether the authors proposed a way to address this skill gap. And other than that, I think it's reading really, really well now. It started out really well and I've just moved a few things around to bring out a few things. There are a few more details we need, but I think the author did a great job on this one and it's reading really well now.