Okay, we're back, and we're here talking about supervised text classification. But I've just lectured for an hour and a half, so I'm going to have a little espresso from my Nespresso machine. This series of lecture notebooks was sponsored by Nespresso. I'm kidding, it wasn't, but if it were, we wouldn't know, would we? No, it's not. But I just got this really cute espresso mug from my college. By the way, guys, I'm an advertising professor, which means that I teach in the communication department, because our advertising department is in communication. But I also teach business analytics in the MSBA program, the Master of Science in Business Analytics, in the marketing specialization, so I get to be in the business school and the advertising school. Love them both. Salute, that's good stuff. Are you ready to talk about Python? I'm ready to show it to you. Here's what we're going to talk about today. This is Colab, and you can see here that we're just in a web browser, and I've got a full screen here for you. This is Google Colab, and the cool part about Google Colab is that it is a web-based piece of software, and we get to use it in any browser that you like. I prefer Chrome, because Google to Google always makes sense. It's robust. I mean, there's so much you can do with this environment, and we're going to barely scratch the surface, but I do want you to get familiar with some of the basics of it. First, the most important thing to keep in mind with Google Colab is that, like any Google product, it is associated with a Google account. So if you have a personal Gmail account and you want to use that, because that's where you want all your files and everything to be stored, do that. It doesn't matter if you use the CU account or the Gmail account or whatever; it's all going to work the same for you here in this class.
I'm using my Colorado account because it is really nice to have unlimited storage when you are doing these things, and the University of Colorado gives storage to me at least, and I think also to students. What a great tool to be able to store all of your data and everything; that's stored in Google Drive. Let's take a look at Google Drive and see what it is. If you've never used Google Drive before, you know, it has its pros and its cons, I'm not going to lie, but I really like it overall. This is my personal Google Drive, you can see my John Mayer video loaded up there, all good. Let's look at the academic version. This is my Google Drive, and you can see I have different folders here for different things that I'm doing. If you want your paths to behave just like mine in this notebook, meaning you want to change nothing and have the code run just as it is, you're going to need to make a folder in your Google Drive named MSDS_marketing_text_analytics, just like this. If you make a folder like this, and then inside of that you make a sub-folder called master files, and then inside of that you make a folder for each one of our classes and name it just like this, then you will be able to use the code notebooks for yourself just as I do in class. You may decide that you want to store your Google notebooks somewhere else, in another folder in your Google Drive. Then you'll have to change the path in your Colab notebook to reflect that, and I encourage you to do that, it's a good practice. But if you just want things to work simply, this is the path sequence that you must create in your Google Drive to get the code to work natively without much manipulation. It's a really nice tool here, Google Drive. It's showing us all of these files, and of course we can upload files to Google Drive by just right-clicking and clicking "Upload".
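If it helps to see that folder sequence as an actual path, here is a minimal Python sketch of how it might look once Drive is mounted in Colab. The `/content/drive/MyDrive` root is where Colab usually exposes My Drive. The underscore in `master_files` and the class folder name `text_classification` are assumptions for illustration, so match them to whatever you actually named your folders.

```python
from pathlib import PurePosixPath

# Where Colab usually exposes "My Drive" after mounting at /content/drive.
DRIVE_ROOT = PurePosixPath("/content/drive/MyDrive")

COURSE_FOLDER = "MSDS_marketing_text_analytics"  # folder name from the lecture
MASTER_FOLDER = "master_files"          # assumed spelling of "master files"
CLASS_FOLDER = "text_classification"    # hypothetical per-class folder name


def class_path(class_folder=CLASS_FOLDER):
    """Build the path to one class folder inside the course folder."""
    return DRIVE_ROOT / COURSE_FOLDER / MASTER_FOLDER / class_folder


print(class_path())
# /content/drive/MyDrive/MSDS_marketing_text_analytics/master_files/text_classification
```

If you keep your notebooks somewhere else, changing the three folder constants in one place is usually easier than editing paths scattered through a notebook.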
The number 1 way that I like to start Google Colab is to be intentional. I go into Google Drive, find the folder that I want my new Google Colab Python notebook to be put in, and then I right-click and click "New Google Colab Notebook". That's going to create a new notebook in that folder. I will say, if there is one common complaint that I hear about Google Drive, it's that it's hard to find your files. So if you know that you're starting by making your notebook in that folder, it's going to be in there and you're not going to lose it, and I encourage you to start by making your notebook this way. Now, remember, we've shared our Colab notebooks with you for this set of lectures and for all of the lectures in the sequence. You can simply copy that notebook into this folder if you'd like to get started there. Or you can code along from scratch and work that way. It's really up to you, totally your decision as to what you want to do. I also have on my computer a little application called Google Drive sync. It is a wonderful tool that allows you to sync the files that are on Google Drive to your local computer. You can see that I have the same folder here on my Mac, with all of the same files that are on my Google Drive. This is not a perfect system. It takes a while for Google Drive sync to reflect here the changes that you make in Google Drive on your computer, and vice versa: changes that you make to your Google Drive on the Cloud take a while to propagate back down to your computer. Sometimes that lag can create issues, and it's not a perfect system, but I love being able to browse all of my Google Drive files on my local computer. If you have a big computer with a lot of memory and a lot of storage, then you can do this. I think that it's a good option. But you don't need Google Drive sync. You can work straight on Google Drive on the Cloud; you don't need the OS version installed.
By the way, this does work on Mac and PC, so it's available on both for you. This is it. These are the files that we're going to be working on. You don't need to copy these yet. I'm going to give you URLs to download the actual data that we're going to be working on, and then we can actually jump into the project. But first, let's get started with Google Colab and just talk a little bit more about it. Again, if I want to open up a Google notebook, or an ipynb, which stands for interactive Python notebook, in Colab, it's really easy to do. I just right-click on one, click "Preview", and then click "Open with Colab". I've already done that and I've already got mine opened up here. Google Colab is pretty intuitive. Some important tabs: if you want to save files, rename files, or move a notebook to a specific folder, this is all done in the File tab here, or you can do those things in Google Drive natively as well. You have your common undo buttons, and all this works like regular IPython notebooks do. One of the things that you really need to pay attention to here is the concept of runtime. Unlike other Jupyter notebooks, these notebooks are being run on the Cloud by Google. That means Google is giving you the computer resources. They have a little computer in the Cloud that's configured for you once you turn on Google Colab. It's good and bad. The good thing about this is you don't have to worry about local resources. Your RAM here in Google Colab is always going to be the same. Your space is always going to be dictated by how much free space you have in Google Drive. That's it. You don't have to worry about connecting to your local computer to get the runtime resources that your local computer has. That's an advantage. The downside is that a runtime goes away. Google Colab notebooks and their runtimes expire.
After you haven't run any new code for a while, your computer will just turn off in the Cloud. Once it turns off, your notebook is saved, but all of the things that you've done to the Python environment, such as installing new packages or configuring TensorFlow to work perfectly, all of that stuff goes away. Every time you run a notebook in Google Colab, it's like starting from scratch with their basic runtime environment. It has a lot of packages installed, so we don't have to install every package every time we use it. But there are some packages that we're going to have to re-install every time we restart a notebook, because the runtime essentially just shuts off after a while. If you ever want to see what runtimes you have available to you, you simply click "Manage sessions". You can see here that I have one runtime open now, and that's what I expect, and that's good. I can terminate it here if I want. It shows you how much RAM I'm using, and it tells you the last time I ran some code in this notebook. You can always restart your runtime, and that's for when you're just spinning or when this doesn't seem to be connected. You can tell that your runtime is connected by this little check mark. It means you've got resources, your computer is on, it's up, it's active. If this is not connected, if you see three dots here, that means that you've got to click it to turn the runtime back on. That often means that the runtime crashed or shut down, and so you're going to have to start from scratch. It can be frustrating, and you invariably will have to deal with that. The easiest way to get going again if your runtime does crash or something happens is to just click "Restart and run all". That's going to restart your runtime, reconnect to Google Colab, and run your code. That's pretty much it. Another thing that you really want to be aware of is changing your runtime type.
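On that point about packages disappearing when the runtime resets, here is a small sketch of one way a first notebook cell could re-install only what is missing. The helper names `is_installed` and `ensure_installed` are made up for this example and are not part of Colab; the sketch assumes `pip` is available in the runtime.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if a top-level module can be imported in this runtime."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(module_name, pip_name=None):
    """Install with pip only when the module is missing.

    Returns True if an install was run, False if nothing was needed.
    pip_name covers packages whose install name differs from the import name.
    """
    if is_installed(module_name):
        return False
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
    return True


# json ships with Python, so nothing gets installed here.
print(ensure_installed("json"))  # prints False
```

Running a cell like this at the top of a notebook means "Restart and run all" brings the environment back without you having to remember which installs were done by hand.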
We will be using deep learning, not in this notebook but in the next lecture, so you want to make sure that you are being deliberate about what runtime you're using. The only hardware acceleration that you're going to use in this class is GPU. Remember that I said that deep learning requires GPU. When we do deep learning, you'll want to make sure that your runtime is configured to use a GPU. It'll restart your environment and you'll have that. That's pretty much it for the buttons that are up here. One of the biggest things that you'll use in this class is the actual Google Drive visualization here. Every notebook that you download or create needs to be connected to your Google Drive before it can access files inside of your Google Drive. I can do that by clicking this little button that says "Mount Drive". Every notebook is different. If you make a notebook yourself and you click this "Mount" button, it's just going to give you a little prompt that says, are you sure? If you say yes, then your Google Drive will show up here. If you use a notebook that someone else authored, you're going to need to go through this mount process. It runs a piece of code, you're going to get a connect prompt, then you're going to get a little pop-up here, and you're going to say, yeah. When you hit "Allow", you'll see your Drive become mounted here on the left. That's something that will look a little different depending on whether you authored the notebook or not. We'll click the little "Refresh" button here, and now I can see that I have a Drive mounted. So I click "Drive", I click "MyDrive", and I've got all of my files from all of my work here at the University of Colorado. Again, I'm working inside of the MSDS program, I'm looking inside of master files, and I'm looking for text classification. I have saved all of my notebooks right inside of that folder.
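For reference, the mount step that the button runs can also be written as code. The sketch below uses Colab's `google.colab.drive.mount` call and then lists the notebooks in a folder. The `list_notebooks` helper and the folder path are illustrative assumptions, and the import guard is only there so the cell does not fail outside Colab.

```python
import os
from pathlib import Path

MOUNT_POINT = "/content/drive"


def mount_drive(mount_point=MOUNT_POINT):
    """Mount Google Drive when running in Colab; return True if it worked."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        print("Not running in Colab, skipping the Drive mount.")
        return False
    drive.mount(mount_point)  # opens the permission prompt described above
    return os.path.isdir(os.path.join(mount_point, "MyDrive"))


def list_notebooks(folder):
    """Return the sorted names of the .ipynb files directly inside a folder."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.ipynb"))


mounted = mount_drive()

# Assumed folder; replace it with the path you get from "Copy path".
course_folder = "/content/drive/MyDrive/MSDS_marketing_text_analytics"
print(list_notebooks(course_folder))
```

An empty list usually means the path is wrong or the Drive is not mounted yet, which is a quick way to check before trying to open any files.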
That's all you really need to do. If you want to copy the path to this folder so that you can reference it to open a file or to search for files, all you have to do is click "Copy Path". You click on the folder that you'd like and then you click on "Copy Path". If I just make a new code block and paste it, you'll see that I have the full Google path as Google Drive has it stored. That's really the easiest way to figure out, what path am I on and how do I open up that path? It's really lovely. We're going to be using Google Drive for all of our files. Everything's going to be synced there. We're not going to be referencing anything locally. That's really hard to do in Google Colab; it's possible, but it's difficult. Pretty much everything has to live and breathe in Google Drive for this thing to work. If you're not sure where your files are, you can always open up this panel on the side to see what's going on, and when you're not using it, click it again to minimize it and get a lot of your screen back. We won't use this specifically, but it's really cool: in the bottom left, we actually have access to a terminal. If we want to type in a terminal command to see what's going on, we can do that. It is an active terminal that is basically a Unix operating system here on the back end. It's really nice if you want to run a piece of code or install some package and you don't want it to be inside of your notebook for whatever reason; you can use the terminal in that way as well. Although we will not need that for this class, it's there for you. It's like a little computer in the Cloud. Really, Google has done such a great job of providing such a cool resource. Some other practical things: if you want to download this, you can download it in multiple formats. We've got the ability to download it as a Python notebook or a Python file.
If we want to run it locally on our computer, just in the command line or in a terminal, we can download the .py. If we want to download the notebook so that we can use it somewhere else, like in Jupyter Notebooks, we can download the IPython notebook. It's a really nice tool. Finally, we can share files with people. I can see here that I've already shared this with my co-author of the course, Scott Bradley at Northwestern. I haven't really introduced Scott. Scott is a researcher in the Journalism and Communication area of the Medill School at Northwestern. We're really lucky to have him as the computational backbone behind a lot of the auto-grading that you see in the course. Scott's got access to this notebook. We've got another CU person. We can add people, and of course we can actually just give people a link. If we wanted to give a link that anyone could open, we could change this to "Anyone", and then if I copy this link to the clipboard, anyone can open it up and see our code. If they have a Google account, they can actually go in and run the code, make changes, and share it with us. That's what makes Google Colab so special, and I think that's where it gets its name. It's a tool that's used to collaborate with multiple people, and it really does a fairly good job of doing that. That's Google Colab in a nutshell. If you haven't used this already, I think you guys are going to love it. If you have used it, this is going to be a no-brainer, but let's just dive into some of the basics of Google Colab. Remember the first two things that you want to do when you're in Colab: you want to mount your drive, because you're going to want access to your files, and then you want to check your runtime to make sure it's appropriate. For today we're not doing any actual deep learning, so we're not going to need a GPU, and we can see that our standard runtime, no accelerator, standard RAM, is good to go.
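If you would rather check the runtime from code than from the menu, here is one rough sketch. It asks the `nvidia-smi` command-line tool to list GPUs and treats any failure as "no GPU". This is an assumption-laden shortcut, not the official Colab way of checking, and `gpu_available` is a name made up for this example.

```python
import subprocess


def gpu_available():
    """Best-effort check: True if nvidia-smi can list at least one GPU."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L lists the GPUs the driver can see
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The tool is missing or not responding, so assume a CPU-only runtime.
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if gpu_available():
    print("GPU runtime detected.")
else:
    print("No GPU detected; that is fine for today's notebook.")
```

For the deep learning lecture, a check like this near the top of the notebook can catch a forgotten runtime change before a long training cell starts.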
You can see here that we're connected, because we have the little "Play" button here. You can see that Python code really runs as simply as it would in any other notebook. We won't go into this in detail, but we can use LaTeX. We can move cells up and down, so if we wanted to move this cell up, we just select it and click up, and if we wanted to move it back down, we can move it back down. We can delete cells by clicking the little trash can. We can make comments on cells if we wanted to say to people, "Hey, Scott. Come in here and fix this thing, it's broken for me." We can copy the contents of cells. One of the cool things about Colab is that if you select a cell or a series of cells, you can then go to "Edit", "Copy cell". Once you copy that, you can open up a new Google Colab notebook and paste it, and it's going to paste in perfectly. That's nice. You can see here that all of our regular Python packages are going to work. We can browse our system files just like we could here on the left with our file folder. We can install new packages by using the exclamation point. If we wanted to install Facebook Prophet, which is a really cool tool for time series analysis, we can do that in just one line of code. That's something to keep in mind. You can see we actually get autosuggestions, so I'm going to import NumPy here. I'm going to say np., and you can see all of the packages and modules are natively loaded here, so we can see what we can actually call inside of the packages that we've installed. If I want random, I can do that, and that really is all there is to it. It shows you a nice auto-completion function as well. We've got different types, and you can explore what types are, so you can ask, "What is the type of 1?" This is stuff that you should know by now, but Google Colab is really good at all of these basic Python things.
We can concatenate strings, and we can do every single thing that we could in Python. That's really what Google Colab is, and there's really not too much else to it. I'm sure it'll give you headaches at points. People often complain about paths and not getting the right paths, so remember, if you can't find where your files are, use this paths tab. It's going to be really helpful for you.
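As a quick refresher on those basics, a cell like the following sketch behaves the same in Colab as in any other Python environment. The variable names are just for illustration.

```python
# Asking Python for the type of a value.
print(type(1))      # <class 'int'>
print(type(1.0))    # <class 'float'>
print(type("1"))    # <class 'str'>

# Concatenating strings with +.
greeting = "Google" + " " + "Colab"
print(greeting)     # Google Colab

# A number has to be converted with str() before it can be concatenated.
label = "Lecture " + str(1)
print(label)        # Lecture 1
```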