(Music) In this video, we’ll discuss the steps you’ll take in the lab within this module. Since you’ll be taking these steps during the lab itself, you can simply focus on watching this video now, rather than trying to recreate the steps as you watch. The goal of Lab 2 is to get you more familiar with Watson Discovery collections, as well as to create a collection for Coursera’s course content. Let’s start with the first goal. From our IBM Cloud dashboard, we’ll want to click on Services to show the list of services we have already instantiated. Find your Discovery service in the list, and click on its name. If you don’t have a Discovery instance in your account, go back to Lab 1 in the first module of this course and follow the instructions to create one. Next, we’ll need to launch the interface for the Discovery instance by clicking on the Launch Watson Discovery button. The interface will show us a default News collection that was automatically created for us, as well as two buttons to create collections from other sources or from data we have on our local drive. We can start by exploring the News collection. We’ll click on the Watson Discovery News tile to open the collection. This collection is automatically added and managed for us. It includes millions of recent news articles from a multitude of publications in a variety of languages. For example, my collection shows that 300,000 new articles per day are added in English alone. These articles are not just raw data. Watson analyzes them and automatically adds useful metadata to classify the information within so that we can query the huge data set. In fact, to get started right away, we are given a few pre-built queries on the right side of the screen. Let’s see what they look like by clicking Run under the first query about AI company acquisitions. On the left side, you’ll notice that this query was defined through the Discovery Query Language.
Other options include natural language and a handy visual mode. enriched_text.concepts.text:"artificial intelligence" is the Discovery Query Language way of saying that we want all the documents that Watson has identified as pertaining to the concept of artificial intelligence. However, we don’t want all the possible results. We want to filter the query down to company acquisitions. So you’ll notice a filter applied to the documents in the bottom left of the screen. What the Discovery Query Language used here is asking is to narrow all the articles about AI we already selected down to those that include the action of acquiring and, in particular, an entity of type Company. Essentially, we are filtering for AI articles that discuss acquisitions of companies. In the results on the right, you’ll notice that the first 10 out of 104 relevant matching documents are shown, and for each article, we get various bits of useful information, including the title, URL, text, the concepts and keywords Watson used to classify the article, and whether the sentiment in the article was positive, negative, or neutral. Think for a moment about how much power we are getting for free. We get a huge collection of news, classified by Watson, and the ability to intelligently query it. If you are a journalist, this would be an invaluable tool. Okay, but what if you are not a journalist? The good news is you get to harness the same power when it comes to your own data. In this course, we’ll develop a chatbot that can assist Coursera students with their questions. I can tell you from experience that many will enquire about course recommendations. Hardcoding each topic in the chatbot is not feasible since Coursera has a huge variety of topics. So we’ll instead rely on Discovery for such queries. But first, we’ll need to create a collection that contains course data. Let’s see how to accomplish this.
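Before moving on to the course collection, here is a minimal Python sketch of how the News query and filter described above could be assembled as text. It only builds strings and makes no call to Discovery. The concept field comes from the video; the two filter field paths under enriched_text.semantic_roles are assumptions based on the narration (an acquiring action and an entity of type Company), and the helper names are hypothetical. In the Discovery Query Language, a colon means "includes" and a comma combines clauses with AND.

```python
# Sketch only: builds Discovery Query Language (DQL) strings, no network calls.
# Values are assumed to contain no double quotes, so no escaping is done.

def dql_includes(field: str, value: str) -> str:
    """Return a DQL 'includes' clause of the form field:"value"."""
    return f'{field}:"{value}"'


def dql_and(*clauses: str) -> str:
    """Combine DQL clauses with a comma, which DQL treats as AND."""
    return ",".join(clauses)


def build_news_query(concept: str, action: str, entity_type: str, count: int = 10) -> dict:
    """Assemble query parameters similar to the pre-built News query."""
    return {
        # Field shown in the video.
        "query": dql_includes("enriched_text.concepts.text", concept),
        # ASSUMPTION: these two field paths are illustrative guesses.
        "filter": dql_and(
            dql_includes("enriched_text.semantic_roles.action.normalized", action),
            dql_includes("enriched_text.semantic_roles.object.entities.type", entity_type),
        ),
        # The video shows the first 10 matching documents.
        "count": count,
    }


params = build_news_query("artificial intelligence", "acquire", "Company")
print(params["query"])
print(params["filter"])
```

Keeping the query and the filter as separate parameters mirrors what the interface shows: the query selects the AI articles, and the filter narrows them to company acquisitions.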
From the Manage data section of Discovery, you’ll notice the two buttons we’ve previously discussed. In the lab for this module, you’ll download a subset of Coursera courses as JSON files. So here we’ll click Upload your own data. When we do so, a pop-up will ask us for a collection name. We’ll call the collection Coursera Courses, leave the default language set to English since our courses will mostly be in English, and click Create. Here we’ll be able to click the cloud icon or drag files to upload them to our collection. During the lab exercises, you’ll download and extract a zip file containing 500 course files. Each file has the following JSON structure. The most important bits are the name of the course, the slug that we’ll need to reconstruct the URL for the course, and the description, which includes the text Watson will use to figure out the course content and its relevance to the user’s queries. Once we click on that cloud upload icon, we’ll be prompted to select the files we downloaded. Once you’ve selected the folder in which you extracted the zip file, you’ll be able to press Control A, or Command A on Mac, to select all the files within before clicking Open. The upload process will take a moment, but you’ll eventually see that 500 documents have been added to your collection. You may receive some warnings about ID being a protected key, but since it doesn’t contain useful information for our courses, you can safely ignore the fact that this particular key in our JSON files wasn’t imported. You’ll also notice that there is the option to add some enrichments to our raw data. We’ll do so by clicking on the Add enrichments link on the right. Here we’ll select Description from the drop-down menu, since description was the field with the most data in our documents. Next we’ll click on Add enrichment to enrich this field. From the pop-up that will appear, we’ll add Keyword Extraction and Concept Tagging.
There are many other options available, but for the purposes of our chatbot, our queries will simply match the relevance of courses to the user’s keywords. Once the two enrichments have been added, we’ll click on the X in the top right to close the pop-up window. You’ll notice that the Description field now has the two enrichments, and we can finally click on the Apply changes to collection button. As a sanity check, we can click on the Query section of Discovery (the magnifying glass icon on the left) and then on Search for documents. Here we could try something like “python” and click Run Query at the bottom. We should see a series of relevant courses, confirming that our collection is in good shape and ready to be queried. Okay, now that you have the lay of the land, it’s your turn to take Lab 2. (Music)
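For reference, the course file structure mentioned in the walkthrough can be sketched roughly as follows. The JSON shown on screen is not part of this transcript, so the sample document, its values, and the helper names are hypothetical; only the keys (ID, name, slug, description) come from the narration. The URL prefix is also an assumption about how a course URL is reconstructed from its slug.

```python
import json

# ASSUMPTION: course pages are addressed by appending the slug to this prefix.
COURSE_URL_PREFIX = "https://www.coursera.org/learn/"

# Hypothetical sample document; the real lab files may contain other keys.
SAMPLE_COURSE_JSON = """
{
  "id": "hypothetical-id-123",
  "name": "Example Course on Python",
  "slug": "example-python-course",
  "description": "An illustrative description that Watson would enrich."
}
"""

# Fields the narration calls the most important bits of each file.
REQUIRED_FIELDS = ("name", "slug", "description")


def course_url(course: dict) -> str:
    """Reconstruct a course URL from its slug."""
    return COURSE_URL_PREFIX + course["slug"]


def missing_fields(course: dict) -> list:
    """List required fields that are absent or empty in a course document."""
    return [field for field in REQUIRED_FIELDS if not course.get(field)]


course = json.loads(SAMPLE_COURSE_JSON)
print(course_url(course))
print(missing_fields(course))
```

A check like missing_fields could be run over the extracted files before uploading them, since the description is the field the enrichments are applied to.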